Smarter watching #8

Right now we rely only on filesystem event monitoring to copy files over. This isn't sufficient, though: files get missed! For example, it's possible to start softcopy halfway into a source array's production and miss a bunch of files. To deal with this, we perform an integrity check once the observers stop: we enumerate all the file names we expect and queue the ones that are missing. This is extremely slow and does a lot of unnecessary work.

Ideally, we would have some way to concretely know "have we copied this file yet? What are we missing?", but this is difficult due to the number of files (~30M in some cases). There are a few possible fixes: we can use PackedName to compress names in memory, we can use an on-disk queue, or we could use Zarr v3 sharding in the future.

I think the most elegant option is probably to combine in-memory compression (PackedName) with exploiting the "copy head". We know timepoints are written sequentially, so the files that aren't yet copied will always be in either the timepoint we're currently stalled on or a timepoint close behind it. This means we can store a timepoint index and a list of files that *haven't* been copied yet. Every time we start a timepoint, we create a list of expected files. When we get a file, we take it off that list. When the timepoint advances, we take whatever remains in that list and add it to our tardy queue. That way, we end up with a list of files that we know we never got a filesystem event about, and when there's downtime we can check on them (or just save them until the integrity check at the end).
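
Here's a rough sketch of what the copy-head bookkeeping could look like, just to make the idea concrete. Everything in it is hypothetical: `CopyHeadTracker`, `expected_files`, and the `timepoint.chunk` naming scheme are made up for illustration and aren't existing softcopy code. It also uses plain strings where the real thing would probably store PackedName values, and it assumes the caller decides when a timepoint has advanced.

```python
from typing import Callable, Iterable, Optional, Set


def expected_files(timepoint: int, chunks_per_timepoint: int = 4) -> Iterable[str]:
    """Hypothetical naming scheme: one file per chunk, named '<timepoint>.<chunk>'."""
    return (f"{timepoint}.{chunk}" for chunk in range(chunks_per_timepoint))


class CopyHeadTracker:
    """Tracks only the files we have NOT seen yet, rather than every file."""

    def __init__(self, expected_for: Callable[[int], Iterable[str]]) -> None:
        self._expected_for = expected_for
        self.timepoint: Optional[int] = None  # the current copy head
        self.pending: Set[str] = set()        # expected in the current timepoint, not yet seen
        self.tardy: Set[str] = set()          # left over from earlier timepoints

    def start_timepoint(self, timepoint: int) -> None:
        # Anything still pending when the head advances never produced an event,
        # so it moves to the tardy queue before we build the new expected list.
        self.tardy |= self.pending
        self.timepoint = timepoint
        self.pending = set(self._expected_for(timepoint))

    def on_file_copied(self, name: str) -> None:
        # The file may belong to the current timepoint or be a late arrival
        # from an earlier one; discard() is a no-op if the name isn't present.
        self.pending.discard(name)
        self.tardy.discard(name)
```

With this shape, memory use scales with one timepoint's worth of names plus whatever is tardy, not with the full ~30M files. The tardy set is what we'd poll during downtime, or hand to the final integrity check instead of enumerating everything.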
