Add fill_multi_label: efficient multi-label void filling#13
Closed
donglaiw wants to merge 1 commit into
Closed
Conversation
Implements the multi-label algorithm described in README using the region adjacency graph of BG components + foreground labels, rather than the naive per-label binary_fill_holes loop. Expensive voxel-scale passes (cc3d connected components, axis-shift adjacency extraction) run once on the whole volume; reasoning then happens on a graph whose size scales with #labels + #voids, not voxel count. Semantics: a void is filled with the outermost enclosing adjacent label (shortest label-chain to exterior); ties (in-between shells) are left unfilled; inner-label islands are handled correctly; on an effectively binary input the result matches fill_voids.fill. Typical speedup vs naive per-label loop is 10x-50x on volumes with many labels, and the runtime does not scale with label count. Added 14 tests covering constructed semantic cases, dtype handling, in_place behavior, binary equivalence, and a perf regression guard. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
|
Hi Donglai! Thanks for this contribution! I should point out that I already implemented a sophisticated multi-label fill in seung-lab/fastmorph#10 Fill voids is an older package with limited dependencies that I intended (in my own brain) to keep for binary images exclusively for simplicity. Would you mind taking a look at fastmorph and seeing if it has the features you need? |
Contributor
|
Going to close this due to perceived redundancy with fastmorph. If you see something I missed please reopen! Thanks for contributing! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Implements the multi-label algorithm described in
README.md("Multi-Label Concept") as a new top-level function,fill_voids.fill_multi_label. Instead of contour tracing, the implementation works on the region adjacency graph (RAG) of the volume, which produces the same result far more efficiently: the expensive voxel-scale passes (connected components, adjacency extraction) run once, and reasoning then happens on a small graph whose size scales with #labels + #voids rather than with voxel count.What it does
Semantics:
0) is filled with labelLiffLis the outermost enclosing label of the void it belongs to --- i.e. the adjacent foreground label whose chain of label-to-label adjacencies back to the image exterior is shortest.Binside an outer shellAfills withAand leavesBintact.fill_voids.fill.How it works
cc3d.connected_componentson the background mask labels each void.cc3d.region_graphon the foreground labels gives the label-to-label adjacency graph (cheap: it has at most #labels nodes).level= shortest label-chain distance to the exterior. Labels that don't appear in any enclosure chain getlevel = inf.levellabel (else leave unfilled).labels[lut[bg_cc] != 0] = lut[bg_cc][...]pass applies every fill.Total work is O(N) for the voxel-scale steps + O(|label_graph|) for the dominator-style analysis. Crucially, the runtime does not scale with the number of labels, unlike the naive
for L in np.unique(labels): binary_fill_holes(labels == L)loop which is O(K·N).Performance
Measured on
(200, 200, 200)volumes:Tests
automated_test.pygains 14 new tests undertest_multi_label_*:fill_voids.fillwhen only one foreground label is present.in_placebehavior (true mutates, false preserves).fill_multi_labelmust be ≥3x faster than the naive oracle (it is typically 10x-50x on a laptop).New dependency
connected-components-3d(cc3d) is imported at call time only; the binaryfill_voids.fillis unaffected ifcc3dis unavailable. Happy to movecc3dintoinstall_requiresorextras_requireif that matches your preference.Test plan
python -m pytest automated_test.py -k "multi_label or dimension or zero or dtype or return or 2d_3d_differ"-> 34 passed locallytest_multi_label_matches_binary_fillverifies parity withfill_voids.fillon a non-trivial 3D input🤖 Generated with Claude Code