Add fill_multi_label: efficient multi-label void filling by donglaiw · Pull Request #13 · seung-lab/fill_voids

donglaiw · 2026-04-17T03:45:17Z

Summary

Implements the multi-label algorithm described in README.md ("Multi-Label Concept") as a new top-level function, fill_voids.fill_multi_label. Instead of contour tracing, the implementation works on the region adjacency graph (RAG) of the volume, which produces the same result far more efficiently: the expensive voxel-scale passes (connected components, adjacency extraction) run once, and reasoning then happens on a small graph whose size scales with #labels + #voids rather than with voxel count.

What it does

import fill_voids

filled = fill_voids.fill_multi_label(labels)                      # 2D or 3D
filled, n = fill_voids.fill_multi_label(labels, return_fill_count=True)
fill_voids.fill_multi_label(labels, in_place=True)                # save memory

Semantics:

A background voxel (value 0) is filled with label L iff L is the outermost enclosing label of the void it belongs to, i.e. the adjacent foreground label whose chain of label-to-label adjacencies back to the image exterior is shortest.
If two or more adjacent labels tie for outermost (the void sits between distinct shells), the void is left unfilled. This matches step 4 of the README's multi-label concept.
Inner-label islands are handled correctly (README step 6): a void with a small island of label B inside an outer shell A fills with A and leaves B intact.
On an effectively binary input, the result equals fill_voids.fill.

How it works

cc3d.connected_components on the background mask labels each void.
A single vectorized axis-shift scan collects every (void-component, foreground-label) adjacency pair.
Faces of the image are scanned to identify exterior-touching voids and labels.
cc3d.region_graph on the foreground labels gives the label-to-label adjacency graph (cheap: it has at most #labels nodes).
A BFS from the "exterior" nodes through that label graph assigns each foreground label a level = shortest label-chain distance to the exterior. Labels that don't appear in any enclosure chain get level = inf.
For each interior void component: one adjacent label → fill with it; multiple adjacent → fill with the unique minimum-level label (else leave unfilled).
A single labels[lut[bg_cc] != 0] = lut[bg_cc][...] pass applies every fill.
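The graph-level part of the steps above (the BFS that assigns levels, then the per-void decision) can be sketched in plain Python. This is a minimal illustration, not the PR's code: the names label_levels and choose_fill are hypothetical, the label graph is assumed to be a dict of sets, and labels touching the exterior are assumed to sit at level 0.

```python
from collections import deque
import math


def label_levels(label_graph, exterior_labels):
    """Shortest label-chain distance from each label to the exterior.

    label_graph: dict mapping label -> set of adjacent labels (assumed format).
    exterior_labels: labels touching the image faces; assumed to be level 0.
    Labels that cannot be reached from the exterior keep level = inf.
    """
    level = {label: math.inf for label in label_graph}
    queue = deque()
    for label in exterior_labels:
        level[label] = 0
        queue.append(label)
    while queue:
        current = queue.popleft()
        for neighbor in label_graph.get(current, ()):
            if level.get(neighbor, math.inf) == math.inf:
                level[neighbor] = level[current] + 1
                queue.append(neighbor)
    return level


def choose_fill(adjacent_labels, level):
    """Return the label an interior void is filled with, or 0 to leave it unfilled."""
    adjacent = list(adjacent_labels)
    if not adjacent:
        return 0
    if len(adjacent) == 1:
        return adjacent[0]
    best = min(level.get(label, math.inf) for label in adjacent)
    winners = [label for label in adjacent if level.get(label, math.inf) == best]
    # A unique minimum-level label wins; a tie leaves the void unfilled.
    return winners[0] if len(winners) == 1 else 0


# Label 1 touches the exterior, label 2 is a shell inside it, label 3 is an
# island floating in a void (no label-to-label adjacency).
levels = label_levels({1: {2}, 2: {1}, 3: set()}, exterior_labels=[1])
island_case = choose_fill({1, 3}, levels)   # void bordered by shell 1 and island 3
nested_case = choose_fill({2}, levels)      # void bordered only by inner shell 2

# Labels 1 and 2 both touch the exterior, so a void between them is a tie.
tie_levels = label_levels({1: {2}, 2: {1}}, exterior_labels=[1, 2])
tie_case = choose_fill({1, 2}, tie_levels)
```

Because the island has no label-to-label path to the exterior, its level stays at infinity and it never outranks the enclosing shell, which is how the inner-island case resolves without any voxel-scale work.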

Total work is O(N) for the voxel-scale steps + O(|label_graph|) for the dominator-style analysis. Crucially, the voxel-scale cost is paid once rather than once per label, unlike the naive for L in np.unique(labels): binary_fill_holes(labels == L) loop, which is O(K·N) for K labels and N voxels.
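To make the O(N) voxel-scale steps concrete, here is a hedged NumPy sketch of an axis-shift adjacency scan and a lookup-table write. It is not taken from the PR: void_label_pairs and apply_fill are hypothetical names, bg_cc is assumed to hold 0 on foreground and 1..n on background components, and the lookup table is assumed to have been decided already by the graph analysis.

```python
import numpy as np


def void_label_pairs(labels, bg_cc):
    """Collect unique (void component, foreground label) face-adjacency pairs.

    labels: integer array, 0 = background.
    bg_cc: same shape; 0 on foreground, 1..n numbering background components.
    """
    pairs = []
    for axis in range(labels.ndim):
        # Two views of the volume offset by one voxel along `axis`.
        lo = [slice(None)] * labels.ndim
        hi = [slice(None)] * labels.ndim
        lo[axis] = slice(None, -1)
        hi[axis] = slice(1, None)
        lo, hi = tuple(lo), tuple(hi)
        # Check both directions: void on the low side, then void on the high side.
        for a, b in ((lo, hi), (hi, lo)):
            touching = (bg_cc[a] != 0) & (labels[b] != 0)
            pairs.append(
                np.stack([bg_cc[a][touching], labels[b][touching]], axis=1)
            )
    return np.unique(np.concatenate(pairs), axis=0)


def apply_fill(labels, bg_cc, lut):
    """Write lut[void id] into background voxels; a lut entry of 0 means 'leave unfilled'."""
    fill = lut[bg_cc]      # per-voxel fill label; foreground maps to lut[0] == 0
    mask = fill != 0
    out = labels.copy()
    out[mask] = fill[mask]
    return out


# Shell of label 1 enclosing a ring-shaped void with an island of label 2.
labels = np.array([
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 2, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
])
bg_cc = (labels == 0).astype(labels.dtype)   # hand-assigned: one void, id 1
pairs = void_label_pairs(labels, bg_cc)      # rows are (void id, adjacent label)
lut = np.array([0, 1], dtype=labels.dtype)   # assumed decision: void 1 -> label 1
filled = apply_fill(labels, bg_cc, lut)
```

Each axis is scanned with two shifted views, so the cost is a fixed number of passes over the volume regardless of how many labels it contains.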

Performance

Measured on (200, 200, 200) volumes:

Image	fill_multi_label	naive per-label
99 dense labels, 20 isolated voids	1.0 s	47.7 s
50 labels, 2% sparse speckle BG	3.0 s	23.9 s

Tests

automated_test.py gains 14 new tests under test_multi_label_*:

Constructed semantic cases: simple donut (2D), inner island (2D/3D), nested shells (2D/3D), gap between labels, two separate voids within one label.
Binary-equivalence: result matches fill_voids.fill when only one foreground label is present.
dtype preservation across int8..uint64.
in_place behavior (true mutates, false preserves).
Empty / all-zero / no-BG edge cases.
Naive-oracle agreement on well-posed 2D and 3D images whose voids are all uniquely enclosed.
Perf regression guard: on a 120³ 64-cube volume, fill_multi_label must be ≥3x faster than the naive oracle (it is typically 10x-50x on a laptop).

New dependency

connected-components-3d (cc3d) is imported at call time only; the binary fill_voids.fill is unaffected if cc3d is unavailable. Happy to move cc3d into install_requires or extras_require if that matches your preference.
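A call-time import of an optional dependency can be sketched as follows. This is an illustration of the general pattern under stated assumptions, not the PR's code: require_optional and fill_multi_label_stub are hypothetical names, and the error message wording is invented.

```python
import importlib


def require_optional(module_name, feature):
    """Import an optional dependency when it is first needed.

    Raises ImportError with a readable message if the module is missing,
    so code paths that never call this function are unaffected.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise ImportError(
            f"{feature} requires the optional package '{module_name}'."
        ) from err


def fill_multi_label_stub(labels):
    # Hypothetical stand-in: only the deferred import is shown here.
    cc3d = require_optional("cc3d", "fill_multi_label")
    return cc3d
```

With this shape, importing the package never touches cc3d; only a call to the multi-label function does.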

Test plan

python -m pytest automated_test.py -k "multi_label or dimension or zero or dtype or return or 2d_3d_differ" -> 34 passed locally
Binary regression: test_multi_label_matches_binary_fill verifies parity with fill_voids.fill on a non-trivial 3D input
Manual benchmark vs naive per-label loop (numbers above)

🤖 Generated with Claude Code

Implements the multi-label algorithm described in README using the region adjacency graph of BG components + foreground labels, rather than the naive per-label binary_fill_holes loop. Expensive voxel-scale passes (cc3d connected components, axis-shift adjacency extraction) run once on the whole volume; reasoning then happens on a graph whose size scales with #labels + #voids, not voxel count.

Semantics: a void is filled with the outermost enclosing adjacent label (shortest label-chain to exterior); ties (in-between shells) are left unfilled; inner-label islands are handled correctly; on an effectively binary input the result matches fill_voids.fill.

Typical speedup vs naive per-label loop is 10x-50x on volumes with many labels, and the runtime does not scale with label count. Added 14 tests covering constructed semantic cases, dtype handling, in_place behavior, binary equivalence, and a perf regression guard.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

william-silversmith · 2026-04-17T16:11:16Z

Hi Donglai!

Thanks for this contribution! I should point out that I already implemented a sophisticated multi-label fill in seung-lab/fastmorph#10.

Fill voids is an older package with limited dependencies that I intended (in my own brain) to keep for binary images exclusively for simplicity. Would you mind taking a look at fastmorph and seeing if it has the features you need?

william-silversmith · 2026-04-30T03:18:37Z

Going to close this due to perceived redundancy with fastmorph. If you see something I missed please reopen! Thanks for contributing!

william-silversmith closed this Apr 30, 2026

donglaiw wants to merge 1 commit into seung-lab:master from PytorchConnectomics:master
