Data loaders: prefer nuclei_seg, fall back to nuclear_seg#15
Open
mark-a-potts wants to merge 1 commit into
Updates the two cell-extraction data paths to try the new native-20x `nuclei_seg` label first (produced by `submit_nuclei_segmentation_jobs` in ops_process), with fall-through to the legacy 5x-upscaled `nuclear_seg` from `segment_and_stitch_pheno`.

* src/ops_model/data/data_loader.py:466 (CellProfileDataset)
* src/ops_model/features/cp_extraction.py:1019 (bulk CP feature read)

Both labels are 20x-shaped at level 0 in phenotyping_v3.zarr, so bbox slicing is unchanged.

Measured impact on per-cell features over 500 sampled cells from ops0094 A/1/0:

* Mean nuc area: -1.80%
* Mean nuc/cell: -1.78%
* Per-cell nuc IoU: mean 0.845, median 0.856
* Outlier cells (IoU < 0.5): 0.8%

Pairs with the ops_process PR (royerlab/ops_process#113) that introduces the new segmentation step.

Migration is per-experiment: experiments that have only run the legacy step continue to work; experiments that have run the new step transparently pick up the better masks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
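For reviewers, the prefer-then-fall-back lookup described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `pick_nuclear_label` and `LABEL_PREFERENCE` are hypothetical names, and a plain mapping stands in for the labels group of phenotyping_v3.zarr.

```python
from typing import Any, Mapping, Sequence, Tuple

# Label names taken from the PR description, in order of preference.
# The constant itself is hypothetical and does not exist in the repo.
LABEL_PREFERENCE = ("nuclei_seg", "nuclear_seg")


def pick_nuclear_label(
    labels: Mapping[str, Any],
    preference: Sequence[str] = LABEL_PREFERENCE,
) -> Tuple[str, Any]:
    """Return (name, array) for the first preferred label present in `labels`.

    `labels` is any mapping from label name to array; here it is an
    assumed stand-in for the per-position labels group.
    """
    for name in preference:
        if name in labels:
            return name, labels[name]
    raise KeyError(
        f"none of {tuple(preference)} found; available: {sorted(labels)}"
    )


# An experiment that only ran the legacy step keeps working...
legacy_only = {"nuclear_seg": "legacy-mask"}
# ...and one that ran the new step picks up the native-20x mask.
migrated = {"nuclear_seg": "legacy-mask", "nuclei_seg": "native-mask"}

print(pick_nuclear_label(legacy_only)[0])
print(pick_nuclear_label(migrated)[0])
```

Keeping the preference order in a single tuple would let both call sites share the same fall-through behavior, which matches the per-experiment migration described in the commit message.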
Summary
Updates the two cell-extraction data paths to try the new native-20x `nuclei_seg` label first (produced by `submit_nuclei_segmentation_jobs` in ops_process PR #113), with fall-through to the legacy 5x-upscaled `nuclear_seg` from `segment_and_stitch_pheno`.

Files touched (each is a 9-line addition, identical pattern):

* `src/ops_model/data/data_loader.py:466`: `CellProfileDataset.__getitem__`
* `src/ops_model/features/cp_extraction.py:1019`: bulk CP feature read

Both labels are 20x-shaped at level 0 in phenotyping_v3.zarr (the legacy step segmented at 5x and then 4× nearest-neighbor upscaled to 20x for storage), so bbox slicing is unchanged. The existing `min_h = min(...)` clip block at `data_loader.py:498` handles the residual ~3 px shape difference between native-20x and 5x-upscaled-to-20x masks.

Test plan / measured impact

Per-cell A/B comparison over 500 sampled cells from `A1_linked_pheno_iss_cp.csv` (ops0094_20251217):

* Mean nuc area: -1.80%
* Mean nuc/cell: -1.78%

Per-cell nuclear-mask IoU (NEW vs LEGACY, within the cell mask):

* Mean 0.845, median 0.856
* Outlier cells (IoU < 0.5): 0.8%
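As a rough illustration of how a per-cell IoU "within the cell mask" could be computed once the two nuclear masks are clipped to a common shape, here is a minimal NumPy sketch. It is not the script used for the numbers in this PR: `clip_to_common_shape` and `nuclear_iou` are hypothetical helpers, and top-left-aligned cropping is an assumption about how the `min_h = min(...)` clip behaves.

```python
import numpy as np


def clip_to_common_shape(*arrays):
    """Crop 2D arrays to the smallest shared height and width.

    Assumes top-left alignment; the real clip block may differ.
    """
    min_h = min(a.shape[0] for a in arrays)
    min_w = min(a.shape[1] for a in arrays)
    return [a[:min_h, :min_w] for a in arrays]


def nuclear_iou(new_nuc, legacy_nuc, cell_mask):
    """IoU of two nuclear masks, counting only pixels inside the cell mask."""
    new_nuc, legacy_nuc, cell_mask = clip_to_common_shape(
        new_nuc, legacy_nuc, cell_mask
    )
    inside = cell_mask > 0
    a = (new_nuc > 0) & inside
    b = (legacy_nuc > 0) & inside
    union = np.count_nonzero(a | b)
    if union == 0:
        return float("nan")  # neither mask has nuclear pixels in this cell
    return np.count_nonzero(a & b) / union


# Toy crops whose heights differ by one row, mimicking the small shape
# mismatch between native-20x and upscaled masks.
cell = np.ones((6, 6), dtype=np.uint8)
new = np.zeros((6, 6), dtype=np.uint8)
new[1:4, 1:4] = 1
legacy = np.zeros((5, 6), dtype=np.uint8)
legacy[2:5, 1:4] = 1

print(nuclear_iou(new, legacy, cell))  # 6 overlapping px / 12 union px = 0.5
```

Restricting both masks to the cell mask before computing the ratio keeps neighboring nuclei inside the same bounding box from inflating or deflating the score.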
Migration semantics
Cross-PR
Pairs with: royerlab/ops_process#113
🤖 Generated with Claude Code