Fix ±1px shift in _render_instance for objects_to_segmentations#7006
Fix ±1px shift in _render_instance for objects_to_segmentations#7006
Conversation
WalkthroughThe change replaces a library-based mask rendering call with a manual, coordinate-based placement routine in the labels module. The new code computes the detection bounding-box origin (rounded), converts to absolute pixel coordinates, clips coordinates to image bounds, slices both the detection mask and the target image patch to the clipped region, converts the detection mask to a boolean object mask, and assigns mask values into the clipped patch. The same clipping-and-slicing approach is applied for both 3-channel (color) and single-channel (grayscale) patches. Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
⚔️ Resolve merge conflicts (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@fiftyone/core/labels.py`:
- Around line 1786-1787: Remove the redundant outer int() calls on the x0 and y0
assignments: use the result of round(...) directly (e.g., assign x0 =
round(bbox.top_left.x * img_w) and y0 = round(bbox.top_left.y * img_h)); update
the two lines referencing bbox.top_left.x, bbox.top_left.y, img_w and img_h to
drop int(...) so the values remain integers via round() in Python 3.
- Around line 1792-1797: The mask slicing drops the wrong side when bbox x0/y0
are negative because x0/y0 are clamped before slicing; capture the original
unclamped offsets (e.g., orig_x0 = x0, orig_y0 = y0) or compute x_off = max(0,
-x0) and y_off = max(0, -y0) before clamping, then clamp x0/y0 and x1/y1 as done
now, and slice obj_mask using those offsets and the cropped size: obj_mask =
obj_mask[y_off : y_off + (y1 - y0), x_off : x_off + (x1 - x0)]; this preserves
the correct interior of the mask when the box extends past the top/left edges.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@fiftyone/core/labels.py`:
- Around line 1801-1811: The slice assigned to variable patch in the mask
editing block (inside the mask.ndim == 3 branch and the else branch) is a view
of mask, so the final write-backs mask[y0:y1, x0:x1, :] = patch and mask[y0:y1,
x0:x1] = patch are redundant; remove those final assignments and rely on the
in-place mutations patch[obj_mask, ...] = target to update mask. Update the code
in the function/block that contains mask, patch, obj_mask, y0, y1, x0, x1 to
delete only the redundant write-back lines while keeping the existing
slice-to-patch and patch[obj_mask,...] assignments intact.
There was a problem hiding this comment.
@v-nayjack thanks for proposing a fix. I think there should be a way to apply the round() vs floor() logic in etai.render_instance_mask() rather than bypassing it in FiftyOne in a way that avoids the +1px shift that you're seeing while also maintaining the intended behavior in the other possible use cases (like I've described below).
| bbox = dobj.bounding_box | ||
| img_h, img_w = mask.shape[:2] | ||
|
|
||
| # Use round() instead of int() for the start position to avoid a ±1px |
There was a problem hiding this comment.
Can you apply the necessary fix inside etai.render_instance_mask() rather than bypassing it here? The code is here.
| # Use round() instead of int() for the start position to avoid a ±1px | ||
| # shift caused by independently flooring both the start and end | ||
| # coordinates. The mask is placed at its actual dimensions with no | ||
| # resize, and clipped to image bounds. |
There was a problem hiding this comment.
I don't think it is possible to avoid the mask resize that etai.render_instance_mask() does in general.
_render_instance() needs to be able to inscribe an instance segmentation defined by detection into a mask that represents the whole image regardless of the resolutions of the two masks. For example, we could have:
- User wants to render a full image mask with relatively small resolution, say
mask.shape = (16, 16) - The instance segmentation could be much higher resolution, say:
detection.bounding_box=[0, 0, 0.5, 0.5](upper left quadrant) butdetection.mask.shape = (512, 512)
That's going to require a resize, because the 512x512 mask only represents the 8x8 quadrant of mask.
For reference, in FiftyOne's data model, we have:
Masks can be of any size; they are stretched as necessary to fill the object’s bounding box when visualizing in the App.
|
Thanks @brimoor, that makes sense. The resize is necessary when the mask resolution differs from the target region. I'll apply the fix in |
|
@brimoor updated This applies |
What changes are proposed in this pull request?
Fix
±1pxpixel shift inobjects_to_segmentations()caused by_render_instanceindependently flooring both start and end bounding box coordinates viaint().When converting normalized bbox coordinates to pixel positions, the previous implementation delegated to
etai.render_instance_mask, which callsBoundingBox.coords_in(). This independently floors both corners:Due to floating-point arithmetic,
floor(a + b) != floor(a) + floor(b). When the computed target size differs from the actual mask dimensions, the mask is resized using interpolation, shifting pixels by 1-2px. The shift is intermittent — it only affects detections with fractional pixel coordinates.Fix: Replace
_render_instanceto bypassetai.render_instance_maskentirely:round()instead ofint()How is this patch tested? If it is not, please explain why.
Tested with 10 synthetic cases covering:
floor(a+b) != floor(a) + floor(b)on both axes9 out of 10 synthetic cases had
±1pxmismatch with the oldint()approach. All10pass with the fix.Release Notes
Is this a user-facing change that should be mentioned in the release notes?
notes for FiftyOne users.
Fixed
±1pxpixel shift inobjects_to_segmentations().What areas of FiftyOne does this PR affect?
fiftyonePython library changesSummary by CodeRabbit