Clamp COCO polygon vertices to bbox bounds (fix #2847) #6904
Clamp COCO polygon vertices to bbox bounds (fix #2847) #6904soham02 wants to merge 4 commits intovoxel51:fix-coco-polygon-with-clampfrom
Conversation
WalkthroughThe changes add optional bbox_bounds propagation through the COCO segmentation pipeline so generated segmentations are clamped to bounding-box bounds when requested. Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@tests/unittests/utils/test_coco_polygon_clamp.py`:
- Around line 20-38: The test currently can pass vacuously if _mask_to_polygons
returns an empty list; add a non-emptiness check before calling
_assert_polygons_within_bounds by asserting that the segmentation returned from
_mask_to_polygons in
test_mask_to_polygons_clamps_vertices_to_bbox_bounds_regression_2847 is not
empty (e.g., assert segmentation or assert len(segmentation) > 0) so the bounds
check actually validates produced vertices.
- Around line 41-58: Add a non-vacuous check to the test
test_mask_to_polygons_respects_rounded_bbox_bounds_num_decimals_edgecase: after
calling _mask_to_polygons(mask, tolerance=None, bbox_bounds=rounded_bbox_bounds)
assert that len(segmentation) > 0 before calling _assert_polygons_within_bounds;
this ensures the test fails if segmentation is empty and prevents a vacuous pass
while keeping the rest of the logic (use of _mask_to_polygons and
_assert_polygons_within_bounds) unchanged.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@tests/unittests/utils/test_coco_polygon_clamp.py`:
- Around line 43-50: Update the docstring in
test_mask_to_polygons_respects_rounded_bbox_bounds_num_decimals_edgecase to
replace the vague phrase "CodeRabbit's concern" with a concise description of
the actual concern: explain that when a bounding box is rounded (via
num_decimals), subsequent clamping must use the rounded bbox bounds so polygon
vertices are guaranteed to lie inside the rounded box; keep the rest of the
docstring intact and, if available, reference the relevant issue or reproduce
the rounding scenario succinctly.
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@fiftyone/utils/coco.py`:
- Around line 2405-2415: The final negative-clamp pass is redundant when
bbox_bounds was applied via np.clip; modify the code so the list-comprehension
segmentation = [max(0, i) for i in segmentation] only runs when bbox_bounds is
None (i.e., when we did not clip) — locate the block using the names
bbox_bounds, contour, np.clip and segmentation and wrap the max(0, ...) step in
an if bbox_bounds is None: guard (or equivalent) so you avoid the unnecessary
work when bbox_bounds was provided.
- Around line 1307-1315: In the isinstance(label, fol.Detection) branch the
local variable bbox is always set, so remove the redundant "bbox is not None"
condition and only guard on num_decimals; update the block that sets bbox =
[round(...)] and bbox_bounds = (...) to run when num_decimals is not None (e.g.,
change the if to "if num_decimals is not None") and keep using bbox, bbox_bounds
as before to simplify the logic.
In `@tests/unittests/utils/test_coco_polygon_clamp.py`:
- Around line 75-96: Replace the duck-typed DummyPolyline in the test with an
actual fol.Polyline instance to avoid fragile attribute coverage: construct a
fol.Polyline with the same points used in the test and, if necessary, call or
set any attribute values so get_attribute_value behavior matches the original
(i.e., returns None/default); then pass that fol.Polyline into
_polyline_to_coco_segmentation and keep the same assertions (len(seg), bounds
checks, and float presence) around the returned segmentation.
| bbox_bounds = None | ||
| if bbox is not None and num_decimals is not None: | ||
| bbox = [round(p, num_decimals) for p in bbox] | ||
| bbox_bounds = ( | ||
| float(bbox[0]), | ||
| float(bbox[1]), | ||
| float(bbox[0] + bbox[2]), | ||
| float(bbox[1] + bbox[3]), | ||
| ) |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
bbox is not None is always true here — simplify the guard.
Inside the isinstance(label, fol.Detection) branch, bbox is unconditionally assigned on line 1305.
Proposed simplification
bbox_bounds = None
- if bbox is not None and num_decimals is not None:
+ if num_decimals is not None:
bbox = [round(p, num_decimals) for p in bbox]
bbox_bounds = (🤖 Prompt for AI Agents
In `@fiftyone/utils/coco.py` around lines 1307 - 1315, In the isinstance(label,
fol.Detection) branch the local variable bbox is always set, so remove the
redundant "bbox is not None" condition and only guard on num_decimals; update
the block that sets bbox = [round(...)] and bbox_bounds = (...) to run when
num_decimals is not None (e.g., change the if to "if num_decimals is not None")
and keep using bbox, bbox_bounds as before to simplify the logic.
There was a problem hiding this comment.
I agree with Rabbit. We assign a value to bbox, so it can’t be None. We can remove this from the conditional.
There was a problem hiding this comment.
@AdonaiVera, sounds good! The simplified condition will make the logic clearer.
| # find_contours at 0.5 produces sub-pixel coordinates, and bbox coords may be float. | ||
| # Clamp to bbox bounds to ensure COCO validity (segmentation must be within bbox). | ||
| if bbox_bounds is not None: | ||
| xmin, ymin, xmax, ymax = bbox_bounds | ||
| contour[:, 0] = np.clip(contour[:, 0], xmin, xmax) | ||
| contour[:, 1] = np.clip(contour[:, 1], ymin, ymax) | ||
|
|
||
| segmentation = contour.ravel().tolist() | ||
|
|
||
| # After padding and subtracting 1 there may be -0.5 points | ||
| segmentation = [0 if i < 0 else i for i in segmentation] | ||
| # After padding and subtracting 1 there may be negative points | ||
| segmentation = [max(0, i) for i in segmentation] |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
The max(0, i) pass (line 2415) is now redundant when bbox_bounds is provided.
When bbox_bounds is set, np.clip already guarantees all values ≥ min(xmin, ymin) ≥ 0 for any valid COCO bbox. The max(0, i) list comprehension still runs on the entire flattened polygon, doing pointless work.
Not a correctness issue — just wasted cycles on every polygon. Consider guarding it:
Proposed fix
segmentation = contour.ravel().tolist()
- # After padding and subtracting 1 there may be negative points
- segmentation = [max(0, i) for i in segmentation]
+ # After padding and subtracting 1 there may be negative points.
+ # When bbox_bounds is provided, np.clip already ensures non-negative
+ # values for valid (non-negative) bounding boxes.
+ if bbox_bounds is None:
+ segmentation = [max(0, i) for i in segmentation]📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # find_contours at 0.5 produces sub-pixel coordinates, and bbox coords may be float. | |
| # Clamp to bbox bounds to ensure COCO validity (segmentation must be within bbox). | |
| if bbox_bounds is not None: | |
| xmin, ymin, xmax, ymax = bbox_bounds | |
| contour[:, 0] = np.clip(contour[:, 0], xmin, xmax) | |
| contour[:, 1] = np.clip(contour[:, 1], ymin, ymax) | |
| segmentation = contour.ravel().tolist() | |
| # After padding and subtracting 1 there may be -0.5 points | |
| segmentation = [0 if i < 0 else i for i in segmentation] | |
| # After padding and subtracting 1 there may be negative points | |
| segmentation = [max(0, i) for i in segmentation] | |
| # find_contours at 0.5 produces sub-pixel coordinates, and bbox coords may be float. | |
| # Clamp to bbox bounds to ensure COCO validity (segmentation must be within bbox). | |
| if bbox_bounds is not None: | |
| xmin, ymin, xmax, ymax = bbox_bounds | |
| contour[:, 0] = np.clip(contour[:, 0], xmin, xmax) | |
| contour[:, 1] = np.clip(contour[:, 1], ymin, ymax) | |
| segmentation = contour.ravel().tolist() | |
| # After padding and subtracting 1 there may be negative points. | |
| # When bbox_bounds is provided, np.clip already ensures non-negative | |
| # values for valid (non-negative) bounding boxes. | |
| if bbox_bounds is None: | |
| segmentation = [max(0, i) for i in segmentation] |
🤖 Prompt for AI Agents
In `@fiftyone/utils/coco.py` around lines 2405 - 2415, The final negative-clamp
pass is redundant when bbox_bounds was applied via np.clip; modify the code so
the list-comprehension segmentation = [max(0, i) for i in segmentation] only
runs when bbox_bounds is None (i.e., when we did not clip) — locate the block
using the names bbox_bounds, contour, np.clip and segmentation and wrap the
max(0, ...) step in an if bbox_bounds is None: guard (or equivalent) so you
avoid the unnecessary work when bbox_bounds was provided.
There was a problem hiding this comment.
Agree with Rabbit, can you apply this one, please
There was a problem hiding this comment.
Task: Apply optimization to avoid redundant max(0, i) pass when bbox_bounds is provided
- Modify the
_mask_to_polygonsfunction infiftyone/utils/coco.py - Guard the
max(0, i)list comprehension so it only runs whenbbox_bounds is None - Update the comment to explain the conditional logic
@AdonaiVera, I've planned the optimization! This change will skip the redundant max(0, i) pass when bbox_bounds is provided, since np.clip already guarantees non-negative values for valid bounding boxes. The optimization will improve performance without affecting correctness.
💡 For code changes across multiple files, use @coderabbitai plan in an issue comment
| class DummyPolyline: | ||
| def __init__(self, points): | ||
| self.points = points | ||
|
|
||
| def get_attribute_value(self, key, default=None): | ||
| return None | ||
|
|
||
| width, height = 1000, 1000 | ||
| polyline = DummyPolyline( | ||
| points=[[(0.75365, 0.5), (1.0001, 0.6), (-0.1, 0.2)]] | ||
| ) | ||
|
|
||
| seg = _polyline_to_coco_segmentation(polyline, (width, height)) | ||
|
|
||
| assert len(seg) == 1 | ||
| poly = seg[0] | ||
|
|
||
| xs = poly[0::2] | ||
| ys = poly[1::2] | ||
| assert all(0.0 <= x <= float(width) for x in xs) | ||
| assert all(0.0 <= y <= float(height) for y in ys) | ||
| assert any(isinstance(v, float) for v in poly) |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider using fol.Polyline instead of the duck-typed DummyPolyline.
fol.Polyline already exposes .points and .get_attribute_value; using it removes the maintenance risk of the test breaking if the internal API adds a new attribute access.
Proposed fix
- class DummyPolyline:
- def __init__(self, points):
- self.points = points
-
- def get_attribute_value(self, key, default=None):
- return None
-
- width, height = 1000, 1000
- polyline = DummyPolyline(
- points=[[(0.75365, 0.5), (1.0001, 0.6), (-0.1, 0.2)]]
- )
+ width, height = 1000, 1000
+ polyline = fol.Polyline(
+ points=[[(0.75365, 0.5), (1.0001, 0.6), (-0.1, 0.2)]],
+ closed=False,
+ filled=True,
+ )🧰 Tools
🪛 Ruff (0.14.14)
[warning] 76-76: Missing return type annotation for special method __init__
Add return type annotation: None
(ANN204)
[warning] 79-79: Missing return type annotation for private function get_attribute_value
Add return type annotation: None
(ANN202)
[warning] 79-79: Unused method argument: key
(ARG002)
[warning] 79-79: Unused method argument: default
(ARG002)
🤖 Prompt for AI Agents
In `@tests/unittests/utils/test_coco_polygon_clamp.py` around lines 75 - 96,
Replace the duck-typed DummyPolyline in the test with an actual fol.Polyline
instance to avoid fragile attribute coverage: construct a fol.Polyline with the
same points used in the test and, if necessary, call or set any attribute values
so get_attribute_value behavior matches the original (i.e., returns
None/default); then pass that fol.Polyline into _polyline_to_coco_segmentation
and keep the same assertions (len(seg), bounds checks, and float presence)
around the returned segmentation.
AdonaiVera
left a comment
There was a problem hiding this comment.
Oh this is great @soham02, thank you so much for this PR!
I validated it locally and the functionality looks good. At first it was hard to see the difference with the normal dataset, so I created a graph to compare, and the issue is solved.

I left a few small comments in the PR, mostly around FO code style and minor improvements. Let me know once you’ve addressed them and I’ll be happy to take another look. Thanks again! 🙌
| bbox_bounds = None | ||
| if bbox is not None and num_decimals is not None: | ||
| bbox = [round(p, num_decimals) for p in bbox] | ||
| bbox_bounds = ( | ||
| float(bbox[0]), | ||
| float(bbox[1]), | ||
| float(bbox[0] + bbox[2]), | ||
| float(bbox[1] + bbox[3]), | ||
| ) |
There was a problem hiding this comment.
I agree with Rabbit. We assign a value to bbox, so it can’t be None. We can remove this from the conditional.
| for x, y in points: | ||
| polygon.append(int(x * width)) | ||
| polygon.append(int(y * height)) | ||
| px = x * width |
There was a problem hiding this comment.
I know I suggested this internally @soham02 , but I think it’s better to leave this out of this PR.
The _polyline_to_coco_segmentation change switches the output from int to float, which isn’t related to issue #2847 (polylines don’t go through that code path). I’d suggest reverting the function and removing the related test. If we want clamping for polylines, we can tackle that in a separate PR while keeping the int() conversion.
| assert all(ymin <= y <= ymax for y in ys) | ||
|
|
||
|
|
||
| def test_polyline_to_coco_segmentation_preserves_float_and_clamps_to_image(): |
There was a problem hiding this comment.
We can remove this unit test based on the suggestion above.
| # Compute bbox bounds in absolute pixels so we can clamp polygon vertices. | ||
| # Why: COCO bboxes can be floats (e.g. 753.65) but contours come from a pixel grid, | ||
| # which can create ~1px drift outside the bbox. Clamping preserves mask shape while | ||
| # ensuring COCO validity (segmentation must be within bbox). |
There was a problem hiding this comment.
This is a bit verbose here, we could simplify it to something like:
# Clamp polygon vertices to bbox bounds to ensure COCO validity
| # ensuring COCO validity (segmentation must be within bbox). | ||
| if bbox_bounds is None: | ||
| x, y, w, h = detection.bounding_box # relative [0..1] | ||
| bbox_xmin = x * width |
There was a problem hiding this comment.
The intermediate variables (bbox_xmin, etc.) could be inlined to match how from_label computes the same thing, just following more FO code styles.
if bbox_bounds is None:
x, y, w, h = detection.bounding_box
bbox_bounds = (
float(x * width),
float(y * height),
float(x * width + w * width),
float(y * height + h * height),
)
|
|
||
|
|
||
| def _mask_to_polygons(mask, tolerance): | ||
| # bbox_bounds is optional. When provided, we clamp polygon vertices so they |
There was a problem hiding this comment.
Can you remove this comments, please
| # find_contours at 0.5 produces sub-pixel coordinates, and bbox coords may be float. | ||
| # Clamp to bbox bounds to ensure COCO validity (segmentation must be within bbox). | ||
| if bbox_bounds is not None: | ||
| xmin, ymin, xmax, ymax = bbox_bounds | ||
| contour[:, 0] = np.clip(contour[:, 0], xmin, xmax) | ||
| contour[:, 1] = np.clip(contour[:, 1], ymin, ymax) | ||
|
|
||
| segmentation = contour.ravel().tolist() | ||
|
|
||
| # After padding and subtracting 1 there may be -0.5 points | ||
| segmentation = [0 if i < 0 else i for i in segmentation] | ||
| # After padding and subtracting 1 there may be negative points | ||
| segmentation = [max(0, i) for i in segmentation] |
There was a problem hiding this comment.
Agree with Rabbit, can you apply this one, please
| @@ -0,0 +1,96 @@ | |||
| import unittest.mock as mock | |||
There was a problem hiding this comment.
Can you add the docstring here
"""
Tests for fiftyone/utils/coco.py polygon clamping.
| Copyright 2017-2026, Voxel51, Inc.
| `voxel51.com <https://voxel51.com/>`_
|
"""
|
|
||
| if bbox is not None: | ||
| if num_decimals is not None: | ||
| # NOTE: bbox rounding for Detection is handled above (before segmentation) |
brimoor
left a comment
There was a problem hiding this comment.
Just flagging: we should reconcile this PR with voxel51/eta#693 before merging, as they are likely related.
What changes are proposed in this pull request?
This PR fixes an issue where COCO segmentation polygon vertices could drift slightly outside their corresponding bounding boxes (typically by ~1px) when exporting instance masks.
The root cause was a mismatch between float bounding box coordinates (e.g. 753.65) and pixel-grid–derived polygon vertices. This resulted in invalid COCO annotations where segmentation points exceeded bbox bounds.
The fix clamps polygon vertices to the computed bounding box limits during mask → polygon conversion, preserving the mask shape while ensuring COCO validity.
A regression unit test has been added to verify that exported COCO polygons always lie within their bounding boxes across multiple edge cases (float bbox edges, near-origin bboxes, and tiny boxes).
Fixes #2847.
How is this patch tested? If it is not, please explain why.
Added a new unit test: test_coco_polygon_clamp.py that constructs synthetic detections with masks and float bounding boxes and asserts that all exported COCO polygon vertices remain within bbox bounds.
Verified locally that the test fails prior to the fix and passes after applying the clamp logic.
Manually validated on a small dataset that COCO exports no longer contain out-of-bounds polygon vertices.
Release Notes
Is this a user-facing change that should be mentioned in the release notes?
notes for FiftyOne users.
Release note:
Fixes an issue where COCO segmentation polygons exported from instance masks could contain vertices slightly outside their bounding boxes, improving COCO annotation validity for downstream tools.
What areas of FiftyOne does this PR affect?
fiftyonePython library changesSummary by CodeRabbit
Bug Fixes
Tests