[MISC] Fix flaky CI: isolate offscreen GL contexts, stabilize render/physics tests.#2951
Conversation
…e the pixel-match helper
…test_nonconvex_concentric_contact
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9d959c4da4
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| from PIL import Image | ||
| from syrupy.extensions.image import PNGImageSnapshotExtension | ||
|
|
||
| from .utils import IMG_BLUR_KERNEL_SIZE, IMG_NUM_ERR_THR, IMG_STD_ERR_THR, assert_pixel_match |
There was a problem hiding this comment.
Move the utils import after test environment setup
This top-level import now loads tests/utils.py, which imports both mujoco and genesis, before this file sets MUJOCO_GL and PYGLET_HEADLESS below. On headless Linux/EGL workers that means MuJoCo/pyglet/Genesis can be initialized with the wrong GL backend during pytest collection, before the headless settings at lines 65-75 take effect. Please keep the pixel constants/helper import below the environment setup or split the helper into a module that does not import Genesis/MuJoCo.
Useful? React with 👍 / 👎.
| # process/thread-global, so making it current here would clobber the context another renderer may be | ||
| # using (e.g. while it is mid-render, when this renderer is being torn down by garbage collection). | ||
| # 'delete_context' makes itself current only when the platform requires it. | ||
| self._platform.delete_context() |
There was a problem hiding this comment.
Stop teardown from making stale EGL contexts current
This change only stops OffscreenRenderer.delete() itself from forcing its context current, but the normal scene teardown path still calls self._renderer.make_current(); self._renderer.delete() in Rasterizer.destroy(). In the mid-render GC/scene_b.destroy() scenario modeled by the new test, scene B therefore still steals the thread's current EGL context and then EGLPlatform.delete_context() uncurrents it, leaving scene A with no current context. Please remove or guard the caller-side make_current() as well, otherwise the production path and the newly required regression test still hit the original failure.
Useful? React with 👍 / 👎.
Description
Fixes flaky CI failures on software renderers and GPU runners. Three independent changes:
Attempt to retrieve context when no valid context.OffscreenRenderer.delete()no longer forces its context current, andEGLPlatform.delete_context()releases the current context only when it is actually its own. Only surfaces on EGL (Linux/GPU).test_deformable_uv_texturesdisables shadows and shrinks the ground plane to stay within the frustum on the Apple Software Renderer (Rasterizer only; the RayTracer keeps its scene and snapshot). The pixel comparison is factored into a sharedassert_pixel_matchhelper.test_convexify(0.5 -> 0.6) andtest_nonconvex_concentric_contact(0.06 -> 0.07).How Has This Been Tested?
New regression test
test_offscreen_context_isolationforces the mid-render teardown: it fails onmainwith the exact error and passes with the fix (verified on Linux EGL via Mesa llvmpipe and locally on macOS). Render and rigid-physics suites pass locally.Checklist: