[FEATURE] Enable/disable collision and visualization for rigid geometries.#2695
[FEATURE] Enable/disable collision and visualization for rigid geometries.#2695vlordier wants to merge 18 commits intoGenesis-Embodied-AI:mainfrom
Conversation
- Add history_length parameter to ContactForce sensor options - Override read() to return historical force readings from ring buffer - Update return format to include history dimension - Extend ring buffer size to accommodate history
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a47f265b5c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| """ | ||
| Read the ground truth sensor data (without noise). | ||
| """ | ||
| return self.read(envs_idx) |
There was a problem hiding this comment.
Return ground-truth cache in read_ground_truth
read_ground_truth() now calls read(), but the default history_length == 1 path in read() fetches from get_cloned_from_cache(self) (non-ground-truth cache). That means callers asking for ground truth can silently receive delayed/noisy/clipped values whenever sensor noise options are enabled, which breaks the method contract and can corrupt evaluation/training logic that depends on raw forces.
Useful? React with 👍 / 👎.
| if n_envs == 0: | ||
| hist = hist.reshape(3) | ||
| else: | ||
| hist = hist.reshape(n_envs, 3) |
There was a problem hiding this comment.
Reshape history using requested env count
In batched mode, this reshape uses total scene env count (n_envs) instead of the number of requested environments (len(envs_idx)). Calling read(envs_idx=...) with a subset (e.g., one env out of many) will produce a size mismatch and raise at runtime when history_length > 1.
Useful? React with 👍 / 👎.
| n_envs = self._manager._sim.n_envs | ||
| history_data = [] | ||
| for i in range(history_length): | ||
| hist = buffered_data.at(i, envs_idx, cache_slice) |
There was a problem hiding this comment.
Keep history reads consistent with sensor noise model
For history_length > 1, read() pulls values directly from _buffered_data instead of the processed sensor cache. In this class, _buffered_data is written before delay/jitter/noise/quantization/clipping are applied, so history reads bypass the configured sensor model and return raw contact forces while read() is expected to return measured data.
Useful? React with 👍 / 👎.
| self.active_envs_mask = torch.zeros(self._solver._B, dtype=torch.bool, device=gs.device) | ||
| self.active_envs_idx = np.array([], dtype=np.int_) |
There was a problem hiding this comment.
Propagate collision toggle to collider inputs
disable_collision() only updates Python-side active_envs_mask/active_envs_idx, but collision broadphase/narrowphase consume solver fields (geom ranges/pair filters) and do not read these attributes. Because this method never updates solver/collider state, calling it after build does not actually remove the geom from collision detection/response.
Useful? React with 👍 / 👎.
| self.active_envs_mask = torch.zeros(self._solver._B, dtype=torch.bool, device=gs.device) | ||
| self.active_envs_idx = np.array([], dtype=np.int_) |
There was a problem hiding this comment.
Propagate visualization toggle to render inputs
disable_visualization() also only mutates active_envs_mask/active_envs_idx, but those flags are not consumed by the rendering pipeline (they are only used in AABB helper paths). As a result, this API does not hide the visual geom in rendered frames even though the method name and docstring promise that behavior.
Useful? React with 👍 / 👎.
| x_idx = x / h_scale | ||
| y_idx = y / h_scale |
There was a problem hiding this comment.
Convert world query point into terrain-local frame
These index computations treat world (x, y) as if terrain coordinates start at the origin with no rotation. Terrain entities can be instantiated with non-zero morph.pos/morph.quat, so get_height_at()/get_normal_at() return incorrect values whenever the terrain is translated or rotated.
Useful? React with 👍 / 👎.
- Fix reshape crash in read() for batched env subsets: use n_query_envs instead of global n_envs when reshaping history slices - Restore read_ground_truth() contract: return noise-free ground truth cache instead of delegating to read() which returns processed data - Fix _draw_debug() env index: use cache[env_idx] instead of hardcoded cache[0] for correct environment in multi-env scenes Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- Adds entity.get_height_at(x, y) to query terrain height at world position using bilinear interpolation from height field - Adds entity.get_normal_at(x, y) to compute surface normal at world position from height field gradient - Both methods handle boundary conditions gracefully Closes Genesis-Embodied-AI#2094
…se transform - Fix coordinate indexing: hf[y,x] -> hf[x,y] since heightfield is stored as [row, col] where row corresponds to x - Add pose transformation: convert world coords to terrain local frame using inv_transform_by_trans_quat(terrain_pos, terrain_quat) - Transform normals back to world frame with transform_by_quat - Height now includes terrain z-offset (terrain_pos[2]) Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
…ents - Add fallback to in _get_gpu_indices() when /proc/driver/nvidia/gpus/ is unavailable - Add fallback to in _torch_get_gpu_idx() when /proc interface is missing - Handles cloud GPU instances and containers lacking /proc interface - Graceful degradation to single-GPU mode when both methods fail Closes Genesis-Embodied-AI#2683
…th collider - enable_collision() and disable_collision() only update Python-side tracking attributes; the collider pipeline does not yet consume active_envs_mask/active_envs_idx for collision filtering - Add runtime warnings and doc warnings to prevent silent failures - Visualization toggles (enable/disable_visualization) work correctly as they are consumed by the rasterizer context Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
a47f265 to
88950de
Compare
…odied-AI#2602) - Add target_entity_idx parameter to Raycaster options - Filter raycast hits to only include geometry from target entity - Add kernel entity filtering logic (face -> geom -> link -> entity chain) - Add example script demonstrating mesh-specific raycasting - Useful for manipulation RL where surface observations are needed from specific objects only Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
…-Embodied-AI#2656) - Add PBD solver support to GenesisGeomRetriever - Add retrieve_pbd_meshes_static() method for PBD visual meshes - Combine rigid and PBD mesh data for Madrona renderer - PBD entities with vis_mode='visual' are now rendered in batch camera output Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
…karound - Update enable_collision()/disable_collision() warnings to suggest entity hibernation as the proper workaround - Add docstring examples showing entity.hibernate()/wake_up() usage - Clarify that geom-level collision filtering is planned for future release Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
…e (Issue Genesis-Embodied-AI#1899) - Fix _draw_debug() to use read_ground_truth() instead of read() - Debug spheres now show actual contact state, not delayed sensor readings - Eliminates visual mismatch reported in issue Genesis-Embodied-AI#1899 - Add test script demonstrating the fix Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
…ed-AI#2563) - Add process_input() method to KinematicEntity to apply target commands - Add substep() method to KinematicSolver for velocity integration - Add kernel_integrate_dofs_velocity() to integrate vel to pos - Fix set_dofs_velocity() not persisting after scene.step() - Add test script demonstrating the fix Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
|
@claude review |
| def substep(self, f): | ||
| self._coupler.preprocess(f) | ||
| self.substep_pre_coupling(f) | ||
| # Call substep on each active solver for time integration | ||
| for solver in self._active_solvers: | ||
| solver.substep(f) | ||
| self._coupler.couple(f) | ||
| self.substep_post_coupling(f) | ||
|
|
There was a problem hiding this comment.
🔴 The new loop for solver in self._active_solvers: solver.substep(f) in simulator.substep() causes RigidSolver.substep() to execute twice per substep in the non-_rigid_only path. Since substep_pre_coupling() already calls rigid_solver.substep(f) internally, the full rigid physics pipeline runs a second time, causing physics instability. The fix is to guard the loop to only call substep() on non-RigidSolver solvers (e.g., KinematicSolver).
Extended reasoning...
What the bug is: In simulator.substep() (line 323-325), this PR adds a new loop for solver in self._active_solvers: solver.substep(f). The intent is to invoke kinematic_solver.substep(f) for DOF velocity integration — a genuinely new feature. However, the loop iterates over all active solvers, including RigidSolver.
The specific code path that triggers it: In the non-_rigid_only path, simulator.substep() calls self.substep_pre_coupling(f), which in turn iterates over all active solvers and calls rigid_solver.substep_pre_coupling(f). Per the verifiers, RigidSolver.substep_pre_coupling() at line 1238 already internally calls self.substep(f) — running the full rigid physics pipeline (kernel_step_1, kernel_step_2, etc.) as its first action. Immediately after, the new loop calls rigid_solver.substep(f) a second time.
Why existing code does not prevent it: The _rigid_only fast path in simulator.step() bypasses simulator.substep() entirely and calls rigid_solver.substep() directly — so that path is unaffected. But the _rigid_only=False condition is triggered by: any multi-solver scene (Rigid+Kinematic, Rigid+MPM, Rigid+PBD, etc.), use of SAPCoupler or IPCCoupler (SAPCoupler is the default, making even pure rigid scenes vulnerable), or requires_grad=True. In all these cases the general substep() path is used and double-execution occurs.
Impact: The rigid physics pipeline runs twice per substep: positions and velocities are integrated, constraints solved, and collisions resolved twice. This produces incorrect trajectories, energy non-conservation, and potentially explosive instability, especially visible in mixed-solver scenes and any scene using the default SAPCoupler.
Step-by-step proof: Consider a scene with a rigid box plus a kinematic entity (forces _rigid_only=False). At each substep: (1) simulator.substep(f) is called; (2) substep_pre_coupling(f) runs -> rigid_solver.substep_pre_coupling(f) -> internally calls self.substep(f) [first execution of full rigid physics]; (3) then the new loop runs -> rigid_solver.substep(f) [second execution]; (4) then coupling and post-coupling proceed on the already-doubly-integrated state. The kinematic solver call in step (3) is valid, but including the rigid solver is the error.
How to fix it: The loop should skip RigidSolver instances, for example:
for solver in self._active_solvers: if not isinstance(solver, RigidSolver): solver.substep(f)
Alternatively, RigidSolver.substep_pre_coupling could stop calling self.substep(f) internally and delegate to the new loop exclusively, but that is a larger refactor.
| if envs_idx is None: | ||
| n_query_envs = n_envs if n_envs > 0 else 0 | ||
| else: | ||
| n_query_envs = len(envs_idx) | ||
|
|
||
| history_data = [] | ||
| for i in range(history_length): | ||
| hist = buffered_data.at(i, envs_idx, cache_slice) | ||
| if n_envs == 0: | ||
| hist = hist.reshape(3) | ||
| else: | ||
| hist = hist.reshape(n_query_envs, 3) | ||
| history_data.append(hist) | ||
|
|
||
| result = torch.stack(history_data, dim=1) | ||
| return result.squeeze(1) if n_envs == 0 else result | ||
|
|
||
| @gs.assert_built | ||
| def read_ground_truth(self, envs_idx=None) -> torch.Tensor: | ||
| """ | ||
| Read the ground truth sensor data (without noise). | ||
| """ | ||
| envs_idx = self._sanitize_envs_idx(envs_idx) | ||
| history_length = self._options.history_length | ||
|
|
||
| # Get ground truth from the ground truth cache (no noise/delay/quantization) | ||
| gt_cache = self._manager.get_cloned_from_cache(self, is_ground_truth=True) |
There was a problem hiding this comment.
🔴 When history_length > 1, ContactForceSensor.read() silently returns raw ground-truth contact forces instead of noise-processed data, because the ring buffer (_buffered_data) is populated with ground-truth values before the noise/delay/quantization pipeline is applied to shared_cache. Users who configure noise, bias, jitter, quantization, or clipping on a ContactForce sensor with history_length > 1 will receive unprocessed sensor readings without any warning. To fix this, maintain a separate ring buffer for processed (noisy) data written after the full noise pipeline runs.
Extended reasoning...
What the bug is and how it manifests:
In ContactForceSensor._update_shared_cache() (contact_force.py), processing occurs in this order: (1) buffered_data.set(shared_ground_truth_cache) stores the raw ground-truth contact forces into the ring buffer; (2) _apply_delay_to_shared_cache reads the delayed value from the ring buffer into shared_cache; (3) _add_noise_drift_bias adds noise/drift/bias to shared_cache; (4) the max-force clamp, min-force mask, and quantization are applied to shared_cache. Therefore _buffered_data exclusively holds ground-truth data, while the processed (noisy) result lives only in shared_cache.
The specific code path that triggers it:
When read() is called with history_length > 1 (lines 209-235 in the PR diff), the method iterates and reads from buffered_data.at(i, envs_idx, cache_slice) which indexes directly into _manager._buffered_data -- the ring buffer that contains only ground-truth values. In contrast, when history_length == 1, read() correctly returns self._manager.get_cloned_from_cache(self), which returns a clone of shared_cache -- the tensor that has had noise/delay/quantization applied.
Why existing code does not prevent it:
There is no secondary ring buffer tracking the post-processing state of shared_cache. The only mechanism that buffers past sensor values is _buffered_data, and by design it is written before any noise processing. The history feature was added in this PR without accounting for the noise-model indirection that separates ground truth from processed output.
Step-by-step proof:
Consider a sensor configured with noise=1.0, history_length=3, and a ground-truth contact force of [10, 0, 0] N. After _update_shared_cache runs: the ring buffer slot 0 contains [10, 0, 0] (raw); shared_cache contains approximately [10 + epsilon, epsilon, epsilon] where epsilon is drawn from N(0,1). Calling read() returns a (3, 3) tensor assembled from buffered_data.at(0,...), buffered_data.at(1,...), buffered_data.at(2,...) -- each of which is the raw ground-truth value from previous steps. None of the three history slots contain the noisy values, so the returned tensor is identical to what read_ground_truth() would return.
Impact:
Any user who sets history_length > 1 on a ContactForce sensor with noise/bias/jitter/quantization/clipping will silently receive clean physics data instead of the realistic noisy sensor readings they configured. This silently breaks sim-to-real transfer experiments and sensor-model validation that rely on historical readings. Because there is no error or warning, the issue goes completely undetected.
How to fix it:
Maintain a second ring buffer (e.g., _noisy_buffered_data) that is written after the full noise/clamp/quantization pipeline has been applied to shared_cache -- at the end of _update_shared_cache. Then have read() draw its history slices from _noisy_buffered_data instead of _buffered_data. read_ground_truth() can continue to use _buffered_data (which correctly holds raw values) for both its history_length == 1 and history_length > 1 paths.
| self.debug_objects.append(debug_sphere) | ||
|
|
||
| # Draw ray from sensor to hit point | ||
| if self._options.return_world_frame: | ||
| ray_start = self.ray_starts_world[i].cpu().numpy() | ||
| else: | ||
| ray_start = self.ray_starts[i].cpu().numpy() | ||
| link_pos = self._link.get_pos(env_idx).reshape(3) | ||
| link_quat = self._link.get_quat(env_idx).reshape(4) | ||
| ray_start = transform_by_trans_quat(ray_start, link_pos, link_quat) | ||
|
|
There was a problem hiding this comment.
🔴 In the new _draw_debug() implementation, self.ray_starts_world[i] is accessed when return_world_frame=True, but ray_starts_world is never assigned to the RaycasterSensor instance — only self.ray_starts is set in build(). This will raise AttributeError at runtime whenever draw_debug=True and return_world_frame=True. The fix is to use self._shared_metadata.ray_starts_world[i] (or compute the world-frame start by transforming self.ray_starts[i] through the link pose, consistent with the local-frame branch above it).
Extended reasoning...
What the bug is: In the rewritten RaycasterSensor._draw_debug() method, the branch handling return_world_frame=True accesses self.ray_starts_world[i] (raycaster.py line ~374). This attribute does not exist on RaycasterSensor instances — it is only declared as a field on RaycasterSharedMetadata (line 175: ray_starts_world: torch.Tensor = make_tensor_field((0, 3))).
The specific code path: Any user who creates a Raycaster sensor with both return_world_frame=True and draw_debug=True, then calls scene.step() (which triggers _draw_debug during visualization), will hit this crash. It fires on the very first rendered frame.
Why existing code does not prevent it: RaycasterSensor.init only sets self.ray_starts = torch.empty((0, 3), ...). The build() method sets self.ray_starts from the pattern transform but never sets self.ray_starts_world. There is no hasattr guard or fallback in _draw_debug().
Impact: A complete runtime crash (AttributeError: RaycasterSensor object has no attribute ray_starts_world) when debug visualization is enabled with world-frame output. This makes the newly introduced debug path fully unusable for the return_world_frame=True case.
Step-by-step proof:
- Create a Raycaster sensor with return_world_frame=True and draw_debug=True.
- Build the scene and call scene.step().
- The visualizer calls sensor._draw_debug(context).
- A hit is found (dist < max_range), so execution enters the inner block.
- Because self._options.return_world_frame is True, execution reaches: ray_start = self.ray_starts_world[i].cpu().numpy()
- Python raises AttributeError because ray_starts_world is never assigned on the sensor instance.
How to fix it: Replace self.ray_starts_world[i] with self._shared_metadata.ray_starts_world[i]. Alternatively, since ray_starts_world in the shared metadata is also never populated anywhere in this PR, the more robust fix is to compute it inline as transform_by_trans_quat(self.ray_starts[i:i+1], link_pos, link_quat).squeeze(0).cpu().numpy(), mirroring exactly how the local-frame branch below it works.
1. simulator.py: remove erroneous solver.substep() loop that caused RigidSolver.substep() to execute twice per step (once via substep_pre_coupling and again via the loop). Move kinematic DOF velocity integration to KinematicSolver.substep_pre_coupling() where it belongs. 2. contact_force.py: fix wrong torch.stack dim for non-batched history (dim=1 on shape-(3,) tensors gives (3,H) not (H,3)); use cache_size instead of hardcoded 3; deduplicate read()/read_ground_truth() history paths into _read_history(); remove unused Type import. 3. raycaster.py: fix AttributeError in _draw_debug() — self.ray_starts_world does not exist on the sensor instance (only in shared metadata, and never populated); always compute world-frame start from self.ray_starts + link transform. Remove dead ray_starts_world/ray_dirs_world metadata fields. Also replace old bilinear get_height_at/get_normal_at (no env_idx) with the triangle-based versions from PR 2691 that match the physics mesh convention and support batched scenes via env_idx.
…sGeom The rendering pipeline does not yet consume active_envs_mask for per-geom visibility filtering, mirroring the existing warning on the collision toggle methods. Flagged by Codex review on PR Genesis-Embodied-AI#2695.
Summary
Added public API methods to enable/disable collision and visualization for rigid geometries.
This feature allows users to toggle collision detection and visualization on/off for individual rigid geometries after they have been added to the scene. This is useful for:
Changes
Modified
genesis/engine/entities/rigid_entity/rigid_geom.pyto add:set_collision_enabled(bool)methodset_visualization_enabled(bool)methodUsage
Testing