Jeff/fix/rconnect2#1784

Open
jeff-hykin wants to merge 50 commits into dev from jeff/fix/rconnect2

Conversation

@jeff-hykin
Member

@jeff-hykin jeff-hykin commented Apr 13, 2026

NOTE: this will become ready-for-review once this dimos-viewer PR is merged

Problem(s)

  1. dimos-viewer with --connect doesn't work for remote connections.
  2. There's no way to remap the channel dimos-viewer publishes on. It publishes to a hardcoded LCM path, cmd_vel, which makes stream renaming a pain: everything else must be renamed to tell whether a cmd_vel came from the viewer or from a planner/module.
  3. We do the "if global_config == rerun" check in multiple places.

Solution

Use websockets instead of LCM for the viewer.

  • A dimos module starts a websocket server listening for clicked_point and tele_cmd_vel
  • The dimos-viewer tries to connect to a websocket and publishes clicked_point and tele_cmd_vel on there.
  • There's a consolidated vis_module to de-dup the visualization logic
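
As a rough illustration of the server side of this design, here is a minimal sketch of routing incoming viewer messages to per-stream subscribers. The wire format is an assumption: this description does not show it, so the JSON object with a "type" field and the `ViewerMessageRouter` class are hypothetical and are not the actual RerunWebSocketServer code. Only the two stream names come from the PR.

```python
import json
from typing import Any, Callable

Handler = Callable[[dict[str, Any]], None]


class ViewerMessageRouter:
    """Hypothetical router: delivers decoded viewer messages to subscribers."""

    def __init__(self) -> None:
        # The two streams named in the PR description.
        self._handlers: dict[str, list[Handler]] = {
            "clicked_point": [],
            "tele_cmd_vel": [],
        }

    def subscribe(self, stream: str, handler: Handler) -> None:
        self._handlers[stream].append(handler)

    def dispatch(self, raw: str) -> bool:
        """Return True if the message was delivered, False if it was dropped."""
        try:
            msg = json.loads(raw)
        except json.JSONDecodeError:
            return False  # malformed JSON is dropped; the caller keeps running
        if not isinstance(msg, dict):
            return False
        stream = msg.get("type")  # assumed field name, not taken from the PR
        if not isinstance(stream, str) or stream not in self._handlers:
            return False
        for handler in self._handlers[stream]:
            handler(msg)
        return True


router = ViewerMessageRouter()
clicks: list[dict[str, Any]] = []
router.subscribe("clicked_point", clicks.append)
delivered = router.dispatch('{"type": "clicked_point", "x": 1.0, "y": 2.0, "z": 0.0}')
dropped = router.dispatch("not json")
```

Keeping the decode-and-route step separate from the socket loop means it can be unit tested without opening a port.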

Breaking Changes

None

How to Test

You'll need the jeff/fix/connect branch of dimos-viewer compiled and its Python wheel installed into dimos:

# Build and install dimos-viewer (requires pixi)
cd <path-to-dimos-viewer>
git checkout jeff/fix/connect
pixi run build
uv pip install target/wheels/dimos_viewer-*.whl --force-reinstall
# Unit + E2E tests
uv run pytest dimos/visualization/rerun/test_websocket_server.py dimos/visualization/rerun/test_viewer_ws_e2e.py -v --timeout=30 -k "not test_viewer_ws_client_connects"

Alternatively

# Terminal 1: start the websocket server
python -c "
from dimos.visualization.rerun.websocket_server import RerunWebSocketServer
import threading
server = RerunWebSocketServer(port=3030)
server.clicked_point.subscribe(lambda pt: print(f'[CLICK] {pt.x:.3f},{pt.y:.3f},{pt.z:.3f}'))
server.tele_cmd_vel.subscribe(lambda tw: print(f'[TWIST] {tw}'))
server.start()
threading.Event().wait()
"

# Terminal 2: start the viewer
dimos-viewer --connect rerun+http://localhost:9877/proxy --ws-url ws://127.0.0.1:3030/ws

Click in the 3D viewport and use the WASD keys; you should see [CLICK] and [TWIST] lines in terminal 1.

Contributor License Agreement

  • I have read and approved the CLA.

@greptile-apps
Contributor

greptile-apps Bot commented Apr 13, 2026

Greptile Summary

This PR replaces the LCM-based viewer interaction with a WebSocket server (RerunWebSocketServer) to fix remote --connect mode in dimos-viewer. It consolidates duplicated per-blueprint visualization setup into a single vis_module() factory, and adds a MovementManager module that muxes teleop and nav velocity commands cleanly.
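
For readers unfamiliar with the muxing idea, the following is a minimal sketch of teleop-priority muxing with a cooldown. It is not the MovementManager implementation: the class name, the 1-second default cooldown, and the injectable clock are assumptions chosen for illustration.

```python
import time
from typing import Any, Callable, Optional


class TeleopPriorityMux:
    """Hypothetical mux: teleop always passes; nav is suppressed during cooldown."""

    def __init__(
        self,
        publish: Callable[[Any], None],
        cooldown_s: float = 1.0,  # assumed value, not taken from the PR
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._publish = publish
        self._cooldown_s = cooldown_s
        self._clock = clock
        self._last_teleop: Optional[float] = None

    def on_teleop(self, twist: Any) -> None:
        # Record when the operator last drove, then forward the command.
        self._last_teleop = self._clock()
        self._publish(twist)

    def on_nav(self, twist: Any) -> None:
        # Drop planner commands while the operator was recently in control.
        if self._teleop_active():
            return
        self._publish(twist)

    def _teleop_active(self) -> bool:
        if self._last_teleop is None:
            return False
        return self._clock() - self._last_teleop < self._cooldown_s
```

Injecting the clock lets a test advance time deterministically instead of sleeping.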

The mechanical changes are solid — subscriptions are properly cleaned up via register_disposable, the MovementManager logic is well-structured, and the websocket server design is clean. Several P1 items flagged in earlier reviews (port collision on 9877, _server_ready never cleared on restart, misleading gRPC hints in foxglove mode, silent bind failure in start()) remain open and should be resolved before merging.

Confidence Score: 3/5

Do not merge yet — several P1 issues from prior rounds remain open (port collision, _server_ready restart bug, silent bind failure, misleading foxglove hints); PR is also self-described as not-ready-for-review pending the dimos-viewer companion PR.

Multiple open P1 findings from previous review rounds are still present in the code: RERUN_WEB_PORT/gRPC port collision at 9877, _server_ready threading.Event never cleared between start/stop cycles, start() not surfacing bind failures, and misleading --connect hints in foxglove mode. The core WebSocket server design and MovementManager logic are sound, but those unresolved issues pull the score well below the P1 ceiling of 4.
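
To make the port finding concrete, here is a small sketch of a consistency check over the values reported in this review. The constant values are the ones quoted in the findings; the `find_port_problems` helper is hypothetical and is not part of the PR.

```python
from urllib.parse import urlsplit

# Values as reported in this review.
RERUN_GRPC_PORT = 9876
RERUN_WEB_PORT = 9877
DEFAULT_CONNECT_URL = "rerun+http://localhost:9877/proxy"


def find_port_problems(grpc_port: int, web_port: int, connect_url: str) -> list[str]:
    """Return human-readable inconsistencies between the port settings."""
    problems: list[str] = []
    if grpc_port == web_port:
        problems.append(f"gRPC and web viewer share port {grpc_port}")
    url_port = urlsplit(connect_url).port
    if url_port != grpc_port:
        problems.append(f"connect URL uses port {url_port}, gRPC constant is {grpc_port}")
    if url_port == web_port:
        problems.append(f"connect URL port {url_port} collides with the web viewer port")
    return problems


for problem in find_port_problems(RERUN_GRPC_PORT, RERUN_WEB_PORT, DEFAULT_CONNECT_URL):
    print(problem)
```

A check of this kind could run as a unit test so the constants and the default URL cannot drift apart unnoticed.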

dimos/visualization/rerun/websocket_server.py and dimos/visualization/rerun/constants.py need the most attention (port mismatch, restart bug, silent bind failure).
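
One way to address the restart and bind findings is sketched below. It is a simplified, synchronous stand-in: the real server presumably runs a websocket loop in a background thread, and `RestartableServer` with its method names is hypothetical. The two points it illustrates are clearing the ready Event on every start/stop cycle and letting a bind failure propagate out of start().

```python
import socket
import threading
from typing import Optional


class RestartableServer:
    """Hypothetical lifecycle sketch; not the RerunWebSocketServer code."""

    def __init__(self, host: str = "127.0.0.1", port: int = 0) -> None:
        self._host = host
        self._port = port  # 0 asks the OS for a free port
        self._sock: Optional[socket.socket] = None
        self._ready = threading.Event()

    def start(self) -> int:
        """Bind and listen; raises OSError if the port cannot be bound."""
        self._ready.clear()  # a previous cycle must not leak a stale "ready"
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((self._host, self._port))
            sock.listen()
        except OSError:
            sock.close()
            raise  # surface the bind failure instead of succeeding silently
        self._sock = sock
        self._ready.set()
        return sock.getsockname()[1]

    def stop(self) -> None:
        self._ready.clear()
        if self._sock is not None:
            self._sock.close()
            self._sock = None

    def wait_ready(self, timeout: float) -> bool:
        return self._ready.wait(timeout)
```

With this shape, a caller that waits on readiness after a failed restart times out instead of proceeding against a dead socket.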

Important Files Changed

Filename Overview
dimos/visualization/rerun/websocket_server.py New WebSocket server for viewer ↔ DimOS communication; has open issues: _server_ready never cleared on restart, start() silently succeeds on bind failure, misleading connect hints use RERUN_GRPC_PORT (9876) while the gRPC server runs on 9877
dimos/visualization/rerun/bridge.py Refactored to use centralized constants; default connect_url uses port 9877 while RERUN_GRPC_PORT constant is 9876, and RERUN_WEB_PORT is also 9877 — web viewer and gRPC server would collide if web mode is enabled
dimos/visualization/vis_module.py New factory function consolidates per-blueprint vis setup; foxglove branch includes RerunWebSocketServer which logs misleading Rerun gRPC connect hints when no gRPC server is present
dimos/visualization/rerun/constants.py New constants module; RERUN_GRPC_PORT=9876 conflicts with connect_url default port 9877, and RERUN_WEB_PORT=9877 collides with that gRPC port if web viewer is enabled
dimos/navigation/smart_nav/modules/movement_manager/movement_manager.py New module cleanly muxes teleop and nav velocity; subscriptions properly registered via register_disposable; teleop cooldown and goal cancellation logic looks correct
dimos/core/global_config.py ViewerBackend narrows to rerun/foxglove/none (removes rerun-web/rerun-connect); rerun_open and rerun_web promoted to global config fields
dimos/robot/cli/dimos.py rerun-bridge command rewritten inline (removes run_bridge); global_config.update() called early in run() as workaround for blueprint evaluation timing; signal.pause() is Unix-only
dimos/visualization/rerun/test_websocket_server.py Good unit test coverage; server fixture lacks try/finally around wait_for_server — TimeoutError would leave port 13031 bound for subsequent tests
dimos/visualization/rerun/test_viewer_ws_e2e.py Good E2E coverage of viewer protocol; server fixture lacks try/finally — wait_for_server TimeoutError would leave port 13032 bound; TestViewerBinaryConnectMode.test_viewer_ws_client_connects is correctly skipped
dimos/navigation/replanning_a_star/module.py cmd_vel output renamed to nav_cmd_vel to connect with MovementManager; stop_movement input added with proper optional subscription pattern
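
The fixture finding in the two test-file rows can be sketched as follows. This is a plain generator so it runs without pytest. In the test suite it would presumably be decorated with `@pytest.fixture` and take its dependencies differently, so the parameter names here are assumptions.

```python
from typing import Any, Callable, Iterator


def server_fixture(
    make_server: Callable[[int], Any],
    wait_for_server: Callable[[int], None],
    port: int,
) -> Iterator[Any]:
    """Yield a started server and always stop it, even if startup times out."""
    server = make_server(port)
    server.start()
    try:
        # If this raises TimeoutError, the finally block still releases the port.
        wait_for_server(port)
        yield server
    finally:
        server.stop()
```

Because the wait sits inside the try block, a startup timeout no longer leaves the port bound for the tests that follow.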

Reviews (14): Last reviewed commit: "Merge branch 'dev' into jeff/fix/rconnec..."

Comment thread dimos/visualization/rerun/websocket_server.py
Comment thread dimos/visualization/rerun/constants.py Outdated
Comment thread dimos/visualization/rerun/test_websocket_server.py Outdated
Comment thread dimos/visualization/rerun/websocket_server.py Outdated
@jeff-hykin jeff-hykin enabled auto-merge (squash) April 13, 2026 18:22
@jeff-hykin jeff-hykin mentioned this pull request Apr 17, 2026
6 tasks
Comment thread dimos/visualization/rerun/constants.py Outdated
Comment thread dimos/visualization/rerun/constants.py
@jeff-hykin
Member Author

jeff-hykin commented Apr 21, 2026

Sorry, I forgot to port the project.toml change from rosnav8 back to this branch. Ready now, though.

Comment thread dimos/visualization/rerun/websocket_server.py Outdated
Comment thread dimos/visualization/rerun/websocket_server.py Outdated
Comment thread dimos/visualization/rerun/websocket_server.py Outdated
Comment thread dimos/visualization/rerun/websocket_server.py Outdated
Comment thread dimos/navigation/smart_nav/modules/movement_manager/movement_manager.py Outdated
Comment thread dimos/core/global_config.py
Comment thread dimos/robot/cli/dimos.py Outdated
Comment thread dimos/visualization/rerun/bridge.py Outdated
Comment thread dimos/visualization/rerun/bridge.py Outdated
Comment thread dimos/visualization/rerun/bridge.py Outdated
Comment thread dimos/visualization/rerun/conftest.py Outdated
Comment thread dimos/visualization/rerun/conftest.py Outdated
module = RerunWebSocketServer(port=_E2E_PORT)
module.start()
wait_for_server(_E2E_PORT)
yield module # type: ignore[misc]
Contributor

No need for ignore, we don't type check tests.

Comment on lines +184 to +187
shutil.which("dimos-viewer") is None
or "--connect"
not in subprocess.run(["dimos-viewer", "--help"], capture_output=True, text=True).stdout,
reason="dimos-viewer binary not installed or does not support --connect",
Contributor

Why skip? If dimos-viewer is not present or doesn't support --connect that sounds to me like we should know that. I.e., the test should fail.

yield publisher # type: ignore[misc]


# ── Tests ────────────────────────────────────────────────────────────────
Contributor

Suggested change
# ── Tests ────────────────────────────────────────────────────────────────


def test_invalid_json_does_not_crash(server: RerunWebSocketServer) -> None:
"""Malformed JSON is silently dropped; server stays alive for the next message."""
import websockets.asyncio.client as ws_client
Contributor

Please move to the top.

manager._on_teleop(_twist(lx=0.3))

# Nav is suppressed
manager.cmd_vel.publish.reset_mock() # type: ignore[union-attr]
Contributor

None of these type ignores are needed. We don't type check tests.

import time
from typing import Any

from dimos_lcm.std_msgs import Bool # type: ignore[import-untyped]
Contributor

Why is this ignore needed? I can see the import in other files without any ignore.

# Conflicts:
#	dimos/robot/unitree/go2/blueprints/smart/unitree_go2.py
#	dimos/visualization/rerun/bridge.py
The section-marker test walks REPO_ROOT and was catching personal
overlay scripts that live outside the main project tree.
paul-nechifor previously approved these changes Apr 24, 2026
The toolz pipe() function returns Any, which triggers mypy's
no-any-return when used in a function with a declared return type.
Address Paul's review nit: inline imports of websockets.asyncio.client
moved to module-level imports in test_websocket_server.py and
test_viewer_ws_e2e.py.
- Move all inline `import rerun` to top of bridge.py (rerun already
  loaded via other top-level imports)
- Convert wait_for_server from conftest import to pytest fixture
- Move websockets import to top of conftest.py
- Change RERUN_WEB_PORT from 9090 to 9877 (9090 conflicts with VPN/TOR)
Comment thread dimos/visualization/rerun/constants.py
Ruff requires `import X` after `from X import Y` within the same
import group. Fixes pre-commit failure.
Revert the .ignore.enhance entry in test_no_sections.py and replace
with .hidden. Add .hidden/ to .gitignore for personal/overlay dirs.