Fix realtime disconnect cancellation and response timeouts #225

Closed
andimarafioti wants to merge 1 commit into huggingface:main from andimarafioti:codex/fix-realtime-disconnect-and-timeout
@andimarafioti (Member)

Summary

  • cancel and flush realtime pipeline state when the last websocket client disconnects
  • keep the discard guard active across reconnects so stale outputs from the previous session are not sent to a new client
  • ignore SESSION_END control messages in the realtime audio send loop
  • add an explicit OpenAI Responses read timeout and end the current response cleanly on timeout
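
The disconnect-related bullets above can be illustrated with a small, self-contained sketch. It is not the code from this PR: `Session`, `ControlMessage`, `ControlKind`, and `pump` are hypothetical stand-ins, and the choice to clear the discard guard when the old session's `SESSION_END` marker arrives is an assumption made for the example.

```python
from collections import deque
from dataclasses import dataclass, field
from enum import Enum, auto


class ControlKind(Enum):
    SESSION_END = auto()


@dataclass(frozen=True)
class ControlMessage:
    """Hypothetical stand-in for a pipeline control message."""
    kind: ControlKind


@dataclass
class Session:
    """Hypothetical realtime session state (all names are illustrative)."""
    clients: set = field(default_factory=set)
    outputs: deque = field(default_factory=deque)  # pipeline -> send loop
    discarding: bool = False                       # the discard guard
    sent: list = field(default_factory=list)       # audio that reached a client

    def connect(self, client_id: str) -> None:
        # Reconnecting does NOT clear the guard: stale output may still arrive.
        self.clients.add(client_id)

    def disconnect(self, client_id: str) -> None:
        self.clients.discard(client_id)
        if not self.clients:
            # Last client left: flush queued output and start discarding.
            self.outputs.clear()
            self.discarding = True

    def pump(self) -> None:
        """Drain queued pipeline output the way a send loop might."""
        while self.outputs:
            item = self.outputs.popleft()
            if isinstance(item, ControlMessage):
                # Control messages are never audio bytes, so never send them.
                # Assumption: SESSION_END also marks the end of the cancelled
                # session's output, so the guard can be lifted here.
                if item.kind is ControlKind.SESSION_END:
                    self.discarding = False
                continue
            if self.discarding or not self.clients:
                continue  # stale audio from the previous session
            self.sent.append(item)


session = Session()
session.connect("a")
session.disconnect("a")  # last client gone, guard switched on
session.connect("b")     # new client, guard still on
session.outputs.extend(
    [b"stale", ControlMessage(ControlKind.SESSION_END), b"fresh"]
)
session.pump()
print(session.sent)  # [b'fresh']
```

In this sketch the stale chunk is dropped even though a new client is already connected, and the control message is skipped instead of being treated as audio.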

Why

Two failure modes showed up in realtime use:

  1. A stuck upstream responses.create(..., stream=True) call could hang for minutes before failing, which meant the robot never answered.
  2. After disconnect/reconnect, stale output from the previous conversation could leak into the new session. When the old session finally unwound, the realtime send loop could also crash on PipelineControlMessage because it tried to treat SESSION_END as audio bytes.

This patch makes disconnect behave like a real cancellation boundary and makes the LLM client fail fast enough to recover instead of leaving the session stuck.
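
The fail-fast behavior for a stalled stream can be sketched with the standard library alone. This is a simulation under stated assumptions, not the PR's implementation and not the OpenAI client: `collect_response` and `stalled_upstream` are hypothetical names, and the per-event timeout via `asyncio.wait_for` is only one possible way to bound how long a read may block.

```python
import asyncio


async def collect_response(events, read_timeout: float):
    """Collect text deltas; return (text, timed_out).

    Each read of the next event is bounded by read_timeout seconds, so a
    stalled stream ends the response instead of hanging the session.
    """
    parts = []
    iterator = events.__aiter__()
    while True:
        try:
            delta = await asyncio.wait_for(
                iterator.__anext__(), timeout=read_timeout
            )
        except StopAsyncIteration:
            return "".join(parts), False  # stream finished normally
        except asyncio.TimeoutError:
            return "".join(parts), True   # end the response cleanly
        parts.append(delta)


async def stalled_upstream():
    """Hypothetical upstream that sends two deltas and then hangs."""
    yield "Hel"
    yield "lo"
    await asyncio.sleep(3600)  # simulates a hung upstream stream
    yield "never delivered"


text, timed_out = asyncio.run(
    collect_response(stalled_upstream(), read_timeout=0.05)
)
print(text, timed_out)  # Hello True
```

The caller gets back whatever text arrived before the stall plus a flag, so it can close the current response and accept the next turn.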

Validation

  • python3 -m py_compile LLM/openai_api_language_model.py api/openai_realtime/websocket_router.py tests/test_openai_api_language_model.py tests/openai_realtime/test_websocket_router.py
  • uv run ruff check LLM/openai_api_language_model.py api/openai_realtime/websocket_router.py tests/test_openai_api_language_model.py tests/openai_realtime/test_websocket_router.py
  • uv run pytest -q tests/openai_realtime/test_websocket_router.py
  • uv run pytest -q tests/test_openai_api_language_model.py -k "streams_text_from_response_events or handles_cancellation or read_timeout or disable_thinking or no_disable_thinking"

Note

tests/test_openai_api_language_model.py::test_second_turn_flattens_assistant_history_for_responses is already failing on upstream/main for an unrelated assistant-history formatting issue, so I kept this PR scoped to the realtime cancellation/timeout fixes above.

@andimarafioti andimarafioti changed the title from "[codex] Fix realtime disconnect cancellation and response timeouts" to "Fix realtime disconnect cancellation and response timeouts" Apr 7, 2026
@andimarafioti andimarafioti marked this pull request as ready for review April 7, 2026 17:05
@A-Mahla A-Mahla closed this Apr 8, 2026
