
feat: surface swallowed response_future errors in interception proxy#1196

Open
rasdani wants to merge 2 commits into main from fix/interception-error-plumbing

Conversation


@rasdani rasdani commented Apr 19, 2026

Summary

Companion to #1194 (SSE keepalive). Fixes two remaining error-swallow paths in interception_utils.py analogous to the chunk-write failure path that #1191 already addressed.

Both paths currently catch exceptions from await response_future and silently let the handler return. On the streaming side, the chunk loop has already written data: [DONE] before the await, so the agent sees a clean "empty completion" → exits 0 → silent empty trajectory. On the non-streaming side, the handler returns HTTP 500 to the agent but never sets state["error"], so rubrics/metrics can't see the failure.
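The pre-fix shape reduces to a minimal sketch (the handler body is illustrative; only the response_future name and the data: [DONE] detail come from the PR):

```python
import asyncio

async def handler_before(response_future: asyncio.Future) -> str:
    """Pre-fix shape: the exception from the await is swallowed,
    so the caller (and the agent) sees a clean, empty completion."""
    try:
        await response_future
    except BaseException:
        pass  # swallowed: no state["error"] is ever set
    return "data: [DONE]"  # in the real code this was already written pre-await

async def main() -> str:
    fut: asyncio.Future = asyncio.get_running_loop().create_future()
    fut.set_exception(RuntimeError("upstream model call failed"))
    return await handler_before(fut)

result = asyncio.run(main())
```

Despite the RuntimeError set on the future, `result` is a clean `"data: [DONE]"` and the failure is invisible to everything downstream.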

Changes

  • Streaming path (_handle_streaming_response post-loop): narrow the catch from except BaseException to except Exception, plus an explicit except asyncio.CancelledError: raise (normal teardown via unregister_rollout must not be misclassified as an error). Set state["error"] = StreamInterrupted(...) via _set_rollout_error.
  • Non-streaming path (_handle_request else branch): set state["error"] = InterceptionError(...) before returning HTTP 500.
  • Add InterceptionError class alongside StreamInterrupted so rubrics/metrics can distinguish the two shapes (streaming cut → truncated SSE body; non-streaming failure → HTTP 500 to agent's OpenAI client).
  • Keep log levels at debug — messages may contain upstream internals (URLs with tokens, traceback fragments); ERROR severity would ship those to centralized log aggregation unnecessarily.
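Pattern-wise, the streaming fix looks roughly like this (StreamInterrupted and _set_rollout_error are the names the PR uses; their bodies here are minimal stand-ins):

```python
import asyncio

class StreamInterrupted(Exception):
    """Stand-in for the error class named in the PR."""

def _set_rollout_error(state: dict, err: Exception) -> None:
    # Minimal stand-in: the real helper lives in interception_utils.py.
    state["error"] = err

async def await_response_post_loop(response_future: asyncio.Future, state: dict) -> None:
    try:
        await response_future
    except asyncio.CancelledError:
        raise  # normal teardown via unregister_rollout, not an error
    except Exception as exc:
        # Surface the failure for rubrics/metrics; control flow continues,
        # since data: [DONE] was already written by the chunk loop.
        _set_rollout_error(state, StreamInterrupted(str(exc)))

async def main() -> dict:
    state: dict = {}
    fut: asyncio.Future = asyncio.get_running_loop().create_future()
    fut.set_exception(RuntimeError("vLLM raised mid-call"))
    await await_response_post_loop(fut, state)
    return state

state = asyncio.run(main())
```

Note the ordering: CancelledError must be caught before the broad Exception clause would matter on older interpreters, but since Python 3.8 CancelledError derives from BaseException, so `except Exception` alone would already let it propagate; the explicit re-raise documents the intent.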

Tests

Adds two tests in tests/test_interception_utils.py:

  • test_streaming_response_future_failure_surfaces_to_state
  • test_non_streaming_response_future_failure_surfaces_to_state

7 tests total pass locally. Ruff + format clean.
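A standalone approximation of the non-streaming test (the real fixtures in tests/test_interception_utils.py will differ; InterceptionError here is a stand-in):

```python
import asyncio

class InterceptionError(Exception):
    """Stand-in for the class the PR adds alongside StreamInterrupted."""

async def handle_non_streaming(response_future: asyncio.Future, state: dict) -> int:
    """Mimics the else branch of _handle_request: record the error in
    state before returning HTTP 500 to the agent."""
    try:
        await response_future
    except Exception as exc:
        state["error"] = InterceptionError(str(exc))
        return 500
    return 200

def test_non_streaming_response_future_failure_surfaces_to_state() -> None:
    async def run():
        state: dict = {}
        fut: asyncio.Future = asyncio.get_running_loop().create_future()
        fut.set_exception(RuntimeError("deliver_response set an exception"))
        return await handle_non_streaming(fut, state), state

    status, state = asyncio.run(run())
    assert status == 500
    assert isinstance(state["error"], InterceptionError)

test_non_streaming_response_future_failure_surfaces_to_state()
```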

Independent of #1194

This branch is off main and does not depend on the keepalive PR. Both can merge in either order.


Note

Medium Risk
Changes error propagation in the interception proxy for both streaming and non-streaming requests, which can alter rollout termination behavior and metrics capture. Risk is moderate because it touches async control flow and cancellation handling but is covered by new tests.

Overview
Ensures failures from response_future are no longer silently swallowed in the interception proxy, so rollouts halt visibly instead of producing empty/clean-looking completions.

For non-streaming requests, exceptions while awaiting the intercepted model result now set state["error"] (via a new InterceptionError) before returning HTTP 500. For streaming requests, exceptions from the post-stream response_future await now set state["error"] as StreamInterrupted, while CancelledError is explicitly re-raised to avoid misclassifying normal teardown.

Adds tests covering both streaming and non-streaming response_future failure paths.
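The distinction between the two error shapes can be consumed downstream roughly like this (classify_failure is a hypothetical metrics helper; only the two class names come from the PR):

```python
from typing import Optional

class StreamInterrupted(Exception):
    """Stand-in: streaming cut, agent saw a truncated SSE body."""

class InterceptionError(Exception):
    """Stand-in: non-streaming failure, agent's client saw HTTP 500."""

def classify_failure(error: Optional[Exception]) -> str:
    """Hypothetical rubric/metrics helper mapping state['error'] to a label."""
    if error is None:
        return "ok"
    if isinstance(error, StreamInterrupted):
        return "streaming_cut"
    if isinstance(error, InterceptionError):
        return "non_streaming_500"
    return "other"
```

A rubric would call something like classify_failure(state.get("error")) to bucket rollouts by failure shape.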

Reviewed by Cursor Bugbot for commit 7c105f5.

rasdani and others added 2 commits April 19, 2026 19:56
When the underlying model call for a streaming request fails (e.g. vLLM
raised mid-call and `synthesize_stream(error=X)` was invoked), the
post-loop `response_future` await in `_handle_streaming_response` raises.
Previously that was only logged at debug, letting the chunk loop's clean
`data: [DONE]` arrive at the agent. The agent then observes a
syntactically-valid but empty completion, exits 0, and the rollout
finishes with no `state["error"]` set — same silent-empty class that
#1191 fixed for the mid-stream transport case.

Funnel this failure into `state["error"]` as `StreamInterrupted` via
`_set_rollout_error`. Narrow the catch to `except Exception` + explicit
`except asyncio.CancelledError: raise` so normal teardown cancellation
(via `unregister_rollout` → `future.cancel()`) is not misclassified as
a stream error.

Control flow is unchanged: the stream loop's `data: [DONE]` write and
the subsequent `write_eof` + `return response` still run — only the
error bookkeeping is new.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Counterpart to the streaming response_future fix for the non-streaming
branch of `_handle_request`. When the upstream model call fails and
`deliver_response` sets the future's exception, the await re-raises.
Previously we only logged at debug + returned HTTP 500; now we also set
`state["error"]` via `_set_rollout_error`.

Add distinct `InterceptionError` class alongside `StreamInterrupted` so
rubrics and metrics can tell the two failure shapes apart: a streaming
cut leaves the agent with a truncated SSE body; a non-streaming failure
returns HTTP 500 to the agent's OpenAI client (agent sees a normal API
error). Keep the log level at debug — the exception message may include
upstream internals (URLs with tokens, tracebacks) and shouldn't ship to
centralized logs at ERROR severity.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
rasdani added a commit that referenced this pull request Apr 19, 2026
The two ``logger.error`` sites in ``_handle_streaming_response`` include
the raw exception message from aiohttp's transport / the upstream model
call. Those messages can contain URLs with tokens, Authorization headers
in tracebacks, or other internals. At ERROR severity they propagate to
centralized log aggregation where anyone with log access sees them
verbatim.

Drop both to ``debug`` — local triage still has access, but nothing
sensitive ships to aggregators. Matches the treatment of the
``response_future`` error logs in #1196.

The rollout itself still halts visibly: ``_set_rollout_error`` sets
``state["error"]`` and ``MultiTurnEnv.has_error`` fires at the next
``is_completed()`` check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@rasdani rasdani requested a review from mikasenghaas April 20, 2026 15:03
Member

@mikasenghaas mikasenghaas left a comment


lgtm
