fix: pass explicit 60s timeout to get_background_job in poll_job_completion #1206
Merged
Before: each poll iteration inherited the prime-sandboxes APIClient default of 30s. When the sandbox gateway was briefly slow (>30s for the background-job status read), the httpx read-timeout fired and poll_job_completion wrapped it as AgentError — observed ~51 times on one 30B opencode-math run. Pass timeout=60.0 explicitly. Doubles the tolerance for a transient gateway blip without globally bumping the APIClient default (which would affect unrelated fast lookups). Requires prime-sandboxes PR adding the timeout kwarg to get_background_job. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… kwarg Consumes prime-sandboxes PR #540 (PrimeIntellect-ai/prime#540). The timeout=60.0 kwarg added in the previous commit requires the new signature; this bump locks in the minimum version. Lockfile update deferred until the prime-sandboxes 0.2.21 release (PrimeIntellect-ai/prime#541) lands on the index. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… timeout kwarg" This reverts commit 874faa8.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit e136b0c.
… kwarg 0.2.21 is now published to PyPI, re-applying the pin bump reverted earlier to unblock CI. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Summary
`CliAgentEnv.poll_job_completion` polls background-job status via `self.sandbox_client.get_background_job(...)`. Without an explicit `timeout`, that call inherits the prime-sandboxes `APIClient` default of 30s (`httpx.Timeout(30.0, connect=10.0)`). When the sandbox gateway is briefly slow on the status read, `httpx.ReadTimeout` fires and `poll_job_completion` wraps it as `AgentError('Agent polling failed: Read file timed out after 30s: /tmp/job_XXX.exit')`.
Observed ~51 such failures on one 30B opencode-math run.
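The failure mode described above can be illustrated with a small stand-alone sketch. Everything here is a stand-in: `ReadTimeout` and `AgentError` mimic the real exception types, `SlowGatewayClient` and `poll_once` are hypothetical, and latency is simulated rather than slept, so this is not the actual `CliAgentEnv` code.

```python
class ReadTimeout(Exception):
    """Stand-in for httpx.ReadTimeout so the sketch needs no third-party imports."""


class AgentError(Exception):
    """Stand-in for the AgentError raised by poll_job_completion."""


class SlowGatewayClient:
    """Hypothetical client whose status read takes a fixed number of seconds."""

    DEFAULT_TIMEOUT = 30.0  # mirrors the 30s APIClient default described above

    def __init__(self, simulated_latency: float) -> None:
        self.simulated_latency = simulated_latency

    def get_background_job(self, sandbox_id: str, job_id: str) -> dict:
        # No real sleeping: compare the simulated latency with the default budget.
        if self.simulated_latency > self.DEFAULT_TIMEOUT:
            raise ReadTimeout(
                f"Read file timed out after {self.DEFAULT_TIMEOUT:.0f}s: /tmp/job_XXX.exit"
            )
        return {"completed": True, "exit_code": 0}


def poll_once(client: SlowGatewayClient, sandbox_id: str, job_id: str) -> dict:
    """One poll iteration that wraps any client failure as AgentError."""
    try:
        return client.get_background_job(sandbox_id, job_id)
    except Exception as exc:
        raise AgentError(f"Agent polling failed: {exc}") from exc
```

With a simulated 5s read the poll returns normally; with a simulated 45s read the stand-in `ReadTimeout` surfaces as an `AgentError`, matching the message shape quoted in the summary.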
Change
One line, one kwarg:
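A minimal sketch of the shape of that change, assuming a synchronous client and positional `sandbox_id` and `job` arguments (both are assumptions, not confirmed by this PR). `FakeSandboxClient`, `poll_before`, and `poll_after` are hypothetical names used only for illustration.

```python
class FakeSandboxClient:
    """Hypothetical stand-in for the prime-sandboxes client."""

    def __init__(self) -> None:
        self.seen_timeouts = []

    def get_background_job(self, sandbox_id, job, timeout=None):
        # Record the per-call timeout; None means "fall back to the client default".
        self.seen_timeouts.append(timeout)
        return {"completed": True}


def poll_before(client, sandbox_id, job):
    # Before: no timeout kwarg, so the call inherits the client default (30s).
    return client.get_background_job(sandbox_id, job)


def poll_after(client, sandbox_id, job):
    # After: the explicit 60s timeout applies to this call only.
    return client.get_background_job(sandbox_id, job, timeout=60.0)


client = FakeSandboxClient()
poll_before(client, "sbx", "job")
poll_after(client, "sbx", "job")
# client.seen_timeouts is now [None, 60.0]
```

Because the override is per call, other client methods keep whatever default the client was built with.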
Why 60s (and not higher)?
60s doubles the tolerance for a transient gateway blip. Bumping the `APIClient` default globally would also affect unrelated fast lookups; this keeps the change scoped to the known-slow path.
Dependency
Requires prime-sandboxes PR PrimeIntellect-ai/prime#540 (merged), which adds the `timeout` kwarg to `get_background_job`. Consumed via prime-sandboxes release PR PrimeIntellect-ai/prime#541 (bumps to `0.2.21`).
The `pyproject.toml` constraint has been bumped to `prime-sandboxes>=0.2.21` (see commit `874faa8f`). Lockfile update is deferred until the 0.2.21 release lands on the index; `uv lock` currently fails to resolve because 0.2.21 is not yet published. Once PR #541 merges and the package is published, a follow-up commit will run `uv lock` to pin the resolved version.
Test plan
`uv run ruff check verifiers/envs/experimental/cli_agent_env.py`: clean.
`uv run ruff format --check verifiers/envs/experimental/cli_agent_env.py`: clean.
`uv run pytest tests/test_cli_agent_env.py`: 17 passed. (No existing coverage of `poll_job_completion`; this change is a kwarg passthrough, so no new tests added.)
🤖 Generated with Claude Code
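If coverage for the passthrough were wanted later, a test along these lines might do. This is only a sketch: `poll_job_status` is a hypothetical, simplified stand-in for one iteration of `poll_job_completion`, and the positional argument order is assumed.

```python
from unittest.mock import MagicMock


def poll_job_status(sandbox_client, sandbox_id, job):
    """Hypothetical, simplified stand-in for one poll_job_completion iteration."""
    return sandbox_client.get_background_job(sandbox_id, job, timeout=60.0)


def test_poll_passes_explicit_timeout():
    client = MagicMock()
    client.get_background_job.return_value = {"completed": True}

    result = poll_job_status(client, "sbx-1", "job-1")

    assert result == {"completed": True}
    # The mock verifies the kwarg was forwarded exactly once with 60.0.
    client.get_background_job.assert_called_once_with("sbx-1", "job-1", timeout=60.0)
```

A test against the real method would need the environment's own fixtures, which are not shown in this PR.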
Note
Low Risk
Small, localized change to background-job polling behavior plus a minor dependency bump; main risk is compatibility/regression if the new `prime-sandboxes` timeout handling behaves unexpectedly.
Overview
Reduces transient `ReadTimeout` failures when polling sandbox background-job status by passing an explicit `timeout=60.0` to `sandbox_client.get_background_job()` in `CliAgentEnv.poll_job_completion`.
Bumps `prime-sandboxes` dependency to `>=0.2.21` to pick up the new `timeout` parameter support.
Reviewed by Cursor Bugbot for commit 9f807c5. Bugbot is set up for automated code reviews on this repo.
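The `>=0.2.21` pin matters because an older `get_background_job` would reject the new kwarg with a `TypeError`. A defensive check, which is not part of this PR, could be sketched with the standard library's `inspect` module. The two `*_get_background_job` functions below are hypothetical stand-ins for the old and new signatures.

```python
import inspect


def accepts_timeout(func) -> bool:
    """Return True if func can be called with a `timeout` keyword argument."""
    params = inspect.signature(func).parameters
    if "timeout" in params:
        # A positional-only parameter cannot be passed by keyword.
        return params["timeout"].kind is not inspect.Parameter.POSITIONAL_ONLY
    # Otherwise only a **kwargs catch-all would accept it.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def old_get_background_job(sandbox_id, job):
    """Hypothetical pre-0.2.21 signature without the kwarg."""
    return {"completed": True}


def new_get_background_job(sandbox_id, job, timeout=None):
    """Hypothetical 0.2.21-style signature with the kwarg."""
    return {"completed": True}
```

Pinning the minimum version in `pyproject.toml`, as this PR does, makes such a runtime check unnecessary in practice; the sketch only shows what the pin protects against.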