
feat: add timeout kwarg to get_background_job #540

Merged
rasdani merged 1 commit into main from fix/get-background-job-timeout-kwarg
Apr 20, 2026

Conversation

Collaborator

@rasdani rasdani commented Apr 20, 2026

Summary

All sibling Sandbox / AsyncSandbox methods — execute_command, read_file, upload_file, download_file — already accept a per-call timeout kwarg that overrides the APIClient's 30s httpx default. get_background_job was the lone exception, so callers had no way to extend its effective timeout, even though the method internally just calls read_file (which does accept timeout).

In practice this surfaced as `Read file timed out after 30s: /tmp/job_XXX.exit` during RL polling loops in verifiers when the sandbox gateway was briefly slow to respond; the timeout then bubbled up as `AgentError('Agent polling failed: ...')`.

Change

Add timeout: Optional[int] = None to both the sync and async get_background_job signatures, and thread it through to the internal read_file calls on the exit file, stdout log, and stderr log. Default behavior is unchanged (falls back to APIClient default when timeout is not passed).

Pattern matches the existing execute_command / read_file signatures in the same file (Optional[int], default None).
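
A minimal sketch of the threading described above, not the actual prime-sandboxes code. `SketchSandbox`, `JobStatus`, the `job_prefix` argument, and the `.stdout` / `.stderr` file suffixes are all hypothetical stand-ins; only the `timeout: Optional[int] = None` signature and the forwarding to the three `read_file` calls come from this description.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class JobStatus:
    # Hypothetical result type; the real SDK model may differ.
    completed: bool
    exit_code: Optional[int] = None
    stdout: Optional[str] = None
    stderr: Optional[str] = None


@dataclass
class SketchSandbox:
    """Toy stand-in for the sync client, backed by an in-memory file map."""

    files: Dict[str, str]
    # Records the timeout each read_file call received, for inspection.
    read_timeouts: List[Optional[int]] = field(default_factory=list)

    def read_file(self, path: str, timeout: Optional[int] = None) -> Optional[str]:
        # A real client would hand `timeout` to the HTTP request and fall
        # back to its own default when it is None.
        self.read_timeouts.append(timeout)
        return self.files.get(path)

    def get_background_job(
        self, job_prefix: str, timeout: Optional[int] = None
    ) -> JobStatus:
        # The exit file only exists once the job has finished.
        exit_text = self.read_file(f"{job_prefix}.exit", timeout=timeout)
        if exit_text is None:
            return JobStatus(completed=False)
        # Completed path: the same timeout is forwarded to both log reads.
        return JobStatus(
            completed=True,
            exit_code=int(exit_text.strip()),
            stdout=self.read_file(f"{job_prefix}.stdout", timeout=timeout),
            stderr=self.read_file(f"{job_prefix}.stderr", timeout=timeout),
        )
```

Calling `get_background_job(prefix)` without `timeout` forwards `None` to `read_file`, which is how the default path stays unchanged in this sketch.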

Follow-up

The verifiers side — CliAgentEnv.poll_job_completion — will consume this with an explicit timeout=60.0 once a release including this change lands. Separate PR.
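
For orientation, a rough sketch of what a polling loop that passes an explicit timeout could look like. This is not the verifiers implementation: the function signature, the `interval` and `max_polls` parameters, the dict-shaped status, and the use of the built-in `TimeoutError` are assumptions made for illustration. Only the `timeout=60.0` value and the `AgentError('Agent polling failed: ...')` wrapping come from the description above.

```python
import time


class AgentError(Exception):
    """Stand-in for the error type named in the PR description."""


def poll_job_completion(
    sandbox, job, *, timeout=60.0, interval=1.0, max_polls=10, sleep=time.sleep
):
    """Poll a background job until it completes (hypothetical signature)."""
    for _ in range(max_polls):
        try:
            # Per-call override: only this lookup gets the longer timeout,
            # the client's global default is left alone.
            status = sandbox.get_background_job(job, timeout=timeout)
        except TimeoutError as exc:
            raise AgentError(f"Agent polling failed: {exc}") from exc
        if status["completed"]:
            return status
        sleep(interval)
    raise AgentError(
        f"Agent polling failed: job not completed after {max_polls} polls"
    )
```

Keeping the override at the call site means a slow status read is tolerated for longer without changing the timeout of unrelated requests made through the same client.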

Test plan

  • Added tests/test_background_job_timeout.py — five unit tests (sync + async) that monkeypatch read_file and assert the kwarg propagates (including the multi-read path when the job is completed, where all three reads should receive the same timeout).
  • uv run --extra dev pytest tests/test_background_job_timeout.py tests/test_gateway_error_mapping.py tests/test_models.py tests/test_client_retry.py — 36 passed.
  • uv run --extra dev ruff check src/ tests/ — clean.
  • uv run --extra dev ruff format --check src/ tests/ — clean.
  • Integration suite (test_command_execution.py, etc.) not run locally — requires real sandbox credentials; behavior should be unchanged since the default path (timeout=None) is preserved.
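
The propagation checks in the first bullet might look roughly like the following. This is an illustrative sketch, not the contents of tests/test_background_job_timeout.py: `SketchSandbox` is a toy class with assumed file suffixes, and `unittest.mock.MagicMock` stands in for the pytest `monkeypatch` fixture so the example runs without pytest.

```python
from unittest.mock import MagicMock


class SketchSandbox:
    """Toy client; only the timeout forwarding mirrors the PR description."""

    def read_file(self, path, timeout=None):
        raise NotImplementedError("replaced by a mock in the tests")

    def get_background_job(self, job_prefix, timeout=None):
        exit_text = self.read_file(f"{job_prefix}.exit", timeout=timeout)
        if exit_text is None:
            return None  # job still running
        stdout = self.read_file(f"{job_prefix}.stdout", timeout=timeout)
        stderr = self.read_file(f"{job_prefix}.stderr", timeout=timeout)
        return int(exit_text), stdout, stderr


def test_timeout_propagates_to_all_reads():
    sandbox = SketchSandbox()
    # Completed job: exit file, then stdout log, then stderr log.
    sandbox.read_file = MagicMock(side_effect=["0", "out", "err"])
    sandbox.get_background_job("/tmp/job_1", timeout=60)
    seen = [call.kwargs["timeout"] for call in sandbox.read_file.call_args_list]
    assert seen == [60, 60, 60]


def test_default_timeout_is_none():
    sandbox = SketchSandbox()
    # Running job: the exit file is absent, so only one read happens.
    sandbox.read_file = MagicMock(return_value=None)
    sandbox.get_background_job("/tmp/job_1")
    sandbox.read_file.assert_called_once_with("/tmp/job_1.exit", timeout=None)
```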

Note

Low Risk
Low risk: adds an optional timeout parameter and simply forwards it to existing read_file calls, with default behavior unchanged when omitted.

Overview
Adds an optional timeout: Optional[int] = None parameter to both sync and async get_background_job, and forwards it through to the underlying read_file calls for the exit/stdout/stderr polling files.

Introduces new unit tests (test_background_job_timeout.py) that monkeypatch read_file to assert the timeout kwarg is propagated (including the multi-read completed-job path) and that the default remains None when not provided.

Reviewed by Cursor Bugbot for commit 0086e99. Bugbot is set up for automated code reviews on this repo.

All sibling Sandbox methods (execute_command, read_file, upload_file,
download_file) accept a per-call `timeout` kwarg; only
get_background_job was missing one. That meant the APIClient's 30s
httpx default fired unconditionally on this path, regardless of how
long the caller wanted to wait.

Thread `timeout: float | None = None` through both sync and async
variants; internally it's forwarded to the `read_file` call that does
the HTTP work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@rasdani rasdani merged commit 7a876c0 into main Apr 20, 2026
9 checks passed
@rasdani rasdani deleted the fix/get-background-job-timeout-kwarg branch April 20, 2026 14:18
rasdani added a commit to PrimeIntellect-ai/verifiers that referenced this pull request Apr 20, 2026
… kwarg

Consumes prime-sandboxes PR #540 (PrimeIntellect-ai/prime#540). The
timeout=60.0 kwarg added in the previous commit requires the new
signature; this bump locks in the minimum version.

Lockfile update deferred until the prime-sandboxes 0.2.21 release
(PrimeIntellect-ai/prime#541) lands on the index.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
rasdani added a commit to PrimeIntellect-ai/verifiers that referenced this pull request Apr 20, 2026
…letion (#1206)

* fix: pass explicit 60s timeout to get_background_job

Before: each poll iteration inherited the prime-sandboxes APIClient
default of 30s. When the sandbox gateway was briefly slow (>30s for
the background-job status read), the httpx read-timeout fired and
poll_job_completion wrapped it as AgentError — observed ~51 times on
one 30B opencode-math run.

Pass timeout=60.0 explicitly. Doubles the tolerance for a transient
gateway blip without globally bumping the APIClient default (which
would affect unrelated fast lookups).

Requires prime-sandboxes PR adding the timeout kwarg to
get_background_job.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: bump prime-sandboxes to v0.2.21 for get_background_job timeout kwarg

Consumes prime-sandboxes PR #540 (PrimeIntellect-ai/prime#540). The
timeout=60.0 kwarg added in the previous commit requires the new
signature; this bump locks in the minimum version.

Lockfile update deferred until the prime-sandboxes 0.2.21 release
(PrimeIntellect-ai/prime#541) lands on the index.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Revert "chore: bump prime-sandboxes to v0.2.21 for get_background_job timeout kwarg"

This reverts commit 874faa8.

* chore: bump prime-sandboxes to v0.2.21 for get_background_job timeout kwarg

0.2.21 is now published to PyPI, re-applying the pin bump reverted
earlier to unblock CI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
3 participants