Skip to content

Commit 8aa67cf

Browse files
authored
fix: restart inherited sandbox sessions on metadata changes (#43)
Treat inherited sandbox metadata changes as worker-session boundaries, while keeping empty polls, bare interrupts, and pager-local commands local until a real worker interaction needs current metadata. Document the restart contract and add regression coverage for timeout bundles, session-ended pager commands, missing metadata recovery, explicit restarts, and guardrail recovery.
1 parent de3cd2f commit 8aa67cf

11 files changed

Lines changed: 2737 additions & 603 deletions

docs/plans/completed/codex-sandbox-state-meta-migration.md

Lines changed: 10 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -10,7 +10,7 @@
1010
## Status
1111

1212
- State: completed
13-
- Last updated: 2026-04-18
13+
- Last updated: 2026-04-19
1414
- Current phase: completed
1515

1616
## Design Intent
@@ -69,8 +69,13 @@
6969
- Ordered sandbox plans still validate earlier operations before later mode resets; later-wins resolution does not silently discard earlier invalid CLI/config ops.
7070
- `--debug-repl --sandbox inherit` remains locally usable by bootstrapping one inherited snapshot from the current default sandbox state before the first worker spawn.
7171
- `repl_reset` derives inherited sandbox state from the current tool call's `_meta["codex/sandbox-state-meta"]`.
72-
- Non-empty `repl` calls derive inherited sandbox state from the current tool call's `_meta["codex/sandbox-state-meta"]` before executing fresh code.
73-
- Empty-input `repl` polls ignore per-call sandbox metadata when they can be answered from existing state, but they still apply the current tool call's metadata before spawning a worker to answer an idle call on a fresh session.
72+
- Non-empty `repl` calls resolve stale timeout markers before deciding whether they still belong to a prior timed-out request.
73+
- Bare `Ctrl-C` is the one non-empty follow-up that remains a local recovery control and does not force a sandbox-driven restart.
74+
- Every other non-empty `repl` call requires valid current `_meta["codex/sandbox-state-meta"]`.
75+
- If current metadata changes the effective inherited sandbox, `mcp-repl` restarts the worker before handling that non-empty call and includes a reply notice naming the new sandbox policy.
76+
- Control-prefixed tails such as `Ctrl-C<code>` and `Ctrl-D<code>` run in the restarted session when the sandbox changed; the control prefix itself is not replayed into the fresh worker.
77+
- While the pager is active, pure pager navigation remains local UI state and ignores sandbox metadata until a later tool call actually interacts with the worker again.
78+
- Empty-input `repl` polls ignore per-call sandbox metadata when they can be answered from existing state, but they still apply the current tool call's metadata before any spawn or respawn needed to answer the call, including after draining a session-ended request.
7479
- When a prior timed-out request has already settled, `mcp-repl` resolves the stale timeout marker before deciding whether a new non-empty `repl` call is still just a busy follow-up.
7580
- Missing or malformed metadata fails closed with the existing inherit error path.
7681
- Explicit non-`inherit` sandbox modes ignore Codex metadata.
@@ -88,7 +93,7 @@
8893

8994
- Current Codex source and live traces both showed the old async update protocol was obsolete for the current release line.
9095
- The migration stayed intentionally single-path: no compatibility layer for older Codex builds.
91-
- Follow-up review fixes tightened the runtime sequencing so sandbox metadata is applied only for fresh execution or worker spawn, not for empty-input polls that are only draining prior output or using an already-running idle session.
96+
- Follow-up review fixes and final contract clarification tightened the runtime sequencing so sandbox changes now define worker-session boundaries for non-empty calls: empty polls keep draining, bare interrupts stay local, and other non-empty interactions restart into the current inherited sandbox when it changed.
9297

9398
## Decision Log
9499

@@ -97,3 +102,4 @@
97102
- 2026-04-17: Chose per-tool-call `_meta["codex/sandbox-state-meta"]` as the source of truth after inspecting current Codex source and live traces.
98103
- 2026-04-17: Completed the repo migration and verification against the real current Codex integration tests.
99104
- 2026-04-18: Clarified the shipped contract for `repl`: empty-input polls ignore per-call sandbox metadata only when they can be answered from existing state, while fresh non-empty calls resolve stale timeout markers and then apply the current call's sandbox metadata before executing new code.
105+
- 2026-04-19: Replaced the earlier local-follow-up exception set with a simpler restart-on-change contract: empty polls keep draining, bare interrupts remain local, and other non-empty interactions use current metadata and restart the worker when the inherited sandbox changed.
Lines changed: 51 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,51 @@
1+
# Inherit Sandbox Restart Contract
2+
3+
## Summary
4+
5+
- Change `--sandbox inherit` so new per-tool-call sandbox metadata takes effect by restarting the worker at the next non-poll, non-bare-interrupt interaction.
6+
- Keep empty-input polls and bare `Ctrl-C` as the two cases that do not force a restart on sandbox change.
7+
- Keep explicit non-`inherit` sandbox modes authoritative and unchanged.
8+
9+
## Status
10+
11+
- State: completed
12+
- Last updated: 2026-04-19
13+
- Current phase: completed
14+
15+
## Current Direction
16+
17+
- Treat sandbox changes as worker-session boundaries.
18+
- If current tool-call metadata changes the effective inherited sandbox, restart the worker before handling any tool call that would otherwise send input, restart the worker, or otherwise interact with the worker statefully.
19+
- Empty-input polls keep draining existing state without forcing a restart.
20+
- A bare `Ctrl-C` remains a local recovery control and does not force a restart just because sandbox metadata changed.
21+
22+
## Long-Term Direction
23+
24+
- The inherit contract should stay simple enough to explain in one paragraph:
25+
fresh worker interaction uses the current metadata, and sandbox changes reset the session before that interaction happens.
26+
- Review-driven exceptions should be minimized; only the explicit poll and bare interrupt escape hatches should remain.
27+
28+
## Phase Status
29+
30+
- Phase 0: completed
31+
- Identified that the current fail-closed follow-up split is not the desired product behavior.
32+
- Phase 1: completed
33+
- Reworked runtime sequencing around restart-on-change semantics.
34+
- Phase 2: completed
35+
- Refreshed tests, docs, and final verification.
36+
37+
## Locked Decisions
38+
39+
- Do not revert to the obsolete async sandbox update protocol.
40+
- Do not let explicit non-`inherit` CLI sandbox modes depend on Codex metadata.
41+
- If the inherited sandbox changes, the next non-poll, non-bare-interrupt interaction should reset the worker instead of trying to preserve the old request/session.
42+
43+
## Open Questions
44+
45+
- What exact restart notice text best communicates both the restart cause and the new effective sandbox policy without creating brittle snapshots?
46+
- Should a bare `Ctrl-C` with no live worker remain a no-op control reply or continue to surface the existing idle/session behavior?
47+
48+
## Decision Log
49+
50+
- 2026-04-19: Replaced the earlier fail-closed control-tail contract with a restart-on-change contract for non-poll, non-bare-interrupt interactions.
51+
- 2026-04-19: Landed the runtime, docs, and regression updates; restart notices are informational and initial inherit-mode spawns do not pretend they were restarts.

docs/sandbox.md

Lines changed: 39 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -21,14 +21,45 @@ metadata channel in that mode, `mcp-repl --debug-repl --sandbox inherit`
2121
bootstraps one local inherited snapshot from the current default sandbox state
2222
before the first worker spawn.
2323

24-
For `repl`, empty-input polls ignore per-call sandbox metadata when they can be
25-
answered from existing state, such as draining a timed-out request or returning an
26-
idle prompt from an already-running worker. If an empty-input call must spawn a
27-
worker to answer the call, `mcp-repl` applies the current tool call's sandbox
28-
metadata before that spawn. Non-empty `repl` calls resolve any stale timeout
29-
marker first, then apply the current call's sandbox metadata before executing
30-
fresh code. If a timed-out request is still genuinely in flight, follow-up calls
31-
continue servicing that request instead of switching sandboxes mid-flight.
24+
For `repl`, inherited sandbox metadata controls the worker session that handles
25+
the call. When a non-empty tool call would use the worker and the effective
26+
inherited sandbox changed, `mcp-repl` restarts the worker before serving that
27+
call and includes a restart notice that names the new sandbox policy.
28+
29+
More specifically:
30+
31+
- Empty-input polls ignore per-call sandbox metadata while they are only
32+
draining existing pending or settled output, or returning an idle prompt from
33+
an already-running worker.
34+
- If an empty-input poll needs to spawn or respawn a worker to finish answering
35+
the call, `mcp-repl` applies the current tool call's metadata before that
36+
spawn. If a poll can first answer by draining a session-ended request, it
37+
returns that local drain without respawning; the next spawn-needed call must
38+
provide valid current metadata.
39+
- While the pager is active, pure pager navigation is local UI state, not a
40+
worker interaction. Pager-local commands such as `:q` or empty-string page
41+
advance ignore sandbox metadata until a later tool call actually interacts
42+
with the worker again. Bare `Ctrl-D` is not pager navigation; it remains an
43+
explicit restart even when the pager is active.
44+
- Bare `Ctrl-C` is the one non-empty `repl` follow-up that stays local and does
45+
not force a sandbox-driven restart.
46+
- Every other non-empty `repl` call must have valid current
47+
`_meta["codex/sandbox-state-meta"]`.
48+
- A non-empty retry after the memory guardrail aborts a worker is an ordinary
49+
non-empty call. It must have valid current metadata before `mcp-repl` resets
50+
or retries under `--sandbox inherit`.
51+
- Non-empty `repl` calls resolve stale timeout markers before deciding whether
52+
they are still looking at a live worker request.
53+
- If current metadata changes the effective inherited sandbox, `mcp-repl`
54+
restarts the worker at that call before handling the input.
55+
- Control-prefixed tails such as `Ctrl-C<code>` and `Ctrl-D<code>` run in the
56+
restarted session when the sandbox changed; the control prefix itself is not
57+
replayed into the fresh worker.
58+
- Explicit restarts discard preserved detached output from aborted prior
59+
requests instead of carrying it into later unrelated replies.
60+
- Sandbox metadata is enforced again at the next tool call that actually
61+
interacts with the worker after pager navigation ends.
62+
- Missing or malformed metadata still fails closed on calls that need it.
3263

3364
The worker also gets a per-session temp directory, exported as:
3465

src/debug_repl.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -70,7 +70,7 @@ pub(crate) fn run(
7070

7171
while let Some(line) = read_line(&mut stdin)? {
7272
if is_exact_command(&line, "INTERRUPT") {
73-
let reply = worker.interrupt(DEFAULT_WRITE_STDIN_TIMEOUT);
73+
let reply = worker.interrupt(DEFAULT_WRITE_STDIN_TIMEOUT, None, false);
7474
render_visible_reply(
7575
response.as_mut(),
7676
reply,

src/sandbox_cli.rs

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -197,6 +197,13 @@ pub fn resolve_effective_sandbox_state(
197197
resolve_effective_sandbox_state_with_defaults(plan, inherited, &defaults)
198198
}
199199

200+
pub fn validate_sandbox_plan_with_defaults(
201+
plan: &SandboxCliPlan,
202+
defaults: &SandboxState,
203+
) -> Result<(), String> {
204+
validate_sandbox_plan_operations(plan, None, defaults)
205+
}
206+
200207
pub fn resolve_effective_sandbox_state_with_defaults(
201208
plan: &SandboxCliPlan,
202209
inherited: Option<&SandboxState>,

0 commit comments

Comments
 (0)