What happened
During an autonomous /be run on a workstation that also runs production kolu + kaval (PR #1374), the agent ran the kaval unit suite locally for fast feedback:
packages/kaval/node_modules/.bin/vitest run --root packages/kaval # ×2
packages/kaval/src/socketDaemon.test.ts spawns real kaval daemon processes (node --import <tsx loader> bin.ts) + real kaval-tui processes — ~30+ real processes across the two runs (fingerprint: 182 leftover /tmp/kaval-e2e-* dirs). The tests are isolated (per-test mkdtemp sockets, PID-scoped kills — they never touch the prod $XDG_RUNTIME_DIR/kaval-* socket), so nothing killed prod kaval directly. But the process/memory spike OOM-reaped the user's long-lived production kaval (a crash, not a kill). The agent should have run these on CI / a pu box (where they pass green), never on the workstation.
Root cause
The kaval test binary forks real daemons with no gate inside the binary. Existing guards wrap commands (the /test skill routes e2e to pu boxes; dev-server uses random ports; do.md forbids bare just dev) — but a bare vitest --root <pkg> reaches past the command to the binary, bypassing all of them. The fix must live in the test binary, because only that is independent of how it's invoked, which harness runs it, or which directory it's in.
Two traps (found by an adversarial review pass)
- kaval was just where it landed; the danger surface is wider. The same bypass works on:
  - packages/surface-daemon/src/daemonMain.test.ts, pidGate.test.ts: fork real node -e
  - packages/surface-daemon-supervisor/src/waitForPidGone.test.ts: spawns real sleep 30
  - packages/kaval-tui/src/{attach,create}.test.ts: fork ~6 real shell PTYs via terminal.spawn
  - packages/tests/support/hooks.ts (Cucumber): spawns a real kolu server + Chromium + Xvfb (gigabytes if run locally)
- Gating on CI is broken here. odu runs unit tests as nix develop -c pnpm test:unit with no CI env exported. describe.runIf(process.env.CI) would silently skip daemon tests in CI too (coverage → 0), or, if widened to honor a locally-set CI, re-arm the danger on the workstation. CI is the wrong key.
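The second trap can be illustrated with a small sketch. The three environments below are hypothetical stand-ins (the names oduUnitRun, workstationDefault and workstationWithCI are assumptions for illustration, not taken from the repo); the function only mirrors the truthiness check that describe.runIf(process.env.CI) would perform.

```typescript
// Illustrative sketch: what a CI-keyed gate would decide in each environment.
// The environment objects are assumptions, not captured from real runs.
type Env = Record<string, string | undefined>;

// Mirrors the truthiness test behind describe.runIf(process.env.CI).
function runsDaemonTestsWhenKeyedOnCI(env: Env): boolean {
  return Boolean(env.CI);
}

const oduUnitRun: Env = {};                 // nix develop -c pnpm test:unit, no CI exported
const workstationDefault: Env = {};         // plain shell on the workstation
const workstationWithCI: Env = { CI: "1" }; // CI happens to be set locally

const outcomes = {
  odu: runsDaemonTestsWhenKeyedOnCI(oduUnitRun),
  workstation: runsDaemonTestsWhenKeyedOnCI(workstationDefault),
  workstationWithCI: runsDaemonTestsWhenKeyedOnCI(workstationWithCI),
};

// odu: false, workstation: false, workstationWithCI: true
console.log(outcomes);
```

Under these assumptions the key gets both cases wrong: the real CI run skips the daemon tests, and the one place they must never run is the only place they do.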
The fix that closes the class (impossible-by-construction)
- […] describeDaemon(...), keyed only on KOLU_DAEMON_TESTS=1, default OFF, applied at every real-spawn test site (the files above). A bare vitest run on any package then forks nothing, regardless of invocation path.
- […] *.test.ts forks a real process without routing through that gate (grep for spawn(process.execPath, node --import, servePtyHostOverUnixSocket, real ssh/nix, PTY-forking terminal.spawn). This makes it structural rather than a matter of discipline: a future spawning test can't exist ungated.
- […] KOLU_DAEMON_TESTS=1 (CI runs just test-unit → full coverage), verified against a real odu unit log that the daemon it(...) lines actually execute.
Defense-in-depth (guardrails, never the barrier)
- […] just ai::apm doesn't clobber it) that denies local vitest / pnpm test:unit / cucumber and steers the agent to CI/pu. Bypassable and harness-scoped, so a tripwire, not a wall. (Do not trigger on "is a kaval live": the resource spike harms even with no kaval up, and a detector can't tell prod kaval from a just dev one or the test's own /tmp gates, which produces false blocks that train people to disable it.)
- just test-unit stays fork-free (the safe reach); just test-daemon (sets the opt-in) is documented as CI/pu-only.
Net: layers 1–3 make the workstation safe by construction even with a careless agent or a new harness; 4–6 are insurance.
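A minimal sketch of the first two layers, under stated assumptions: daemonTestsEnabled and isUngatedSpawnTest are hypothetical helper names, the marker list is only the subset of the grep targets above that can be matched as plain substrings, and the substring check is far cruder than a real lint would need to be. It is not the PR's implementation.

```typescript
// Sketch only: helper names and the marker list are assumptions for illustration.
type Env = Record<string, string | undefined>;

// Layer 1: opt-in, default OFF. Only the exact value "1" enables daemon tests;
// CI or any other variable has no effect.
function daemonTestsEnabled(env: Env): boolean {
  return env.KOLU_DAEMON_TESTS === "1";
}

// With vitest, a wrapper could be built on this (assumed wiring):
//   const describeDaemon = describe.runIf(daemonTestsEnabled(process.env));

// Layer 2: flag a test source that contains a spawn marker but never
// mentions the gate. Substring matching is a deliberate simplification.
const SPAWN_MARKERS = [
  "spawn(process.execPath",
  "node --import",
  "servePtyHostOverUnixSocket",
  "terminal.spawn",
];

function isUngatedSpawnTest(source: string): boolean {
  const spawns = SPAWN_MARKERS.some((marker) => source.includes(marker));
  return spawns && !source.includes("describeDaemon(");
}

const ungated = 'describe("daemon", () => { spawn(process.execPath, []); });';
const gated = 'describeDaemon("daemon", () => { spawn(process.execPath, []); });';

// true false
console.log(isUngatedSpawnTest(ungated), isUngatedSpawnTest(gated));
```

The design point is that the gate defaults to off and ignores everything except its own key, so a bare vitest run forks nothing; the scan exists so a new spawning test cannot be added without the gate.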
Filed from the /be run on PR #1374 (P2.5 frontDaemonOverStdio). Design produced + adversarially verified via an ultracode workflow; full verdict in the session transcript.