docs(reborn): subagent thread harness — foundational design + PR0/PR1 plan by henrypark133 · Pull Request #5176 · nearai/ironclaw

henrypark133 · 2026-06-24T05:10:45Z

Summary

Docs-only. Foundational design + implementation plan for the subagent thread harness (Reborn). Reframes subagents from one-shot computations into first-class, addressable, resumable threads that both the parent agent and a human can attach to — driven by four product requirements:

Parent inspects a child's work anytime.
Human opens a child's thread in the console and continues conversing.
Parent re-activates a finished child with an extension task (no cold start).
Clean-slate, best-in-class harness (informed by a survey of Claude Code/Codex, LangGraph, Temporal, Devin, OpenAI Assistants).

Supersedes the 2026-06-08 subagent-durability sub-spec (removed here) and closes #4656: the SQL gate-resolution stack (3 tables + settlement log + 2-phase ledger + bucketed counter + replay reconciler, ~2,700 LOC) is replaced by a single ScopedFilesystem await-edge file per parent-waits-on-child, with the child's own run record as the source of truth and CAS-based idempotent delivery (no ledger). terminal becomes per-run; threads park and re-activate.

What's here

docs/superpowers/specs/2026-06-23-subagent-thread-harness-design.md — the single source of truth for the durable delivery/thread design (hardened over 3 adversarial review rounds).
docs/superpowers/plans/2026-06-23-subagent-thread-harness-pr0-pr1.md — bite-sized TDD implementation plan for PR 0 (prereqs) + PR 1 (durable await-edge walking skeleton, blocking-only), converged over 7 thermo-nuclear review rounds (clean: no BLOCKER/MAJOR).
Removes the superseded durability sub-spec. Live-code docs (subagent-spawn, compaction, planner) intentionally kept.

Key design decisions

Child = first-class thread + lineage; terminal is per-run, thread parks and re-activates via one activate(thread, input, provenance) primitive.
Delivery = one filesystem await-edge file; lifecycle open → settled → drained (+ abandoned for parent-dies-first), CAS transitions = idempotency (no ledger).
Blocking reuses TurnStatus::BlockedDependentRun with a real await-edge GateRef (no new status variant) — a subagent wait reads as an ordinary dependent-run wait.
Safety: child→parent content is framed-as-untrusted + capped (no semantic scan); approvals always bubble to a human (no LLM auto-approves) — the catastrophe backstop.
Reborn-only; everything gated behind subagent.v2_enabled (default off).

Rollout

Vertical walking skeleton: PR 0 (flag, helper visibility, config) → PR 1 (await-edge store + resolver + crash-recovery proof + flag-gated blocking delivery) → later PRs add inspect / extend / human-console / cancel.

No code changes in this PR.

🤖 Generated with Claude Code

gemini-code-assist · 2026-06-24T05:10:49Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

coderabbitai · 2026-06-24T05:11:03Z

📝 Walkthrough

Summary by CodeRabbit

Documentation
- Added a new design spec for durable subagent threads, including the updated lifecycle, single-active-run behavior, and restart recovery flow.
- Added an implementation plan for a flag-gated rollout of durable blocking support, with clear phases for enabling persistence, recovery, and end-to-end validation.
- Clarified the expected behavior for edge settlement, resume handling, and boot-time recovery so the new subagent experience is defined end to end.

Walkthrough

Adds two new documentation files: a foundational design spec (2026-06-23-subagent-thread-harness-design.md) superseding prior subagent durability specs and PR #4656, and an implementation plan (2026-06-23-subagent-thread-harness-pr0-pr1.md) decomposing the work into PR 0 (flag plumbing) and PR 1 (FilesystemAwaitEdgeStore, resolver, and runtime wiring). No source code is changed.

Changes

Durable subagent thread harness

Layer / File(s)	Summary
Thread-first model, await-edge semantics, and blocking wake `docs/superpowers/specs/2026-06-23-subagent-thread-harness-design.md`	Establishes scope and superseded artifacts; ratifies core semantic decisions (terminal-per-run, `activate` reactivation, one-run-per-thread, CAS idempotency, depth floor, `subagent.v2_enabled`); introduces two-layer architecture with blocking/background delivery semantics; specifies GateRef-based `TurnStatus::BlockedDependentRun` reuse and boot-recovery for blocking parents.
Thread lifecycle, capability surfaces, and durability storage detail `docs/superpowers/specs/2026-06-23-subagent-thread-harness-design.md`	Defines Active/Parked/Archived lifecycle and reactivation invariants; specifies inspect/extend/spawn/cancel tool surfaces; details `ScopedFilesystem` layout, CAS state machine (open→settled→drained/abandoned), resolver settlement/recovery algorithm, capacity enforcement, and explicit delta from PR `#4656`.
Safety/trust model, rollout plan, test plan, and references `docs/superpowers/specs/2026-06-23-subagent-thread-harness-design.md`	Defines gate-by-gate approval-bubbling to a human surface; lists open verification items; provides sequential vertical-slice rollout steps behind `subagent.v2_enabled`; enumerates PR-by-PR acceptance criteria; collects references to superseded specs and relevant crates.
PR 0 — flag plumbing and `wrap_untrusted_subagent_text` visibility `docs/superpowers/plans/2026-06-23-subagent-thread-harness-pr0-pr1.md`	Defines global constraints (durable CAS await-edge, no production `unwrap`/`expect`, strict flag-gating); enumerates PR 0 file changes: `pub(crate)` on `wrap_untrusted_subagent_text`, `subagent.v2_enabled` TOML/CLI/config/builder wiring, and builder/default tests.
PR 1 tasks 1.1–1.3 — await-edge types, `FilesystemAwaitEdgeStore`, and durability tests `docs/superpowers/plans/2026-06-23-subagent-thread-harness-pr0-pr1.md`	Defines await-edge/gate-ref types; `FilesystemAwaitEdgeStore` interface/semantics (open, get, CAS transitions, unclosed-edge listing, boot-recovery enumeration); durability/idempotency and scope-isolation integration tests; spawn depth floor regression test.
PR 1 tasks 1.4–1.6 — resolver, settler, spawn hook, observer, and runtime wiring `docs/superpowers/plans/2026-06-23-subagent-thread-harness-pr0-pr1.md`	Specifies `AwaitEdgeResolver` and object-safe `AwaitEdgeSettler`; flag-gated await-edge write at spawn with parent blocking via `GateRef`; completion observer settle→resume→drain sequence with crash-recovery ordering constraints; composition/runtime wiring; boot-time recovery routine; end-to-end integration test expectations (flag-on delivery, restart delivery, flag-off no-write).

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

Possibly related issues

#4656 — This spec explicitly supersedes the durable gate resolution store from PR #4656, replacing its SQL-backed DurableSubagentGateResolutionStore with a ScopedFilesystem CAS await-edge and eliminating the settlement log, capacity-counter buckets, and gate/result indirections defined there.
#4474 — Both cover the same durable subagent completion-delivery machinery (await-edge/state persistence, terminal-event settlement, parent resume/recovery), making this a direct design successor.

Poem

An edge is opened, a parent waits in place,
CAS transitions guard each file with grace.
Settled, drained — the thread wakes again,
Boot recovery rescues what the crash couldn't chain.
v2_enabled: false keeps the old path clean. 🦀

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Linked Issues check	⚠️ Warning	This PR only adds docs and plans; it does not deliver the durable store, migration, or reconciler code required by `#4656`.	Implement the gate-resolution store, capacity counter, migration, and security tests specified in `#4656`, or split the docs into a separate non-closing PR.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title is conventional, specific, and accurately summarizes the docs-only subagent thread harness design and PR0/PR1 plan.
Description check	✅ Passed	The description is detailed and covers the design, rollout, and linked issue, though several template sections are left unfilled.
Out of Scope Changes check	✅ Passed	No code or unrelated functional changes were introduced; the diff stays within documentation for the subagent thread harness.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

_{Comment @coderabbitai help to get the list of available commands.}

coderabbitai

Actionable comments posted: 15

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/plans/2026-06-10-triggered-run-delivery-owner-scope.md`:
- Around line 398-404: The validation steps in this section don’t match the
repo’s required pre-review gate. Update the commands in this plan to mirror the
contribution checklist by using the strict fmt/clippy variants, adding the
required build step, and keeping the full test run; reference the existing “Step
2: Full test suite” block and the surrounding quality-gate commands so they
align with the mandated pre-PR checks.
- Around line 313-323: Remove the contradictory fallback claim from the commit
message text in the triggered-run delivery owner scope plan and keep the
description aligned with the stated scope. Update the wording around the
trigger/thread ownership change so it only says new first-fire bindings persist
the creator as owner and existing Direct-route bindings are not backfilled,
without asserting that any delivery-side fallback covers them. Use the commit
message block in the plan document as the target for the edit.

In `@docs/plans/2026-06-18-triggers-on-rootfilesystem.md`:
- Around line 188-194: The thread identity derivation is inconsistent: one path
includes creator_user_id while other paths already rely on
fire.identity.route_thread_id() using tenant_id, trigger_id, and fire_slot only.
Update the canonical thread_id hashing rules so every use of ensure_thread and
TriggerFireIdentity.route_thread_id() shares the same input set, and remove any
alternate derivation in the trusted-trigger fire path. Make sure all
recovery/idempotency-related lookups use that single canonical hash definition.

In `@docs/plans/2026-06-22-reborn-runtime-hang-investigation.md`:
- Around line 5-7: The incident note in the reborn-runtime-hang-investigation
document exposes a full user UUID, so redact or pseudonymize that identifier
while preserving the troubleshooting context. Update the text in the affected
incident summary to use a masked user reference instead of the raw UUID, keeping
the same details about the tenant and outage but removing the direct user
identifier.

In `@docs/reborn/2026-06-16-memory-how-it-works.md`:
- Around line 13-14: Update the memory search docs so they match the runtime
behavior in dispatch_search: the default Reborn build disables vector search
explicitly via with_vector(false), not because embeddings are missing, and the
defaults are limit 5 with pre_fusion_limit set to max(limit, 20). Adjust the
affected description in memory-how-it-works.md to reflect these actual defaults
and the fact that only FTS runs by default.

In `@docs/superpowers/plans/2026-06-16-persistent-user-context-profile.md`:
- Around line 924-933: The integration test for profile rendering is asserting
the old compact location format, but the runtime now uses sanitized
untrusted-user-data framing for location. Update the assertions in
profile_set_then_runtime_context_renders_local_time to match the current
rendered contract by checking the explicit untrusted-user-data output around
location while still verifying the Asia/Tokyo local time and locale fields,
using the same capability-dispatch and loop-driver host flow to locate the
behavior.
- Around line 758-764: The existing profile merge logic in the retry loop is
swallowing corrupt or non-JSON stored documents by using
serde_json::from_slice(...).unwrap_or_default(), which can overwrite bad data
with an empty object. Update the merge path in this block to fail closed: when
deserialization of current bytes fails, return operation_error() instead of
defaulting, and keep the normal flow only for valid JSON or missing documents.
Use the current/read_document handling and the existing content_bytes_sha256
helper as the anchor points for locating the fix.
- Around line 261-263: The plan currently mixes two profile contracts: one part
says to use only UserProfileContext.timezone with no ResolvedUserProfile, while
later steps still reference ResolvedUserProfile and user_timezone. Normalize the
entire document to a single model by updating all affected steps/snippets to use
UserProfileContext directly, removing any remaining ResolvedUserProfile
mentions, and keeping the LoopRunContext-based trait shape consistent with the
intended load_identity_candidates pattern.

In `@docs/superpowers/plans/2026-06-23-subagent-thread-harness-pr0-pr1.md`:
- Around line 278-279: The boot-recovery design is inconsistent between the
recovery driver and the spec, so reconcile it to a single canonical flow. Update
the recovery plan around list_parents_with_unclosed_edges and the boot-recovery
section to match the spec’s active-parent-run enumeration approach, or
explicitly revise the spec if the edge-store scan is intended to be the source
of truth. Make sure the description of TurnStateStore and any recovery
orchestration steps use one consistent driver so there is no split
implementation path.

In `@docs/superpowers/specs/2026-06-16-persistent-user-context-profile-design.md`:
- Around line 82-83: The profile-rendering spec is treating free-text location
as safe inline context, which weakens the prompt-injection boundary. Update the
user profile examples and guidance to frame location as untrusted user-provided
data, with explicit “not instructions” language and quoted rendering instead of
`location=...` style inline context. Keep the timezone behavior unchanged in the
existing time-rendering branch, and make sure the spec wording aligns with the
tested contract for the profile rendering logic.

In `@docs/superpowers/specs/2026-06-23-subagent-thread-harness-design.md`:
- Around line 113-114: Update the Subagent Thread Harness design spec so the
awaited edge uses the gate-ref-compatible hyphenated format, not the
colon-separated placeholder. In the section describing the real GateRef, replace
the `gate:subagent-await:<child_run_id>` example with the same naming convention
used by the implementation plan, and keep the wording aligned with the resume
precondition behavior for blocked parents.

In `@scripts/loadtest/parse_concurrency.py`:
- Around line 174-177: The throughput calculation in the concurrency parsing
logic can divide by a zero duration when `max(finished_times) -
min(started_times)` is 0, so update the branch in `parse_concurrency.py` to
mirror the existing duration guard used later in the file. In the throughput
section that uses `n_finished`, `n_started`, `finished_times`, and
`started_times`, compute throughput only when the duration is greater than 0;
otherwise fall back to 0.0 to avoid `ZeroDivisionError`.

In `@scripts/loadtest/stub_llm_server.py`:
- Line 93: Bind the stub LLM server to loopback instead of all interfaces by
updating the ThreadingHTTPServer setup in stub_llm_server.py so the server
listens on 127.0.0.1 for args.port; keep the change localized to the server
initialization path and ensure any related startup or logging messages in
stub_llm_server reflect the loopback-only binding.

In `@scripts/loadtest/sweep.sh`:
- Around line 15-16: The prerequisites comment in sweep.sh is stale because the
load test Python tools are stdlib-only, so remove the mention of httpx and pip
install -r requirements.txt from the header. Update the prerequisite text near
the sweep.sh comment to list only the actual required tools (python3, cargo,
curl) and keep it consistent with the headers in loadgen.py and
stub_llm_server.py.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: e6626008-9618-4ed8-b4a3-6e32956864fc

📥 Commits

Reviewing files that changed from the base of the PR and between e754655 and f0780ac.

📒 Files selected for processing (18)

docs/plans/2026-06-10-triggered-run-delivery-owner-scope.md
docs/plans/2026-06-14-trusted-trigger-classification-as-type-4851.md
docs/plans/2026-06-15-google-oauth-account-fork-fix.md
docs/plans/2026-06-18-triggers-on-rootfilesystem.md
docs/plans/2026-06-22-reborn-runtime-hang-investigation.md
docs/plans/2026-06-22-turn-state-lock-wedge-fix.md
docs/plans/2026-06-23-github-pat-auth-bugs-1-2.md
docs/reborn/2026-06-08-subagent-durability-spec.md
docs/reborn/2026-06-16-memory-how-it-works.md
docs/superpowers/plans/2026-06-16-persistent-user-context-profile.md
docs/superpowers/plans/2026-06-23-subagent-thread-harness-pr0-pr1.md
docs/superpowers/specs/2026-06-16-persistent-user-context-profile-design.md
docs/superpowers/specs/2026-06-23-subagent-thread-harness-design.md
scripts/loadtest/loadgen.py
scripts/loadtest/parse_concurrency.py
scripts/loadtest/run_serve.sh
scripts/loadtest/stub_llm_server.py
scripts/loadtest/sweep.sh

coderabbitai · 2026-06-24T05:17:57Z

+git commit -m "fix(conversations): thread trigger creator into trusted binding resolution
+
+Trigger fires previously resolved conversation bindings with
+trusted_owner_user_id=None, so triggered-run TurnScopes carried
+ActorFallback ownership. Slack delivery then reconstructed a thread
+scope with owner None while the thread was stored with Some(actor),
+silently skipping the outbound push. New first-fire bindings now
+persist the creator as owner. Existing Direct-route bindings are
+deliberately not backfilled (scope-equality compat for in-flight
+runs); the delivery-side fallback covers them."
+```


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Remove the fallback claim from the commit message.

Line 322 says “the delivery-side fallback covers them,” but Line 17 explicitly declares that fallback out of scope. This contradiction will mislead implementation/review.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/plans/2026-06-10-triggered-run-delivery-owner-scope.md` around lines 313 - 323, Remove the contradictory fallback claim from the commit message text in the triggered-run delivery owner scope plan and keep the description aligned with the stated scope. Update the wording around the trigger/thread ownership change so it only says new first-fire bindings persist the creator as owner and existing Direct-route bindings are not backfilled, without asserting that any delivery-side fallback covers them. Use the commit message block in the plan document as the target for the edit.

coderabbitai · 2026-06-24T05:17:57Z

+Run: `cargo fmt && cargo clippy --all --benches --tests --examples --all-features 2>&1 | tail -5 && cargo test -p ironclaw_conversations -p ironclaw_reborn_composition -p ironclaw_turns -p ironclaw_triggers --features slack-v2-host-beta`
+Expected: fmt clean, clippy zero warnings, all tests pass (`ironclaw_turns` included because the fix consumes `TurnScope::new_with_owner`)
+
+- [ ] **Step 2: Full test suite**
+
+Run: `cargo test`
+Expected: PASS (catches any workspace-wide fallout from the `TrustedInboundTurnRequest` signature change — the type is `pub(crate)`, so fallout should be zero outside `ironclaw_conversations`)


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Align the quality gate commands with repo-required pre-PR checks.

This block skips the required strict flags and cargo build listed in repo contribution rules; it should mirror the mandated pre-review gate command set.

As per coding guidelines: “Run the local validation checks required before requesting a review… cargo fmt --all -- --check, cargo clippy ... -- -D warnings, cargo build, cargo test.”

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/plans/2026-06-10-triggered-run-delivery-owner-scope.md` around lines 398 - 404, The validation steps in this section don’t match the repo’s required pre-review gate. Update the commands in this plan to mirror the contribution checklist by using the strict fmt/clippy variants, adding the required build step, and keeping the full test run; reference the existing “Step 2: Full test suite” block and the surrounding quality-gate commands so they align with the mandated pre-PR checks.

Source: Coding guidelines

coderabbitai · 2026-06-24T05:17:57Z

+The trusted-trigger fire path supplies a deterministic `thread_id` derived from
+`TriggerFireIdentity` to `ensure_thread`
+(`crates/ironclaw_threads/src/contract.rs:300`,
+`filesystem_service.rs:697-724`, `CasExpectation::Absent`), instead of the binding
+layer minting a random UUID. The canonical `thread_id` for a fire is
+`H(tenant_id, creator_user_id, trigger_id, fire_slot)`.
+


🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Use one canonical hash input set for thread identity.

Line 193 adds creator_user_id to thread_id derivation, but Lines 84-89 and 385-387 rely on fire.identity.route_thread_id() (tenant/trigger/fire_slot). Mixed derivation rules will break idempotent lookup/recovery.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/plans/2026-06-18-triggers-on-rootfilesystem.md` around lines 188 - 194, The thread identity derivation is inconsistent: one path includes creator_user_id while other paths already rely on fire.identity.route_thread_id() using tenant_id, trigger_id, and fire_slot only. Update the canonical thread_id hashing rules so every use of ensure_thread and TriggerFireIdentity.route_thread_id() shares the same input set, and remove any alternate derivation in the trusted-trigger fire path. Make sure all recovery/idempotency-related lookups use that single canonical hash definition.

coderabbitai · 2026-06-24T05:17:57Z

+A local-dev `ironclaw-reborn serve` instance (tenant `reborn-cli`, single user
+`e3aa23b0-e23e-4e1f-9b14-b8954e543200`) became **completely unresponsive to all
+users for ~30 minutes**. The user has no triggers of their own, but other users'


🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

Redact direct user identifiers in incident docs.

Line 6 publishes a full user UUID. Keep incident utility, but mask or pseudonymize user identifiers to avoid persisting sensitive identifiers in repo history.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/plans/2026-06-22-reborn-runtime-hang-investigation.md` around lines 5 - 7, The incident note in the reborn-runtime-hang-investigation document exposes a full user UUID, so redact or pseudonymize that identifier while preserving the troubleshooting context. Update the text in the affected incident summary to use a masked user reference instead of the raw UUID, keeping the same details about the tenant and outage but removing the direct user identifier.

coderabbitai · 2026-06-24T05:17:57Z

+- Search is **hybrid FTS + vector with RRF fusion**, but in the default Reborn build **no embedding provider is wired**, so vector search is inactive and only FTS runs.
+


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Search behavior docs are out of sync with runtime implementation.

Line 13 and Lines 179-182 currently describe vector as disabled due to missing embeddings and list defaults (limit 20, pre_fusion_limit 50, both branches on). In crates/ironclaw_host_runtime/src/first_party_tools/memory.rs (dispatch_search), vector is explicitly disabled via .with_vector(false), default limit is 5, and pre_fusion_limit is max(limit, 20).

Suggested doc correction

-- Search is **hybrid FTS + vector with RRF fusion**, but in the default Reborn build **no embedding provider is wired**, so vector search is inactive and only FTS runs. +- Search infrastructure supports **hybrid FTS + vector with RRF fusion**, but the current runtime path (`dispatch_search`) explicitly sets `.with_vector(false)`, so only FTS executes today. ... -`MemorySearchResult` = `{ path, score, snippet, full_text_rank, vector_rank }`. Search defaults: `limit` 20 (capped per capability schema to ≤20), `pre_fusion_limit` 50, both branches on, RRF k=60. +`MemorySearchResult` = `{ path, score, snippet, full_text_rank, vector_rank }`. In the current runtime capability path, default `limit` is 5 (clamped to 1..20), `pre_fusion_limit` is `max(limit, 20)`, and vector is disabled for this call path. -**Caveat:** the default Reborn `build_backend` wiring passes **no embedding provider**, so the vector branch is inert; only FTS contributes. Hybrid search is built but runs single-modality until an embedding provider is wired at the host boundary. +**Caveat:** hybrid search exists in the backend, but this capability path currently forces FTS-only by request construction (`with_vector(false)`).

Also applies to: 179-182

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/reborn/2026-06-16-memory-how-it-works.md` around lines 13 - 14, Update the memory search docs so they match the runtime behavior in dispatch_search: the default Reborn build disables vector search explicitly via with_vector(false), not because embeddings are missing, and the defaults are limit 5 with pre_fusion_limit set to max(limit, 20). Adjust the affected description in memory-how-it-works.md to reflect these actual defaults and the fact that only FTS runs by default.

coderabbitai · 2026-06-24T05:17:57Z

+**real** `GateRef` that names the edge (e.g. `gate:subagent-await:<child_run_id>`)
+— not a synthetic placeholder, but a genuine reference to the awaited edge — so


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

GateRef format conflicts with the PR plan and can break resume preconditions.

Line 113 uses gate:subagent-await:<child_run_id>, but the implementation plan requires gate:subagent-await-<child_run_id> (hyphen suffix) for gate-ref compatibility. This mismatch can cause blocked-parent resume to miss its expected gate key.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/superpowers/specs/2026-06-23-subagent-thread-harness-design.md` around lines 113 - 114, Update the Subagent Thread Harness design spec so the awaited edge uses the gate-ref-compatible hyphenated format, not the colon-separated placeholder. In the section describing the real GateRef, replace the `gate:subagent-await:<child_run_id>` example with the same naming convention used by the implementation plan, and keep the wording aligned with the resume precondition behavior for blocked parents.

coderabbitai · 2026-06-24T05:17:57Z

+**Path identity & access patterns.** The edge is keyed by
+`(parent_run_id, child_run_id)`. Two access patterns, both cheap:
+- **Live settle:** a child knows its own `parent_run_id` (lineage), so the edge
+  is a direct path lookup — no by-child index.
+- **Parent drain:** the parent lists its own directory
+  `/await-edges/<parent_run_id>/` (single-level `list_dir`).
+


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Await-edge directory path is inconsistent within this spec.

Line 219 defines /turns/subagent-await-edges/..., but Line 249 says parent drain lists /await-edges/<parent_run_id>/. Please normalize to one canonical path; otherwise store/list implementations will diverge.

coderabbitai · 2026-06-24T05:17:57Z

+    if n_finished > 1 and n_started > 0:
+        throughput = n_finished / (max(finished_times) - min(started_times))
+    else:
+        throughput = 0.0


🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Guard against zero-duration division.

max(finished_times) - min(started_times) can be 0 (single instantaneous run, or all timestamps colliding at parser resolution), raising ZeroDivisionError. The bucket path at Line 185 already guards duration > 0; mirror it here.

Proposed fix

- if n_finished > 1 and n_started > 0: - throughput = n_finished / (max(finished_times) - min(started_times)) - else: - throughput = 0.0 + span = max(finished_times) - min(started_times) if (n_finished > 1 and n_started > 0) else 0.0 + throughput = n_finished / span if span > 0 else 0.0

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if n_finished > 1 and n_started > 0:

throughput = n_finished / (max(finished_times) - min(started_times))

else:

throughput = 0.0

span = max(finished_times) - min(started_times) if (n_finished > 1 and n_started > 0) else 0.0

throughput = n_finished / span if span > 0 else 0.0

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/loadtest/parse_concurrency.py` around lines 174 - 177, The throughput calculation in the concurrency parsing logic can divide by a zero duration when `max(finished_times) - min(started_times)` is 0, so update the branch in `parse_concurrency.py` to mirror the existing duration guard used later in the file. In the throughput section that uses `n_finished`, `n_started`, `finished_times`, and `started_times`, compute throughput only when the duration is greater than 0; otherwise fall back to 0.0 to avoid `ZeroDivisionError`.

coderabbitai · 2026-06-24T05:17:58Z

+    latency_s = args.latency_ms / 1000.0
+    handler = make_handler(latency_s)
+
+    server = ThreadingHTTPServer(("0.0.0.0", args.port), handler)


🔒 Security & Privacy | 🟡 Minor | ⚡ Quick win

Bind the stub to loopback, not 0.0.0.0.

The harness already pins the real server to 127.0.0.1 (run_serve.sh Line 120) and the stub URL is http://127.0.0.1:9999/v1. Binding to all interfaces needlessly exposes the canned-response endpoint on the LAN with no auth.

Proposed fix

- server = ThreadingHTTPServer(("0.0.0.0", args.port), handler) + server = ThreadingHTTPServer(("127.0.0.1", args.port), handler)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

server = ThreadingHTTPServer(("0.0.0.0", args.port), handler)

server = ThreadingHTTPServer(("127.0.0.1", args.port), handler)

🧰 Tools

🪛 Ruff (0.15.18)

[error] 93-93: Possible binding to all interfaces

(S104)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/loadtest/stub_llm_server.py` at line 93, Bind the stub LLM server to loopback instead of all interfaces by updating the ThreadingHTTPServer setup in stub_llm_server.py so the server listens on 127.0.0.1 for args.port; keep the change localized to the server initialization path and ensure any related startup or logging messages in stub_llm_server reflect the loopback-only binding.

coderabbitai · 2026-06-24T05:17:58Z

+# Prerequisites: python3, httpx installed (pip install -r requirements.txt),
+#                cargo in PATH, curl in PATH.


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Stale prerequisite: no httpx/requirements.txt needed.

Both Python tools state stdlib-only (loadgen.py Line 14, stub_llm_server.py header), so httpx installed (pip install -r requirements.txt) is wrong and sends users on a dead-end setup. Drop it.

Proposed fix

-# Prerequisites: python3, httpx installed (pip install -r requirements.txt), -# cargo in PATH, curl in PATH. +# Prerequisites: python3 (stdlib only), cargo in PATH, curl in PATH.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

# Prerequisites: python3, httpx installed (pip install -r requirements.txt),

# cargo in PATH, curl in PATH.

# Prerequisites: python3 (stdlib only), cargo in PATH, curl in PATH.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/loadtest/sweep.sh` around lines 15 - 16, The prerequisites comment in sweep.sh is stale because the load test Python tools are stdlib-only, so remove the mention of httpx and pip install -r requirements.txt from the header. Update the prerequisite text near the sweep.sh comment to list only the actual required tools (python3, cargo, curl) and keep it consistent with the headers in loadgen.py and stub_llm_server.py.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f0780accb7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-24T05:18:01Z

+if ! curl -sf "http://127.0.0.1:9999/v1/models" >/dev/null 2>&1; then
+    echo "ERROR: stub LLM server did not start" >&2
+    exit 1


Fail when the stub port is already occupied

When port 9999 is already held by a previous stub or another OpenAI-compatible service that answers /v1/models, the newly spawned stub_llm_server.py exits with EADDRINUSE but this readiness curl succeeds against the old process. The sweep then records a dead STUB_PID and runs against whatever latency/model behavior the stale server has, corrupting the throughput results; check that the spawned PID is still alive and/or fail on an occupied port before accepting readiness.

Useful? React with 👍 / 👎.

railway-app · 2026-06-24T05:23:48Z

🚅 Deployed to the ironclaw-pr-5176 environment in ironclaw-ci-preview

Service	Status	Web	Updated (UTC)
ironclaw	✅ Success (View Logs)	Web	Jun 24, 2026 at 6:13 am

Foundational design + bite-sized TDD plan for the Reborn subagent thread harness: subagents become first-class, addressable, resumable threads (parent and human can attach), with durable delivery via a single ScopedFilesystem await-edge per parent-waits-on-child (no SQL gate stack). Design hardened over 3 adversarial review rounds; PR0+PR1 plan converged over 7. Removes the superseded 2026-06-08 subagent-durability sub-spec (replaced by this design). Other subagent docs (spawn mechanics, compaction, planner) describe live shipped code and are intentionally kept. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/superpowers/plans/2026-06-23-subagent-thread-harness-pr0-pr1.md`:
- Around line 684-689: The await-edge abstraction is being defined in the
higher-level reborn crate while SubagentSpawnDeps lives in loop_support, which
creates a reverse dependency risk or cycle. Move AwaitEdgeWriter into a
lower/shared crate alongside SubagentSpawnDeps, then have
FilesystemAwaitEdgeStore in await_edge.rs implement that trait and keep the
optional await_edge_store field typed against the shared trait. Update the
SubagentSpawnDeps wiring and any finish_spawn call sites to use the relocated
trait without introducing a reborn dependency back into loop_support.
- Around line 690-691: The runtime path reference in the Task 1.6c note is
inconsistent with the rest of the plan and points to the wrong crate. Update the
production-site reference from ironclaw_reborn/src/runtime.rs to
ironclaw_reborn_composition/src/runtime.rs, and keep the surrounding
SubagentSpawnDeps guidance unchanged so contributors implement the wiring in the
correct runtime module.

In `@docs/superpowers/specs/2026-06-23-subagent-thread-harness-design.md`:
- Around line 257-260: The boot recovery description conflicts with the PR1
await-edge store contract by relying on active parent runs instead of the
store-driven recovery entrypoint. Update the recovery spec around the boot
algorithm to use list_parents_with_unclosed_edges() as the source of truth, then
for each returned parent inspect its edge records and reconcile terminal or
abandoned children via get_run_state(child) and edge closure logic. Keep the
wording aligned with the existing await-edge store symbols so the boot flow is
clearly derived from the store contract rather than a separate active-run query.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: f075e771-bba5-4c30-8d34-24b0c63d4cc2

📥 Commits

Reviewing files that changed from the base of the PR and between f0780ac and 613a307.

📒 Files selected for processing (3)

docs/reborn/2026-06-08-subagent-durability-spec.md
docs/superpowers/plans/2026-06-23-subagent-thread-harness-pr0-pr1.md
docs/superpowers/specs/2026-06-23-subagent-thread-harness-design.md

coderabbitai · 2026-06-24T06:08:15Z

+- Consumes: `FilesystemAwaitEdgeStore::write_open` + `AwaitEdge` (Tasks 1.1/1.2), `await_edge_gate_ref` (Task 1.1), `RebornRuntimeInput.subagent_v2_enabled` threaded into `SubagentSpawnDeps` (add a `subagent_v2_enabled: bool` field + optional `await_edge_store: Option<Arc<dyn AwaitEdgeWriter>>` where `AwaitEdgeWriter` is a tiny object-safe trait wrapping `write_open` — keeps `SubagentSpawnDeps` non-generic).
+- Produces: when `subagent_v2_enabled`, `finish_spawn` writes an `open` await-edge (in addition to the existing `record_awaited_child`) and parks the parent in `BlockedDependentRun` with `BlockedReason::AwaitDependentRun { gate_ref: await_edge_gate_ref(child_run_id) }`.
+
+- [ ] **Step 1: Read** `finish_spawn` (~847) and how the parent is currently blocked (`completion_observer.rs:3050` sets `BlockedDependentRun`). Note the exact block-construction call.
+- [ ] **Step 2: Define `AwaitEdgeWriter`** (object-safe) in `await_edge.rs`: `#[async_trait] pub trait AwaitEdgeWriter: Send + Sync { async fn write_open(&self, scope: &TurnScope, parent: &TurnRunId, child: &TurnRunId, edge: &AwaitEdge) -> Result<(), AwaitEdgeError>; }` and impl it for `FilesystemAwaitEdgeStore<F>`. (This thin erasure trait exists solely to keep `SubagentSpawnDeps` non-generic — add a one-line comment saying so.) Add `subagent_v2_enabled: bool` and `await_edge_store: Option<Arc<dyn AwaitEdgeWriter>>` to `SubagentSpawnDeps`.
+


📐 Maintainability & Code Quality | 🟠 Major | 🏗️ Heavy lift

Avoid a reverse crate dependency for AwaitEdgeWriter.

SubagentSpawnDeps is in crates/ironclaw_loop_support, but Line 688 defines AwaitEdgeWriter in crates/ironclaw_reborn/src/subagent/await_edge.rs and then threads it into SubagentSpawnDeps. That implies loop_support depends on reborn (or creates a cycle). Put the trait in loop_support (or another lower/shared crate) and implement it in reborn.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/superpowers/plans/2026-06-23-subagent-thread-harness-pr0-pr1.md` around lines 684 - 689, The await-edge abstraction is being defined in the higher-level reborn crate while SubagentSpawnDeps lives in loop_support, which creates a reverse dependency risk or cycle. Move AwaitEdgeWriter into a lower/shared crate alongside SubagentSpawnDeps, then have FilesystemAwaitEdgeStore in await_edge.rs implement that trait and keep the optional await_edge_store field typed against the shared trait. Update the SubagentSpawnDeps wiring and any finish_spawn call sites to use the relocated trait without introducing a reborn dependency back into loop_support.

coderabbitai · 2026-06-24T06:08:15Z

+  **Architecture-rule compliance (B2):** `Option<Arc<dyn AwaitEdgeWriter>>` is a flag-gated optional dependency, which `.claude/rules/architecture.md` rule 2 flags. Annotate the field: `// arch-exempt: optional_arc — flag-gated v2 path, set iff subagent.v2_enabled; removed when v2 becomes default (plan 2026-06-23 cleanup PR)`. **Ripple:** `SubagentSpawnDeps` is a struct-literal with ~15 test construction sites + 1 production site (`ironclaw_reborn/src/runtime.rs:556`). Every site must add `subagent_v2_enabled: false, await_edge_store: None,`. Enumerate them first: `rg -ln "SubagentSpawnDeps" crates/` then grep each file for the struct-literal construction (rustfmt may break `SubagentSpawnDeps {` across lines, so don't rely on a brace-adjacent pattern); update each in this task (mechanical two-field additions: `subagent_v2_enabled: false, await_edge_store: None,`). The production site (`ironclaw_reborn/src/runtime.rs:556`) is wired for real in Task 1.6c.
+- [ ] **Step 3: Write the failing test** — spawn with `subagent_v2_enabled=true` + a captured in-memory `AwaitEdgeWriter`; assert an `open` edge is written for the child. With `false`, assert none.


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Fix the runtime path inconsistency in the task instructions.

Line 690 points to crates/ironclaw_reborn/src/runtime.rs:556, but this plan’s file map and Task 1.6c both target crates/ironclaw_reborn_composition/src/runtime.rs. This mismatch can send implementation to the wrong crate.

Suggested doc fix

-- The production site (`ironclaw_reborn/src/runtime.rs:556`) is wired for real in Task 1.6c. +- The production site (`ironclaw_reborn_composition/src/runtime.rs`) is wired for real in Task 1.6c.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

**Architecture-rule compliance (B2):** `Option<Arc<dyn AwaitEdgeWriter>>` is a flag-gated optional dependency, which `.claude/rules/architecture.md` rule 2 flags. Annotate the field: `// arch-exempt: optional_arc — flag-gated v2 path, set iff subagent.v2_enabled; removed when v2 becomes default (plan 2026-06-23 cleanup PR)`. **Ripple:** `SubagentSpawnDeps` is a struct-literal with ~15 test construction sites + 1 production site (`ironclaw_reborn/src/runtime.rs:556`). Every site must add `subagent_v2_enabled: false, await_edge_store: None,`. Enumerate them first: `rg -ln "SubagentSpawnDeps" crates/` then grep each file for the struct-literal construction (rustfmt may break `SubagentSpawnDeps {` across lines, so don't rely on a brace-adjacent pattern); update each in this task (mechanical two-field additions: `subagent_v2_enabled: false, await_edge_store: None,`). The production site (`ironclaw_reborn/src/runtime.rs:556`) is wired for real in Task 1.6c.

- [ ] **Step 3: Write the failing test** — spawn with `subagent_v2_enabled=true` + a captured in-memory `AwaitEdgeWriter`; assert an `open` edge is written for the child. With `false`, assert none.

**Architecture-rule compliance (B2):** `Option<Arc<dyn AwaitEdgeWriter>>` is a flag-gated optional dependency, which `.claude/rules/architecture.md` rule 2 flags. Annotate the field: `// arch-exempt: optional_arc — flag-gated v2 path, set iff subagent.v2_enabled; removed when v2 becomes default (plan 2026-06-23 cleanup PR)`. **Ripple:** `SubagentSpawnDeps` is a struct-literal with ~15 test construction sites + 1 production site (`ironclaw_reborn/src/runtime.rs:556`). Every site must add `subagent_v2_enabled: false, await_edge_store: None,`. Enumerate them first: `rg -ln "SubagentSpawnDeps" crates/` then grep each file for the struct-literal construction (rustfmt may break `SubagentSpawnDeps {` across lines, so don't rely on a brace-adjacent pattern); update each in this task (mechanical two-field additions: `subagent_v2_enabled: false, await_edge_store: None,`). The production site (`ironclaw_reborn_composition/src/runtime.rs`) is wired for real in Task 1.6c.

- [ ] **Step 3: Write the failing test** — spawn with `subagent_v2_enabled=true` + a captured in-memory `AwaitEdgeWriter`; assert an `open` edge is written for the child. With `false`, assert none.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/superpowers/plans/2026-06-23-subagent-thread-harness-pr0-pr1.md` around lines 690 - 691, The runtime path reference in the Task 1.6c note is inconsistent with the rest of the plan and points to the wrong crate. Update the production-site reference from ironclaw_reborn/src/runtime.rs to ironclaw_reborn_composition/src/runtime.rs, and keep the surrounding SubagentSpawnDeps guidance unchanged so contributors implement the wiring in the correct runtime module.

coderabbitai · 2026-06-24T06:08:16Z

+- *Boot:* enumerate active (non-terminal) parent runs; for each, `list_dir` its
+  edge directory; for each open edge, `get_run_state(child)`; settle the terminal
+  ones, close `abandoned` ones whose parent run is terminal. This is driven by
+  active runs — **no global recursive scan**.


🗄️ Data Integrity & Integration | 🟠 Major | 🏗️ Heavy lift

Boot recovery algorithm conflicts with the PR1 store contract.

Line 257 says boot recovery enumerates active parent runs, but the PR1 contract defines recovery from list_parents_with_unclosed_edges() in the await-edge store (no active-run query dependency). This mismatch can cause implementation drift and missed recovery cases.

Suggested doc fix

- - *Boot:* enumerate active (non-terminal) parent runs; for each, `list_dir` its - edge directory; for each open edge, `get_run_state(child)`; settle the terminal - ones, close `abandoned` ones whose parent run is terminal. This is driven by - active runs — **no global recursive scan**. + - *Boot:* enumerate parents with unclosed await-edges from the await-edge store + (e.g., `list_parents_with_unclosed_edges`), then `list_dir` each parent edge + directory; for each open edge, `get_run_state(child)`; settle terminal ones, + and close `abandoned` when parent is terminal. This is driven by the edge + store — **no global recursive scan**.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

- *Boot:* enumerate active (non-terminal) parent runs; for each, `list_dir` its

edge directory; for each open edge, `get_run_state(child)`; settle the terminal

ones, close `abandoned` ones whose parent run is terminal. This is driven by

active runs — **no global recursive scan**.

- *Boot:* enumerate parents with unclosed await-edges from the await-edge store

(e.g., `list_parents_with_unclosed_edges`), then `list_dir` each parent edge

directory; for each open edge, `get_run_state(child)`; settle terminal ones,

and close `abandoned` when parent is terminal. This is driven by the edge

store — **no global recursive scan**.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/superpowers/specs/2026-06-23-subagent-thread-harness-design.md` around lines 257 - 260, The boot recovery description conflicts with the PR1 await-edge store contract by relying on active parent runs instead of the store-driven recovery entrypoint. Update the recovery spec around the boot algorithm to use list_parents_with_unclosed_edges() as the source of truth, then for each returned parent inspect its edge records and reconcile terminal or abandoned children via get_run_state(child) and edge closure logic. Keep the wording aligned with the existing await-edge store symbols so the boot flow is clearly derived from the store contract rather than a separate active-run query.

railway-app Bot temporarily deployed to ironclaw-ci-preview / ironclaw-pr-5176 June 24, 2026 05:10 Destroyed

github-actions Bot added scope: docs Documentation size: XL 500+ changed lines risk: low Changes to docs, tests, or low-risk modules labels Jun 24, 2026

github-actions Bot added the contributor: core 20+ merged PRs label Jun 24, 2026

coderabbitai Bot reviewed Jun 24, 2026

View reviewed changes

chatgpt-codex-connector Bot reviewed Jun 24, 2026

View reviewed changes

henrypark133 force-pushed the docs/subagent-thread-harness-design branch from f0780ac to 613a307 Compare June 24, 2026 06:01

railway-app Bot deployed to ironclaw-ci-preview / ironclaw-pr-5176 June 24, 2026 06:01 View deployment

github-actions Bot added size: XS < 10 changed lines (excluding docs) and removed size: XL 500+ changed lines labels Jun 24, 2026

coderabbitai Bot reviewed Jun 24, 2026

View reviewed changes

		- Search is hybrid FTS + vector with RRF fusion, but in the default Reborn build no embedding provider is wired, so vector search is inactive and only FTS runs.

		real `GateRef` that names the edge (e.g. `gate:subagent-await:<child_run_id>`)
		— not a synthetic placeholder, but a genuine reference to the awaited edge — so

	server = ThreadingHTTPServer(("0.0.0.0", args.port), handler)
	server = ThreadingHTTPServer(("127.0.0.1", args.port), handler)

		# Prerequisites: python3, httpx installed (pip install -r requirements.txt),
		# cargo in PATH, curl in PATH.

	# Prerequisites: python3, httpx installed (pip install -r requirements.txt),
	# cargo in PATH, curl in PATH.
	# Prerequisites: python3 (stdlib only), cargo in PATH, curl in PATH.

		Architecture-rule compliance (B2): `Option<Arc<dyn AwaitEdgeWriter>>` is a flag-gated optional dependency, which `.claude/rules/architecture.md` rule 2 flags. Annotate the field: `// arch-exempt: optional_arc — flag-gated v2 path, set iff subagent.v2_enabled; removed when v2 becomes default (plan 2026-06-23 cleanup PR)`. Ripple: `SubagentSpawnDeps` is a struct-literal with ~15 test construction sites + 1 production site (`ironclaw_reborn/src/runtime.rs:556`). Every site must add `subagent_v2_enabled: false, await_edge_store: None,`. Enumerate them first: `rg -ln "SubagentSpawnDeps" crates/` then grep each file for the struct-literal construction (rustfmt may break `SubagentSpawnDeps {` across lines, so don't rely on a brace-adjacent pattern); update each in this task (mechanical two-field additions: `subagent_v2_enabled: false, await_edge_store: None,`). The production site (`ironclaw_reborn/src/runtime.rs:556`) is wired for real in Task 1.6c.
		- [ ] Step 3: Write the failing test — spawn with `subagent_v2_enabled=true` + a captured in-memory `AwaitEdgeWriter`; assert an `open` edge is written for the child. With `false`, assert none.

-- *Boot:* enumerate active (non-terminal) parent runs; for each, `list_dir` its
-  edge directory; for each open edge, `get_run_state(child)`; settle the terminal
-  ones, close `abandoned` ones whose parent run is terminal. This is driven by
-  active runs — **no global recursive scan**.
+- *Boot:* enumerate parents with unclosed await-edges from the await-edge store
+  (e.g., `list_parents_with_unclosed_edges`), then `list_dir` each parent edge
+  directory; for each open edge, `get_run_state(child)`; settle terminal ones,
+  and close `abandoned` when parent is terminal. This is driven by the edge
+  store — **no global recursive scan**.

Uh oh!

Conversation

henrypark133 commented Jun 24, 2026

Summary

What's here

Key design decisions

Rollout

Uh oh!

gemini-code-assist Bot commented Jun 24, 2026

Uh oh!

coderabbitai Bot commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

railway-app Bot commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

coderabbitai Bot commented Jun 24, 2026 •

edited

Loading

railway-app Bot commented Jun 24, 2026 •

edited

Loading