Scope
Implement the first adapter pass from Auto Browser traces/results into the evidence requirements tracked in benchmarks/cuaverifier/stage1-manifest.json and benchmarks/online_mind2web/stage1-manifest.json.
Why it matters
The long-run product wedge is verifiable browser agency, not just browser control. External verifier/eval lanes need a clear artifact boundary.
Done when
- Trace/result fields needed by the verifier lanes are identified.
- A small adapter or design doc maps Auto Browser artifacts into those fields.
- The lane remains unscored until source revisions and deterministic subsets are pinned.
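
As a starting point for the adapter item above, here is a minimal sketch of what a first mapping pass could look like. Everything named here is an assumption for illustration: the trace keys (`goal`, `steps`, `action`, `url`, `screenshot`), the evidence field names, and the `EvidenceBundle` type are hypothetical and are not taken from either stage1 manifest or from Auto Browser's actual trace schema. The real field list should come from the manifests once the needed fields are identified.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

Trace = Dict[str, Any]
Extractor = Callable[[Trace], Optional[Any]]

# Hypothetical mapping: verifier evidence field -> extractor over a trace.
# Field names and trace keys are placeholders, not the manifests' real schema.
FIELD_MAP: Dict[str, Extractor] = {
    "task_description": lambda t: t.get("goal"),
    "final_url": lambda t: t["steps"][-1].get("url") if t.get("steps") else None,
    "action_history": lambda t: [s.get("action") for s in t.get("steps", [])] or None,
    "screenshots": lambda t: [
        s["screenshot"] for s in t.get("steps", []) if s.get("screenshot")
    ] or None,
}


@dataclass
class EvidenceBundle:
    lane: str                  # e.g. "cuaverifier" or "online_mind2web"
    evidence: Dict[str, Any]   # fields the trace could supply
    missing: List[str]         # fields the trace could not supply
    scored: bool = False       # stays False until revisions and subsets are pinned


def adapt_trace(
    trace: Trace, lane: str, field_map: Dict[str, Extractor] = FIELD_MAP
) -> EvidenceBundle:
    """Map one trace into evidence fields and report what is missing."""
    evidence: Dict[str, Any] = {}
    missing: List[str] = []
    for name, extract in field_map.items():
        value = extract(trace)
        if value is None:
            missing.append(name)
        else:
            evidence[name] = value
    return EvidenceBundle(lane=lane, evidence=evidence, missing=missing)


if __name__ == "__main__":
    sample = {
        "goal": "Find the pricing page",
        "steps": [{"action": "click", "url": "https://example.com/pricing"}],
    }
    bundle = adapt_trace(sample, lane="online_mind2web")
    print(bundle.missing, bundle.scored)
```

Reporting `missing` explicitly, rather than failing on the first absent field, keeps the artifact boundary visible: each lane can see which evidence Auto Browser does not yet emit. The adapter never sets `scored`, matching the requirement that the lane stay unscored for now.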