Status: Complete
Owner: Core maintainers
Last updated: 2026-01-03
This document is the single source of truth for delivering the BAQT super-framework to v1.0. All work must map to this plan. Anything not listed here is out of scope for v1.0.
In scope:
- Fully automated execution with blocking human-in-the-loop (HITL) gates.
- Pre-development phases default to manual; an optional CLI menu enables automation with HITL gates once approved.
- End-to-end execution of real BMAD workflows with correct artifacts.
- TELIS token efficiency controls and validation gates.
- QUINT evidence and DRR workflows with auditable trails.
- Provider routing across Ollama, LiteLLM, OpenAI, Claude, Gemini, Groq.
- CLI-first approval and control.
Out of scope for v1.0:
- GUI or web-based approvals.
- Long-running distributed execution across multiple hosts.
- Multi-tenant user management.
- Full IDE plugin deployment and installation automation beyond CLI.
Change control:
- Any new requirement must be added here with a linked justification and an owner.
- No implementation work begins without a checklist item.
- CHANGELOG.md must be updated for any non-doc change; enforce via pre-commit gate.
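The CHANGELOG rule above can be sketched as a small check that a pre-commit hook might call. This is a minimal illustration, not the contents of .husky/pre-commit: the function names and the definition of a "doc change" (anything under docs/ or with a documentation suffix) are assumptions.

```python
from pathlib import PurePosixPath

# Assumption: files under docs/ or with these suffixes count as doc-only changes.
DOC_SUFFIXES = {".md", ".rst", ".txt"}


def is_doc_change(path: str) -> bool:
    """Return True if the path looks like documentation (hypothetical rule)."""
    p = PurePosixPath(path)
    return p.parts[:1] == ("docs",) or p.suffix.lower() in DOC_SUFFIXES


def changelog_gate(staged: list[str]) -> bool:
    """Return True when the commit may proceed.

    A commit passes if it touches only docs, or if CHANGELOG.md is staged
    alongside the non-doc changes.
    """
    non_doc = [f for f in staged if not is_doc_change(f)]
    return not non_doc or "CHANGELOG.md" in staged
```

A hook could pass the output of `git diff --cached --name-only` to `changelog_gate` and exit non-zero when it returns False.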
- BMAD-METHOD (workflows, agents, output conventions)
- TELIS (LSP, shards, progressive negotiation, validation gates)
- QUINT (ADI cycle, evidence levels, DRR, WLNK, congruence, decay)
- docs/a-practical-guide-to-building-agents.md (model-tool-instructions, guardrails, HITL triggers)
- docs/traceability-audit.md (requirement mapping and evidence)
- At least one BMAD workflow executes end-to-end and produces required outputs and templates.
- HITL gates block at the planning, architecture, and release stages and can be approved and resumed from the CLI.
- TELIS policy is enforced: LSP symbiosis, shards, progressive negotiation, validation gates.
- QUINT evidence and DRR records are generated and linked to decisions.
- Provider routing is robust with retries, backoff, and error normalization.
- Mapping and registry generation excludes sample/reference workflows and lists explicit outputs/templates only.
- CI passes, including integration tests and a coverage gate of >= 85 percent.
Status values: todo, in-progress, done, partial, missing, blocked
Full traceability with evidence is maintained in docs/traceability-audit.md. This matrix must stay in sync.
| ID | Source | Requirement | Owner | Status | Acceptance Evidence |
|---|---|---|---|---|---|
| REQ-BMAD-001 | BMAD | Preserve BMAD workflows and naming without rewriting logic | Runtime | done | tests/test_runtime_bmad_execution.py |
| REQ-BMAD-002 | BMAD | Sample/reference workflows excluded from production mapping | Mapping | done | tests/test_mapping.py |
| REQ-BMAD-003 | BMAD | Only explicit outputs/templates listed in mapping | Mapping | done | tests/test_mapping.py |
| REQ-BMAD-004 | BMAD | Parse workflow definitions (md/yaml/xml) into steps | Runtime | done | tests/test_workflow_parser.py |
| REQ-BMAD-005 | BMAD | Enforce output folder/layout conventions | Runtime | done | runtime/engine.py, tests/test_runtime_engine.py |
| REQ-BMAD-006 | BMAD | Orchestrator routes workflows and agents | Runtime | done | runtime/orchestrator/engine.py, runtime/orchestrator/router.py, tests/test_orchestrator_engine.py |
| REQ-BMAD-007 | BMAD | Support all BMAD modules (core, BMM, BMB, CIS, BMGD) | Runtime | done | tests/test_runtime_bmad_execution.py |
| REQ-INSTALL-001 | Distribution | Single-command install for full framework (BMAD + QUINT + BAQT) | Release | done | installer/core.py, installer/cli.py, installer/verify.py, bin/baqt.js, docs/installation-guide.md |
| REQ-TELIS-001 | TELIS | LSP symbiosis for type/signature accuracy | Tools | done | runtime/tools/lsp.py, runtime/telis/context.py, runtime/telis/manager.py, runtime/engine.py, tests/test_runtime_telis_manager.py |
| REQ-TELIS-002 | TELIS | LSP fallback to shards on failure | Tools | done | runtime/telis/context.py, runtime/telis/manager.py, runtime/engine.py, tests/test_runtime_telis_manager.py |
| REQ-TELIS-003 | TELIS | Tiered knowledge shards with token budgets | TELIS | done | runtime/telis/shards.py, runtime/telis/manager.py, runtime/engine.py, tests/test_runtime_telis_manager.py |
| REQ-TELIS-004 | TELIS | Progressive context negotiation protocol | TELIS | done | runtime/telis/negotiation.py, runtime/telis/manager.py, runtime/engine.py, tests/test_runtime_telis_manager.py |
| REQ-TELIS-005 | TELIS | AST/type/lint validation pipeline | TELIS | done | runtime/tools/validation.py, runtime/engine.py, tests/test_runtime_validation.py, tests/test_runtime_engine.py |
| REQ-TELIS-006 | TELIS | Behavioral cache with TTL and invalidation | TELIS | done | runtime/telis/cache.py, runtime/telis/manager.py, runtime/engine.py, tests/test_runtime_telis_manager.py |
| REQ-TELIS-007 | TELIS | Validation failures trigger retry/escalation | TELIS | done | runtime/engine.py, tests/test_runtime_engine.py |
| REQ-QUINT-001 | QUINT | Evidence store for L0/L1/L2 and invalid | QUINT | done | runtime/quint/store.py, tests/test_runtime_quint_evidence.py |
| REQ-QUINT-002 | QUINT | ADI cycle with promotion rules | QUINT | done | runtime/quint/adi.py, tests/test_runtime_quint_adi.py |
| REQ-QUINT-003 | QUINT | WLNK assurance scoring | QUINT | done | runtime/quint/assurance.py, tests/test_runtime_quint_assurance.py |
| REQ-QUINT-004 | QUINT | Congruence scoring for external evidence | QUINT | done | runtime/quint/assurance.py, tests/test_runtime_quint_assurance.py |
| REQ-QUINT-005 | QUINT | Evidence decay and revalidation | QUINT | done | runtime/quint/decay.py, tests/test_runtime_quint_decay.py |
| REQ-QUINT-006 | QUINT | DRR generation for major decisions | QUINT | done | runtime/quint/drr.py, tests/test_runtime_quint_drr.py |
| REQ-QUINT-007 | QUINT | Surface vs grounding separation | QUINT | done | runtime/quint/drr.py, runtime/storage.py, tests/test_quint_surface_grounding.py |
| REQ-QUINT-008 | QUINT | Bounded context snapshot and drift detection | QUINT | done | runtime/quint/snapshot.py, runtime/quint/drift.py, runtime/engine.py, tests/test_quint_snapshot_drift.py |
| REQ-AGENT-001 | Practical Guide | Model/tool/instructions triad per agent | Runtime | done | runtime/agents.py, runtime/execution.py, tests/test_agents_registry.py |
| REQ-AGENT-002 | Practical Guide | Standardized tool definitions and reuse | Runtime | done | runtime/tools/registry.py, runtime/tools/pipeline.py, tests/test_tools_registry.py |
| REQ-AGENT-003 | Practical Guide | Tool risk ratings and safeguards | Runtime | done | runtime/tools/base.py, runtime/tools/pipeline.py, runtime/engine.py, runtime/guardrails/checks.py, tests/test_runtime_guardrails.py |
| REQ-AGENT-004 | Practical Guide | PII filter and data privacy guardrails | Runtime | done | runtime/guardrails/checks.py, runtime/engine.py, tests/test_runtime_guardrails.py, tests/test_runtime_engine.py |
| REQ-AGENT-005 | Practical Guide | Moderation filters for unsafe inputs | Runtime | done | runtime/guardrails/checks.py, runtime/engine.py, tests/test_runtime_guardrails.py |
| REQ-AGENT-006 | Practical Guide | Rules-based protections (blocklists/regex) | Runtime | done | runtime/guardrails/checks.py, runtime/engine.py, tests/test_runtime_guardrails.py |
| REQ-AGENT-007 | Practical Guide | HITL on high-risk actions and retry thresholds | Runtime | done | runtime/gates.py, runtime/engine.py, runtime/tools/pipeline.py, config/runtime.yaml, tests/test_runtime_gates.py |
| REQ-AGENT-008 | Practical Guide | Optimistic execution with concurrent guardrails | Runtime | done | runtime/guardrails/concurrent.py, runtime/engine.py, tests/test_guardrails_concurrent.py |
| REQ-SPEC-001 | Unified Spec | Event bus for workflow state transitions | Runtime | done | runtime/events/bus.py, runtime/events/handlers.py, runtime/engine.py, tests/test_events_bus.py |
| REQ-SPEC-002 | Unified Spec | State store for workflow progress and artifacts | Runtime | done | runtime/engine.py, runtime/storage.py, tests/test_runtime_artifacts_events.py |
| REQ-SPEC-003 | Unified Spec | TELIS policy engine and context manager | TELIS | done | runtime/telis/manager.py, runtime/engine.py, tests/test_runtime_telis_manager.py |
| REQ-SPEC-004 | Unified Spec | Evidence store for QUINT claims and DRRs | QUINT | done | runtime/quint/store.py, runtime/quint/drr.py, tests/test_runtime_quint_evidence.py, tests/test_runtime_quint_drr.py |
| REQ-SPEC-005 | Unified Spec | Control plane/data plane split | Runtime | done | runtime/plugins/control_plane.py, runtime/plugins/data_plane.py, runtime/plugins/manager.py, tests/test_plugins_control_data.py |
| REQ-SPEC-006 | Unified Spec | Plugin pipeline for policy/adapters/observability | Runtime | done | runtime/plugins/implementations, runtime/logging/emitter.py, tests/test_plugins_control_data.py |
| REQ-SPEC-007 | Unified Spec | Failure isolation with bounded retries | Runtime | done | runtime/failure/isolation.py, runtime/tools/pipeline.py, runtime/engine.py, tests/test_failure_isolation.py |
| REQ-SPEC-008 | Unified Spec | Artifact index with checksum and provenance | Runtime | done | runtime/engine.py, runtime/storage.py, tests/test_runtime_artifacts_events.py |
| REQ-SPEC-009 | Unified Spec | Context fingerprint tracking | Runtime | done | runtime/quint/fingerprint.py, runtime/engine.py, tests/test_quint_fingerprint.py |
| REQ-SPEC-010 | Unified Spec | Gates recorded as DRRs with evidence links | Runtime | done | runtime/gates.py, runtime/quint/drr.py, runtime/engine.py, tests/test_gates_as_drrs.py |
| REQ-SPEC-011 | Unified Spec | Tool execution pipeline (registry, gating, results) | Runtime | done | runtime/tools/pipeline.py, runtime/engine.py, tests/test_runtime_tool_pipeline.py, tests/test_runtime_engine.py |
- Confirm this plan as source of truth and add change-control note to README. (done: README.md)
- Align branch policy: development is primary, main is release only. (done: README.md)
- Ensure SemVer is enforced in VERSION and release notes. (done: VERSION, docs/versioning.md, CHANGELOG.md, tests/test_versioning.py)
- Enforce Conventional Commits via commit-msg hook. (done: .husky/commit-msg)
- Verify CI uses latest compatible actions and locks versions. (done: .github/workflows/ci.yml; verified 2025-12-25 via GitHub releases API)
- Require CHANGELOG.md update for non-doc changes via pre-commit gate. (done: .husky/pre-commit)
Deliverables:
- docs/v1-plan.md
- README links to plan
- docs/unified-framework-documentation-index.md updated
- Define schemas for RunManifest, RunStep, ArtifactIndex, EvidenceLink, HumanGate. (done: runtime/schemas.py)
- Define event schema for run lifecycle and gate state. (done: runtime/schemas.py)
- Store all schema outputs in a stable JSON format. (done: docs/runtime-schemas.json)
- Add schema validation tests. (done: tests/test_runtime_schemas.py)
Deliverables:
- runtime/models.py updates
- tests for schema round trips
- runtime/schemas.py
- docs/runtime-schemas.json
- tests/test_runtime_schemas.py
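To illustrate what a schema with a stable JSON form and a round-trip test could look like, here is a minimal sketch using dataclasses. The field names are assumptions chosen for illustration; the actual definitions live in runtime/schemas.py and may differ.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RunStep:
    # Hypothetical fields; the real RunStep schema may carry more.
    step_id: str
    status: str = "pending"
    outputs: list[str] = field(default_factory=list)


@dataclass
class RunManifest:
    run_id: str
    workflow: str
    steps: list[RunStep] = field(default_factory=list)

    def to_json(self) -> str:
        # sort_keys keeps the serialized form stable across runs.
        return json.dumps(asdict(self), sort_keys=True, indent=2)

    @classmethod
    def from_json(cls, text: str) -> "RunManifest":
        data = json.loads(text)
        steps = [RunStep(**s) for s in data.pop("steps", [])]
        return cls(steps=steps, **data)
```

A round-trip test then asserts that `RunManifest.from_json(m.to_json()) == m` for a representative manifest.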
- Parse BMAD workflow definitions (md/yaml/xml) into canonical steps. (done: runtime/workflow_parser.py, tests/test_workflow_parser.py)
- Extract explicit outputs/templates only, exclude samples/references. (done: methodology/tools/mapping.py, tests/test_mapping.py)
- Generate registry and mapping files deterministically. (done: methodology/tools/mapping.py, methodology/tools/generate_mapping.py, tests/test_mapping.py)
- Add mapping accuracy tests for production workflows. (done: tests/test_mapping.py)
Deliverables:
- methodology/mapping/registry-workflows.json
- methodology/mapping/integration-mapping.json
- tests/test_mapping.py
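Deterministic generation with sample/reference exclusion can be sketched as below. The input shape (a list of dicts with a `kind` key) is an assumption made for this example and is not taken from methodology/tools/mapping.py.

```python
import json

# Assumption: each parsed workflow carries a "kind" marker.
EXCLUDED_KINDS = {"sample", "reference"}


def build_registry(workflows: list[dict]) -> str:
    """Serialize production workflows with explicit outputs/templates only.

    Sorting entries and keys makes the output byte-identical regardless of
    the order in which workflows were discovered.
    """
    entries = [
        {
            "name": w["name"],
            "outputs": sorted(w.get("outputs", [])),
            "templates": sorted(w.get("templates", [])),
        }
        for w in workflows
        if w.get("kind", "production") not in EXCLUDED_KINDS
    ]
    entries.sort(key=lambda e: e["name"])
    return json.dumps({"workflows": entries}, sort_keys=True, indent=2)
```

Because the result does not depend on discovery order, a regenerated registry file only changes in version control when the workflows themselves change.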
- Build a state machine for step progression (pending, running, blocked, failed, completed).
- Implement step contract with inputs, outputs, tools, and validation hooks. (done: runtime/models.py, runtime/engine.py, runtime/tools/pipeline.py, docs/runtime-step-contract.md, tests/test_runtime_engine.py, tests/test_runtime_models.py)
- Persist run state after each step and on errors. (done: runtime/engine.py, runtime/storage.py, tests/test_runtime_engine.py)
- Implement resume semantics from the last incomplete step. (done: runtime/engine.py, tests/test_runtime_engine.py, docs/runtime-step-contract.md)
- Enforce step timeouts and bounded retries.
Deliverables:
- runtime/engine.py execution path
- runtime/execution.py step execution contract
- docs/runtime-step-contract.md
- tests/test_runtime_engine.py
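The step states listed above suggest a small transition table. The sketch below is one possible reading: the set of allowed transitions is an assumption, and runtime/engine.py may permit a different set.

```python
from enum import Enum
from typing import Optional


class StepState(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    BLOCKED = "blocked"
    FAILED = "failed"
    COMPLETED = "completed"


# Assumed transition table: a blocked step resumes once its gate is approved,
# and a failed step may be retried by moving back to running.
ALLOWED = {
    StepState.PENDING: {StepState.RUNNING},
    StepState.RUNNING: {StepState.BLOCKED, StepState.FAILED, StepState.COMPLETED},
    StepState.BLOCKED: {StepState.RUNNING, StepState.FAILED},
    StepState.FAILED: {StepState.RUNNING},
    StepState.COMPLETED: set(),
}


def advance(current: StepState, target: StepState) -> StepState:
    """Return the target state, or raise if the transition is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


def resume_index(states: list[StepState]) -> Optional[int]:
    """Index of the first step that has not completed, or None if all are done."""
    for i, state in enumerate(states):
        if state is not StepState.COMPLETED:
            return i
    return None
```

`resume_index` models "resume from the last incomplete step" as the earliest step whose state is not completed, which assumes steps run strictly in order.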
- Define tool interface with risk rating (low, medium, high). (done: runtime/tools/base.py)
- Implement safe file IO and repo operations. (done: runtime/tools/file_io.py, runtime/tools/repo_tool.py)
- Implement LSP query adapters for supported languages. (done: runtime/tools/lsp.py)
- Implement AST and type validation runners. (done: runtime/tools/validation.py, tests/test_runtime_validation.py)
- Add an allow-list and gate high-risk tools behind HITL. (done: runtime/tools/pipeline.py)
- Define and implement the tool execution pipeline (registry, gating, results). (done: runtime/tools/pipeline.py, runtime/engine.py)
Deliverables:
- runtime/tools/*
- validation executors
- tool risk policy
- docs/tool-execution-pipeline.md
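The allow-list and risk-rating rules can be pictured as a single authorization decision made before a tool runs. This is a hedged sketch: `Tool`, `authorize`, and the three return values are hypothetical names, not the interface in runtime/tools/base.py.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    risk: str  # one of "low", "medium", "high"


def authorize(tool: Tool, allow_list: set[str], approved: bool = False) -> str:
    """Decide whether a tool call may run.

    Returns "deny" for tools outside the allow-list, "needs_approval" for
    high-risk tools without a human approval, and "allow" otherwise.
    """
    if tool.name not in allow_list:
        return "deny"
    if tool.risk == "high" and not approved:
        return "needs_approval"
    return "allow"
```

Checking the allow-list first means an unlisted tool is rejected outright and never reaches a human gate.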
- Implement shard registry, retrieval, and tier budget enforcement.
- Implement LSP symbiosis routing and compression.
- Implement progressive negotiation protocol.
- Implement behavioral cache with TTL and invalidation.
- Enforce validation gates before output acceptance. (done: runtime/engine.py, tests/test_runtime_engine.py)
Deliverables:
- runtime/telis/*
- validation gate hooks
- tests for LSP/shard/cache
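As one example from this workstream, a cache with TTL and invalidation might look like the sketch below. The class name, prefix-based invalidation, and the injectable clock are assumptions for illustration; runtime/telis/cache.py may be organized differently.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._items: dict[str, tuple[float, Any]] = {}

    def put(self, key: str, value: Any) -> None:
        self._items[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily on read.
            del self._items[key]
            return None
        return value

    def invalidate(self, prefix: str = "") -> int:
        """Remove every key starting with prefix; return how many were removed."""
        doomed = [k for k in self._items if k.startswith(prefix)]
        for k in doomed:
            del self._items[k]
        return len(doomed)
```

Passing a fake clock keeps expiry tests fast and deterministic, with no need to sleep.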
- Implement evidence store with L0, L1, L2 levels.
- Implement ADI promotion rules and invalidation handling.
- Implement WLNK and congruence scoring.
- Implement evidence decay and revalidation checks.
- Implement DRR generation and linking to decisions. (done: runtime/quint/drr.py, runtime/storage.py, tests/test_runtime_quint_drr.py)
Deliverables:
- runtime/quint/*
- DRR records stored per run
- tests for evidence and DRR
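A rough sketch of how evidence levels, decay, and WLNK scoring could interact follows. It rests on two assumptions this plan does not spell out: that WLNK is a weakest-link rule (a decision is only as assured as its weakest supporting evidence), and that expired evidence is treated as L0 until revalidated. All names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date

# Ordering of evidence levels, lowest assurance first.
LEVEL_RANK = {"invalid": -1, "L0": 0, "L1": 1, "L2": 2}


@dataclass(frozen=True)
class Evidence:
    claim: str
    level: str  # "L0", "L1", "L2", or "invalid"
    valid_until: date


def effective_level(ev: Evidence, today: date) -> str:
    """Assumed decay rule: evidence past valid_until counts as L0."""
    if ev.level != "invalid" and today > ev.valid_until:
        return "L0"
    return ev.level


def weakest_link(evidence: list[Evidence], today: date) -> str:
    """Assumed WLNK rule: the lowest effective level among the evidence."""
    if not evidence:
        return "L0"  # no evidence means no assurance beyond conjecture
    levels = (effective_level(e, today) for e in evidence)
    return min(levels, key=LEVEL_RANK.__getitem__)
```

Under these assumptions, a single expired or invalid item pulls the whole decision down, which is what makes revalidation checks worth running.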
- Normalize provider request/response models. (done: runtime/providers/base.py, runtime/providers/registry.py, tests/test_runtime_providers.py)
- Implement retries, backoff, and timeouts. (done: runtime/providers/reliability.py, tests/test_runtime_provider_reliability.py)
- Add circuit-breaker logic and rate limit handling. (done: runtime/providers/reliability.py, runtime/providers/http.py, tests/test_runtime_http.py)
- Implement streaming support and chunked responses. (done: runtime/providers/registry.py, runtime/providers/http.py, tests/test_runtime_providers.py)
- Add provider-specific response parsing. (done: runtime/providers/registry.py, runtime/providers/http.py, tests/test_runtime_providers.py)
Deliverables:
- runtime/providers/* improvements
- tests for provider behavior
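Retries with exponential backoff can be sketched independently of any one provider. The helper below is illustrative: `ProviderError`, its `retryable` flag, and the default delays are assumptions, not the API of runtime/providers/reliability.py.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ProviderError(Exception):
    """Normalized provider failure (hypothetical type)."""

    def __init__(self, message: str, retryable: bool = True):
        super().__init__(message)
        self.retryable = retryable


def call_with_retries(
    call: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Invoke call(), retrying retryable failures with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except ProviderError as exc:
            if not exc.retryable or attempt == max_attempts:
                raise
            # Delay doubles each attempt: base, 2*base, 4*base, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

A production version would likely add jitter and a circuit breaker around this loop; both are omitted here to keep the sketch short.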
- Define gate policy by phase and risk level. (done: runtime/gates.py, config/runtime.yaml, tests/test_runtime_gates.py)
- Enforce blocking gates at planning, architecture, release. (done: runtime/gates.py, runtime/engine.py, tests/test_runtime_engine.py)
- Enforce conditional gates for implementation review, tests, security risk. (done: runtime/gates.py, config/runtime.yaml, tests/test_runtime_gates.py)
- Add CLI approvals and audit logs. (done: cli/main.py, runtime/engine.py, runtime/storage.py, tests/test_runtime_engine.py)
Deliverables:
- runtime/gates.py updates
- CLI approve/resume flows
- tests for gate consistency
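The gate policy above distinguishes always-blocking phases from conditional ones. The sketch below shows one way to express that. The phase identifiers and the rule that conditional gates fire only at high risk are assumptions; the actual policy is configured in config/runtime.yaml.

```python
from dataclasses import dataclass, field

BLOCKING_PHASES = {"planning", "architecture", "release"}
# Assumed identifiers for the conditional gates named in this plan.
CONDITIONAL_PHASES = {"implementation-review", "tests", "security"}


def gate_required(phase: str, risk: str = "low") -> bool:
    """Blocking phases always gate; conditional phases gate only at high risk."""
    if phase in BLOCKING_PHASES:
        return True
    if phase in CONDITIONAL_PHASES:
        return risk == "high"
    return False


@dataclass
class HumanGate:
    phase: str
    status: str = "pending"
    audit: list[str] = field(default_factory=list)

    def approve(self, approver: str) -> None:
        """Record the approval so the run can resume and the decision is auditable."""
        self.status = "approved"
        self.audit.append(f"{self.phase} approved by {approver}")


def can_proceed(gate: HumanGate) -> bool:
    return gate.status == "approved"
```

Keeping the audit entries on the gate itself makes it straightforward to persist them with the run state.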
- Index every output and template with checksum.
- Link artifacts to step, evidence, and gate state.
- Emit run timeline JSON for auditability.
Deliverables:
- runtime/storage.py enhancements
- tests for artifact index
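Checksum-based indexing can be illustrated with the standard library alone. The entry layout below (SHA-256 digest, step id, size) is an assumption for this example; runtime/storage.py may record additional provenance fields.

```python
import hashlib


def index_artifact(index: dict, path: str, content: bytes, step_id: str) -> dict:
    """Record an artifact's checksum and the step that produced it."""
    entry = {
        "sha256": hashlib.sha256(content).hexdigest(),
        "step_id": step_id,
        "size": len(content),
    }
    index[path] = entry
    return entry


def verify_artifact(index: dict, path: str, content: bytes) -> bool:
    """Return True if content matches the checksum recorded for path."""
    entry = index.get(path)
    return entry is not None and entry["sha256"] == hashlib.sha256(content).hexdigest()
```

Verifying against the index on resume or export detects artifacts that were edited or truncated after the step completed.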
- Add commands: run, resume, approve, status, list, export.
- Add config validation and provider listing.
- Add run history summary.
Deliverables:
- cli/main.py updates
- CLI tests
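The command set above maps naturally onto argparse subcommands. The sketch below is illustrative only: the program name, positional arguments, and the `--gate` flag are assumptions and may not match cli/main.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subcommand per CLI verb (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="baqt")
    sub = parser.add_subparsers(dest="command", required=True)

    run = sub.add_parser("run", help="start a workflow run")
    run.add_argument("workflow")

    # Commands that operate on an existing run share the same argument.
    for name in ("resume", "status", "export"):
        cmd = sub.add_parser(name)
        cmd.add_argument("run_id")

    approve = sub.add_parser("approve", help="approve a pending HITL gate")
    approve.add_argument("run_id")
    approve.add_argument("--gate", required=True)

    sub.add_parser("list", help="list known runs")
    return parser
```

With this layout, `build_parser().parse_args(["approve", "run-1", "--gate", "planning"])` yields a namespace whose `command`, `run_id`, and `gate` attributes a dispatcher can act on.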
- Implement PII filter and output validation hooks.
- Implement moderation checks for high-risk content.
- Enforce tool risk safeguards with HITL.
- Add rules-based protections (blocklists, regex checks).
Deliverables:
- runtime/guardrails/*
- tests for guardrail triggers
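A rules-based PII filter can be as simple as a table of labeled regular expressions. The two patterns below (email addresses and US-style SSNs) are illustrative assumptions and far from exhaustive; they are not taken from runtime/guardrails/checks.py.

```python
import re

# Hypothetical rule table: label -> compiled pattern.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str) -> tuple[str, list[str]]:
    """Replace PII matches with placeholders and report which rules fired."""
    hits = []
    for label, pattern in PII_PATTERNS.items():
        text, count = pattern.subn(f"[REDACTED:{label}]", text)
        if count:
            hits.append(label)
    return text, hits
```

Returning the list of triggered rules alongside the redacted text lets the caller log the guardrail event or escalate to a HITL gate.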
- Structured logs for run, step, gate, evidence, validation.
- Event emission for external tooling.
- Export run report artifacts.
Deliverables:
- runtime/logging/*
- run report JSON
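Structured logs for external tooling are often emitted as one JSON object per line. The sketch below shows that shape; the field names (`kind`, `run_id`, `ts`) are assumptions and may not match runtime/logging/emitter.py.

```python
import json


def format_event(kind: str, run_id: str, timestamp: str, **fields) -> str:
    """Serialize one log event as a single JSON line with stable key order."""
    record = {"kind": kind, "run_id": run_id, "ts": timestamp, **fields}
    return json.dumps(record, sort_keys=True)
```

Taking the timestamp as a parameter, rather than reading the clock inside the function, keeps the output reproducible in tests.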
- End-to-end test for at least one BMAD workflow.
- Integration tests for TELIS and QUINT gates.
- Provider contract tests (mocked).
- Coverage gate >= 85 percent.
- CI passes on development with submodules.
Deliverables:
- tests/* integration suite
- CI green
- Package a full framework install through the BMAD installer (`npx bmad-method@alpha install`).
- Ensure QUINT and BAQT assets are included alongside BMAD slash commands.
- Add an installation verification test or checklist.
Deliverables:
- Installer integration notes
- Install validation check
- Final documentation sweep and index update.
- Version bump to 1.0.0 and changelog.
- Tag release on main after development is green.
- Release checklist sign-off.
Deliverables:
- VERSION updated
- release notes
- tags on main
- Confirm plan is the source of truth and update README link
- Complete runtime data model and schemas
- Implement BMAD workflow parsing and canonical registry
- Build step execution state machine
- Implement tool adapters, execution pipeline, and risk policy
- Implement TELIS LSP, shards, negotiation, cache, and validation gates (LSP adapter done)
- Implement QUINT evidence engine, WLNK, congruence, decay, and DRR
- Harden provider routing with retries and circuit breakers
- Enforce HITL gating with CLI approvals
- Add artifact indexing and run timeline
- Expand CLI (list, export, history)
- Implement guardrails (PII, moderation, rules)
- Add observability and run reports
- Integration tests for at least one BMAD workflow
- CI green with coverage >= 85 percent
- Package full framework installer (BMAD + QUINT + BAQT)
- Release docs and tag v1.0.0 on main
We will not claim full coverage until the traceability audit is complete. The current audit lives at docs/traceability-audit.md and must be updated after each workstream.
Required checks:
- Cross-check all BMAD workflows and outputs against the registry.
- Map TELIS requirements to concrete runtime modules and tests.
- Map QUINT requirements to evidence, DRR, and audit outputs.
- Map Practical Guide guardrails to runtime guardrail implementations.
The traceability audit must be marked done before v1.0 release.