BAQT v1.0 Release Plan

Status: Complete
Owner: Core maintainers
Last updated: 2026-01-03

Purpose

This document is the single source of truth for delivering the BAQT super-framework to v1.0. All work must map to this plan. Anything not listed here is out of scope for v1.0.

Scope Lock

In scope:

  • Fully automated execution with blocking human-in-the-loop (HITL) gates.
  • Pre-development phases default to manual execution; an optional CLI menu enables automation with HITL gates once approved.
  • End-to-end execution of real BMAD workflows with correct artifacts.
  • TELIS token efficiency controls and validation gates.
  • QUINT evidence and DRR workflows with auditable trails.
  • Provider routing across Ollama, LiteLLM, OpenAI, Claude, Gemini, Groq.
  • CLI-first approval and control.

Out of scope for v1.0:

  • GUI or web-based approvals.
  • Long-running distributed execution across multiple hosts.
  • Multi-tenant user management.
  • Full IDE plugin deployment and installation automation beyond CLI.

Change control:

  • Any new requirement must be added here with a linked justification and an owner.
  • No implementation work begins without a checklist item.
  • CHANGELOG.md must be updated for any non-doc change; enforce via pre-commit gate.
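
The CHANGELOG rule above amounts to a predicate over the staged file list. The following is a minimal sketch of that predicate, not the contents of the actual hook: the function names and the definition of a "doc change" (anything under docs/ or with a documentation suffix) are assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Assumption: files under docs/ or with these suffixes count as documentation.
DOC_SUFFIXES = {".md", ".rst", ".txt"}


def is_doc_change(path: str) -> bool:
    """Classify a staged path as documentation-only (hypothetical rule)."""
    p = PurePosixPath(path)
    return p.parts[:1] == ("docs",) or p.suffix.lower() in DOC_SUFFIXES


def changelog_gate_passes(staged: list[str]) -> bool:
    """Return True if the commit may proceed under the CHANGELOG rule."""
    non_doc = [f for f in staged if not is_doc_change(f)]
    # Doc-only commits pass; anything else must also stage CHANGELOG.md.
    return not non_doc or "CHANGELOG.md" in staged
```

A hook could feed this function the output of `git diff --cached --name-only` and exit non-zero when it returns False.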

Sources of Truth

  • BMAD-METHOD (workflows, agents, output conventions)
  • TELIS (LSP, shards, progressive negotiation, validation gates)
  • QUINT (ADI cycle, evidence levels, DRR, WLNK, congruence, decay)
  • docs/a-practical-guide-to-building-agents.md (model-tool-instructions, guardrails, HITL triggers)
  • docs/traceability-audit.md (requirement mapping and evidence)

v1.0 Definition of Done

  • At least one BMAD workflow executes end-to-end and produces required outputs and templates.
  • HITL gates block on planning, architecture, and release stages and can be approved and resumed by CLI.
  • TELIS policy is enforced: LSP symbiosis, shards, progressive negotiation, validation gates.
  • QUINT evidence and DRR records are generated and linked to decisions.
  • Provider routing is robust with retries, backoff, and error normalization.
  • Mapping and registry generation excludes sample/reference workflows and lists explicit outputs/templates only.
  • CI passes, including integration tests, with a coverage gate of >= 85 percent.

Requirement Coverage Matrix

Status values: todo, in-progress, done, partial, missing, blocked

Full traceability with evidence is maintained in docs/traceability-audit.md. This matrix must stay in sync.

| ID | Source | Requirement | Owner | Status | Acceptance Evidence |
| --- | --- | --- | --- | --- | --- |
| REQ-BMAD-001 | BMAD | Preserve BMAD workflows and naming without rewriting logic | Runtime | done | tests/test_runtime_bmad_execution.py |
| REQ-BMAD-002 | BMAD | Sample/reference workflows excluded from production mapping | Mapping | done | tests/test_mapping.py |
| REQ-BMAD-003 | BMAD | Only explicit outputs/templates listed in mapping | Mapping | done | tests/test_mapping.py |
| REQ-BMAD-004 | BMAD | Parse workflow definitions (md/yaml/xml) into steps | Runtime | done | tests/test_workflow_parser.py |
| REQ-BMAD-005 | BMAD | Enforce output folder/layout conventions | Runtime | done | runtime/engine.py, tests/test_runtime_engine.py |
| REQ-BMAD-006 | BMAD | Orchestrator routes workflows and agents | Runtime | done | runtime/orchestrator/engine.py, runtime/orchestrator/router.py, tests/test_orchestrator_engine.py |
| REQ-BMAD-007 | BMAD | Support all BMAD modules (core, BMM, BMB, CIS, BMGD) | Runtime | done | tests/test_runtime_bmad_execution.py |
| REQ-INSTALL-001 | Distribution | Single-command install for full framework (BMAD + QUINT + BAQT) | Release | done | installer/core.py, installer/cli.py, installer/verify.py, bin/baqt.js, docs/installation-guide.md |
| REQ-TELIS-001 | TELIS | LSP symbiosis for type/signature accuracy | Tools | done | runtime/tools/lsp.py, runtime/telis/context.py, runtime/telis/manager.py, runtime/engine.py, tests/test_runtime_telis_manager.py |
| REQ-TELIS-002 | TELIS | LSP fallback to shards on failure | Tools | done | runtime/telis/context.py, runtime/telis/manager.py, runtime/engine.py, tests/test_runtime_telis_manager.py |
| REQ-TELIS-003 | TELIS | Tiered knowledge shards with token budgets | TELIS | done | runtime/telis/shards.py, runtime/telis/manager.py, runtime/engine.py, tests/test_runtime_telis_manager.py |
| REQ-TELIS-004 | TELIS | Progressive context negotiation protocol | TELIS | done | runtime/telis/negotiation.py, runtime/telis/manager.py, runtime/engine.py, tests/test_runtime_telis_manager.py |
| REQ-TELIS-005 | TELIS | AST/type/lint validation pipeline | TELIS | done | runtime/tools/validation.py, runtime/engine.py, tests/test_runtime_validation.py, tests/test_runtime_engine.py |
| REQ-TELIS-006 | TELIS | Behavioral cache with TTL and invalidation | TELIS | done | runtime/telis/cache.py, runtime/telis/manager.py, runtime/engine.py, tests/test_runtime_telis_manager.py |
| REQ-TELIS-007 | TELIS | Validation failures trigger retry/escalation | TELIS | done | runtime/engine.py, tests/test_runtime_engine.py |
| REQ-QUINT-001 | QUINT | Evidence store for L0/L1/L2 and invalid | QUINT | done | runtime/quint/store.py, tests/test_runtime_quint_evidence.py |
| REQ-QUINT-002 | QUINT | ADI cycle with promotion rules | QUINT | done | runtime/quint/adi.py, tests/test_runtime_quint_adi.py |
| REQ-QUINT-003 | QUINT | WLNK assurance scoring | QUINT | done | runtime/quint/assurance.py, tests/test_runtime_quint_assurance.py |
| REQ-QUINT-004 | QUINT | Congruence scoring for external evidence | QUINT | done | runtime/quint/assurance.py, tests/test_runtime_quint_assurance.py |
| REQ-QUINT-005 | QUINT | Evidence decay and revalidation | QUINT | done | runtime/quint/decay.py, tests/test_runtime_quint_decay.py |
| REQ-QUINT-006 | QUINT | DRR generation for major decisions | QUINT | done | runtime/quint/drr.py, tests/test_runtime_quint_drr.py |
| REQ-QUINT-007 | QUINT | Surface vs grounding separation | QUINT | done | runtime/quint/drr.py, runtime/storage.py, tests/test_quint_surface_grounding.py |
| REQ-QUINT-008 | QUINT | Bounded context snapshot and drift detection | QUINT | done | runtime/quint/snapshot.py, runtime/quint/drift.py, runtime/engine.py, tests/test_quint_snapshot_drift.py |
| REQ-AGENT-001 | Practical Guide | Model/tool/instructions triad per agent | Runtime | done | runtime/agents.py, runtime/execution.py, tests/test_agents_registry.py |
| REQ-AGENT-002 | Practical Guide | Standardized tool definitions and reuse | Runtime | done | runtime/tools/registry.py, runtime/tools/pipeline.py, tests/test_tools_registry.py |
| REQ-AGENT-003 | Practical Guide | Tool risk ratings and safeguards | Runtime | done | runtime/tools/base.py, runtime/tools/pipeline.py, runtime/engine.py, runtime/guardrails/checks.py, tests/test_runtime_guardrails.py |
| REQ-AGENT-004 | Practical Guide | PII filter and data privacy guardrails | Runtime | done | runtime/guardrails/checks.py, runtime/engine.py, tests/test_runtime_guardrails.py, tests/test_runtime_engine.py |
| REQ-AGENT-005 | Practical Guide | Moderation filters for unsafe inputs | Runtime | done | runtime/guardrails/checks.py, runtime/engine.py, tests/test_runtime_guardrails.py |
| REQ-AGENT-006 | Practical Guide | Rules-based protections (blocklists/regex) | Runtime | done | runtime/guardrails/checks.py, runtime/engine.py, tests/test_runtime_guardrails.py |
| REQ-AGENT-007 | Practical Guide | HITL on high-risk actions and retry thresholds | Runtime | done | runtime/gates.py, runtime/engine.py, runtime/tools/pipeline.py, config/runtime.yaml, tests/test_runtime_gates.py |
| REQ-AGENT-008 | Practical Guide | Optimistic execution with concurrent guardrails | Runtime | done | runtime/guardrails/concurrent.py, runtime/engine.py, tests/test_guardrails_concurrent.py |
| REQ-SPEC-001 | Unified Spec | Event bus for workflow state transitions | Runtime | done | runtime/events/bus.py, runtime/events/handlers.py, runtime/engine.py, tests/test_events_bus.py |
| REQ-SPEC-002 | Unified Spec | State store for workflow progress and artifacts | Runtime | done | runtime/engine.py, runtime/storage.py, tests/test_runtime_artifacts_events.py |
| REQ-SPEC-003 | Unified Spec | TELIS policy engine and context manager | TELIS | done | runtime/telis/manager.py, runtime/engine.py, tests/test_runtime_telis_manager.py |
| REQ-SPEC-004 | Unified Spec | Evidence store for Quint claims and DRRs | QUINT | done | runtime/quint/store.py, runtime/quint/drr.py, tests/test_runtime_quint_evidence.py, tests/test_runtime_quint_drr.py |
| REQ-SPEC-005 | Unified Spec | Control plane/data plane split | Runtime | done | runtime/plugins/control_plane.py, runtime/plugins/data_plane.py, runtime/plugins/manager.py, tests/test_plugins_control_data.py |
| REQ-SPEC-006 | Unified Spec | Plugin pipeline for policy/adapters/observability | Runtime | done | runtime/plugins/implementations, runtime/logging/emitter.py, tests/test_plugins_control_data.py |
| REQ-SPEC-007 | Unified Spec | Failure isolation with bounded retries | Runtime | done | runtime/failure/isolation.py, runtime/tools/pipeline.py, runtime/engine.py, tests/test_failure_isolation.py |
| REQ-SPEC-008 | Unified Spec | Artifact index with checksum and provenance | Runtime | done | runtime/engine.py, runtime/storage.py, tests/test_runtime_artifacts_events.py |
| REQ-SPEC-009 | Unified Spec | Context fingerprint tracking | Runtime | done | runtime/quint/fingerprint.py, runtime/engine.py, tests/test_quint_fingerprint.py |
| REQ-SPEC-010 | Unified Spec | Gates recorded as DRRs with evidence links | Runtime | done | runtime/gates.py, runtime/quint/drr.py, runtime/engine.py, tests/test_gates_as_drrs.py |
| REQ-SPEC-011 | Unified Spec | Tool execution pipeline (registry, gating, results) | Runtime | done | runtime/tools/pipeline.py, runtime/engine.py, tests/test_runtime_tool_pipeline.py, tests/test_runtime_engine.py |

Workstreams and Steps

WS1: Governance and repo hygiene

  1. Confirm this plan as the source of truth and add a change-control note to README. (done: README.md)
  2. Align branch policy: development is primary, main is release only. (done: README.md)
  3. Ensure SemVer is enforced in VERSION and release notes. (done: VERSION, docs/versioning.md, CHANGELOG.md, tests/test_versioning.py)
  4. Enforce Conventional Commits via commit-msg hook. (done: .husky/commit-msg)
  5. Verify CI uses latest compatible actions and locks versions. (done: .github/workflows/ci.yml; verified 2025-12-25 via GitHub releases API)
  6. Require CHANGELOG.md update for non-doc changes via pre-commit gate. (done: .husky/pre-commit)

Deliverables:

  • docs/v1-plan.md
  • README links to plan
  • docs/unified-framework-documentation-index.md updated

WS2: Canonical runtime data model

  1. Define schemas for RunManifest, RunStep, ArtifactIndex, EvidenceLink, HumanGate. (done: runtime/schemas.py)
  2. Define event schema for run lifecycle and gate state. (done: runtime/schemas.py)
  3. Store all schema outputs in a stable JSON format. (done: docs/runtime-schemas.json)
  4. Add schema validation tests. (done: tests/test_runtime_schemas.py)

Deliverables:

  • runtime/models.py updates
  • tests for schema round trips
  • runtime/schemas.py
  • docs/runtime-schemas.json
  • tests/test_runtime_schemas.py
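
To illustrate what a schema round trip might look like for WS2, here is a minimal sketch using dataclasses. Only the class names RunManifest and RunStep come from this plan; every field name below is an assumption, and the real definitions in runtime/schemas.py may differ.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RunStep:
    step_id: str                      # hypothetical field
    status: str = "pending"           # hypothetical field
    outputs: list[str] = field(default_factory=list)


@dataclass
class RunManifest:
    run_id: str                       # hypothetical field
    workflow: str                     # hypothetical field
    steps: list[RunStep] = field(default_factory=list)

    def to_json(self) -> str:
        # sort_keys keeps the serialized form stable across runs.
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, raw: str) -> "RunManifest":
        data = json.loads(raw)
        steps = [RunStep(**s) for s in data.pop("steps", [])]
        return cls(steps=steps, **data)
```

A round-trip test then only needs to assert that `RunManifest.from_json(m.to_json()) == m` for a representative manifest.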

WS3: Mapping and registry pipeline

  1. Parse BMAD workflow definitions (md/yaml/xml) into canonical steps. (done: runtime/workflow_parser.py, tests/test_workflow_parser.py)
  2. Extract explicit outputs/templates only, exclude samples/references. (done: methodology/tools/mapping.py, tests/test_mapping.py)
  3. Generate registry and mapping files deterministically. (done: methodology/tools/mapping.py, methodology/tools/generate_mapping.py, tests/test_mapping.py)
  4. Add mapping accuracy tests for production workflows. (done: tests/test_mapping.py)

Deliverables:

  • methodology/mapping/registry-workflows.json
  • methodology/mapping/integration-mapping.json
  • tests/test_mapping.py
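
Deterministic generation (WS3, step 3) mostly comes down to fixing the ordering before serializing. The sketch below assumes each parsed workflow is a dict with hypothetical `name`, `outputs`, and `sample` keys; it is not the logic in methodology/tools/mapping.py.

```python
import json


def build_registry(workflows: list[dict]) -> str:
    """Serialize production workflows in a stable, input-order-independent way."""
    entries = [
        {"name": w["name"], "outputs": sorted(w.get("outputs", []))}
        for w in workflows
        if not w.get("sample", False)  # samples/references are excluded
    ]
    entries.sort(key=lambda e: e["name"])
    # sort_keys plus a trailing newline gives byte-identical files across runs.
    return json.dumps(entries, indent=2, sort_keys=True) + "\n"
```

Because the output depends only on the set of workflows, regenerating the registry on an unchanged tree should produce an empty diff.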

WS4: Workflow executor

  1. Build a state machine for step progression (pending, running, blocked, failed, completed).
  2. Implement step contract with inputs, outputs, tools, and validation hooks. (done: runtime/models.py, runtime/engine.py, runtime/tools/pipeline.py, docs/runtime-step-contract.md, tests/test_runtime_engine.py, tests/test_runtime_models.py)
  3. Persist run state after each step and on errors. (done: runtime/engine.py, runtime/storage.py, tests/test_runtime_engine.py)
  4. Implement resume semantics from the last incomplete step. (done: runtime/engine.py, tests/test_runtime_engine.py, docs/runtime-step-contract.md)
  5. Enforce step timeouts and bounded retries.

Deliverables:

  • runtime/engine.py execution path
  • runtime/execution.py step execution contract
  • docs/runtime-step-contract.md
  • tests/test_runtime_engine.py
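
The five step states listed in WS4 can be modelled as an enum with an explicit transition table. The state names come from this plan; the allowed transitions below are an assumption chosen for illustration and may not match runtime/engine.py.

```python
from enum import Enum


class StepState(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    BLOCKED = "blocked"
    FAILED = "failed"
    COMPLETED = "completed"


# Assumed transition table: blocked steps resume once a gate is approved,
# failed steps may be retried, completed steps are terminal.
ALLOWED = {
    StepState.PENDING: {StepState.RUNNING},
    StepState.RUNNING: {StepState.BLOCKED, StepState.FAILED, StepState.COMPLETED},
    StepState.BLOCKED: {StepState.RUNNING, StepState.FAILED},
    StepState.FAILED: {StepState.RUNNING},
    StepState.COMPLETED: set(),
}


def transition(current: StepState, target: StepState) -> StepState:
    """Return the new state, or raise if the move is not permitted."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Keeping the table as data rather than scattered conditionals makes it straightforward to test every legal and illegal pair.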

WS5: Tool adapter layer

  1. Define tool interface with risk rating (low, medium, high). (done: runtime/tools/base.py)
  2. Implement safe file IO and repo operations. (done: runtime/tools/file_io.py, runtime/tools/repo_tool.py)
  3. Implement LSP query adapters for supported languages. (done: runtime/tools/lsp.py)
  4. Implement AST and type validation runners. (done: runtime/tools/validation.py, tests/test_runtime_validation.py)
  5. Add an allow-list and gate high-risk tools behind HITL. (done: runtime/tools/pipeline.py)
  6. Define and implement the tool execution pipeline (registry, gating, results). (done: runtime/tools/pipeline.py, runtime/engine.py)

Deliverables:

  • runtime/tools/*
  • validation executors
  • tool risk policy
  • docs/tool-execution-pipeline.md
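
The allow-list and risk gating described in WS5 can be sketched as a single decision function. The three risk levels come from this plan; the `Tool` type, the `check_tool` name, and the returned strings are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class Tool:
    name: str
    risk: Risk


def check_tool(tool: Tool, allow_list: set[str], approved: bool) -> str:
    """Decide whether a tool call may run (hypothetical policy)."""
    if tool.name not in allow_list:
        return "denied"            # not on the allow-list at all
    if tool.risk is Risk.HIGH and not approved:
        return "needs_approval"    # high-risk tools wait for a human
    return "allowed"
```

For example, a high-risk tool that is on the allow-list yields "needs_approval" until an approval has been recorded for the call.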

WS6: TELIS context manager

  1. Implement shard registry, retrieval, and tier budget enforcement.
  2. Implement LSP symbiosis routing and compression.
  3. Implement progressive negotiation protocol.
  4. Implement behavioral cache with TTL and invalidation.
  5. Enforce validation gates before output acceptance. (done: runtime/engine.py, tests/test_runtime_engine.py)

Deliverables:

  • runtime/telis/*
  • validation gate hooks
  • tests for LSP/shard/cache
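
As an illustration of "cache with TTL and invalidation" from WS6, the sketch below shows one common shape for such a cache. It is a generic example, not the implementation in runtime/telis/cache.py; the class name and the injectable clock are assumptions that keep the example testable.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, Any]] = {}

    def put(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # lazily evict expired entries
            return None
        return value

    def invalidate(self, key: str) -> None:
        """Drop an entry explicitly, e.g. when its source file changes."""
        self._entries.pop(key, None)
```

Using a monotonic clock by default avoids surprises when the wall clock is adjusted mid-run.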

WS7: QUINT evidence engine

  1. Implement evidence store with L0, L1, L2 levels.
  2. Implement ADI promotion rules and invalidation handling.
  3. Implement WLNK and congruence scoring.
  4. Implement evidence decay and revalidation checks.
  5. Implement DRR generation and linking to decisions. (done: runtime/quint/drr.py, runtime/storage.py, tests/test_runtime_quint_drr.py)

Deliverables:

  • runtime/quint/*
  • DRR records stored per run
  • tests for evidence and DRR
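
The scoring steps in WS7 can be pictured with a small example. This sketch assumes that WLNK denotes a weakest-link rule (a claim is only as strong as its weakest piece of evidence) and that congruence acts as a multiplicative discount on external evidence. Both are assumptions about QUINT, and the function names are hypothetical.

```python
from datetime import datetime, timedelta


def effective_assurance(evidence: list[tuple[float, float]]) -> float:
    """Weakest-link score over (reliability, congruence) pairs in [0, 1].

    Assumption: congruence discounts reliability multiplicatively and the
    minimum across all evidence bounds the claim. No evidence scores 0.0.
    """
    return min((r * c for r, c in evidence), default=0.0)


def needs_revalidation(collected_at: datetime, max_age: timedelta,
                       now: datetime) -> bool:
    """Evidence older than max_age is treated as decayed (assumed rule)."""
    return now - collected_at > max_age
```

Under these assumptions, one strong internal test (0.9, 1.0) combined with a loosely matching external benchmark (0.8, 0.5) yields an overall score of 0.4.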

WS8: Provider routing and reliability

  1. Normalize provider request/response models. (done: runtime/providers/base.py, runtime/providers/registry.py, tests/test_runtime_providers.py)
  2. Implement retries, backoff, and timeouts. (done: runtime/providers/reliability.py, tests/test_runtime_provider_reliability.py)
  3. Add circuit-breaker logic and rate limit handling. (done: runtime/providers/reliability.py, runtime/providers/http.py, tests/test_runtime_http.py)
  4. Implement streaming support and chunked responses. (done: runtime/providers/registry.py, runtime/providers/http.py, tests/test_runtime_providers.py)
  5. Add provider-specific response parsing. (done: runtime/providers/registry.py, runtime/providers/http.py, tests/test_runtime_providers.py)

Deliverables:

  • runtime/providers/* improvements
  • tests for provider behavior
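
Retries with backoff (WS8, step 2) are commonly implemented as exponential backoff with jitter. The following is a generic sketch of that pattern, not the code in runtime/providers/reliability.py; the function name and default values are assumptions.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(fn: Callable[[], T], max_attempts: int = 3,
                      base_delay: float = 0.5, max_delay: float = 8.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying failures with capped exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            # Real code would likely catch only transient provider errors.
            if attempt == max_attempts:
                raise
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, ceiling))  # "full jitter" variant
    raise AssertionError("unreachable")
```

Bounding both the attempt count and the delay keeps a misbehaving provider from stalling a run indefinitely, which is also what makes a step timeout enforceable.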

WS9: HITL gate enforcement

  1. Define gate policy by phase and risk level. (done: runtime/gates.py, config/runtime.yaml, tests/test_runtime_gates.py)
  2. Enforce blocking gates at planning, architecture, release. (done: runtime/gates.py, runtime/engine.py, tests/test_runtime_engine.py)
  3. Enforce conditional gates for implementation review, tests, security risk. (done: runtime/gates.py, config/runtime.yaml, tests/test_runtime_gates.py)
  4. Add CLI approvals and audit logs. (done: cli/main.py, runtime/engine.py, runtime/storage.py, tests/test_runtime_engine.py)

Deliverables:

  • runtime/gates.py updates
  • CLI approve/resume flows
  • tests for gate consistency
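
The gate policy in WS9 can be reduced to two questions: does this phase or risk level require a gate, and has that gate been approved? A minimal sketch follows. The blocking phases come from this plan and HumanGate is named in WS2, but its fields and both function names are assumptions.

```python
from dataclasses import dataclass
from typing import Optional

# Phases that always block, per the v1.0 Definition of Done.
BLOCKING_PHASES = frozenset({"planning", "architecture", "release"})


@dataclass
class HumanGate:
    phase: str
    approved_by: Optional[str] = None  # hypothetical field

    @property
    def approved(self) -> bool:
        return self.approved_by is not None


def gate_required(phase: str, risk: str = "low") -> bool:
    """Blocking phases always need a gate; so do high-risk actions."""
    return phase in BLOCKING_PHASES or risk == "high"


def can_proceed(phase: str, risk: str, gate: Optional[HumanGate]) -> bool:
    if not gate_required(phase, risk):
        return True
    return gate is not None and gate.phase == phase and gate.approved
```

Recording who approved a gate, rather than a bare boolean, is what lets the approval double as an audit-log entry.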

WS10: Artifact index and audit trail

  1. Index every output and template with checksum.
  2. Link artifacts to step, evidence, and gate state.
  3. Emit run timeline JSON for auditability.

Deliverables:

  • runtime/storage.py enhancements
  • tests for artifact index
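
Indexing an artifact with a checksum (WS10, step 1) can be sketched with the standard library. SHA-256 is an assumption here, since the plan does not name a hash algorithm, and the record keys are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large artifacts are not loaded at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_artifact(path: Path, step_id: str) -> dict:
    """Build one index record linking an artifact to the step that made it."""
    return {
        "path": path.as_posix(),
        "step_id": step_id,
        "sha256": sha256_of(path),
        "size_bytes": path.stat().st_size,
    }
```

Re-hashing an artifact at audit time and comparing against the stored digest is enough to detect modification after the run.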

WS11: CLI and user experience

  1. Add commands: run, resume, approve, status, list, export.
  2. Add config validation and provider listing.
  3. Add run history summary.

Deliverables:

  • cli/main.py updates
  • CLI tests
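
The command set in WS11 maps naturally onto argparse subcommands. The command names below come from this plan; the program name, the positional arguments, and the parser layout are assumptions and may not match cli/main.py.

```python
import argparse

COMMANDS = ("run", "resume", "approve", "status", "list", "export")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="baqt")  # assumed program name
    sub = parser.add_subparsers(dest="command", required=True)
    for name in COMMANDS:
        cmd = sub.add_parser(name)
        if name == "run":
            cmd.add_argument("workflow")   # hypothetical argument
        elif name != "list":
            cmd.add_argument("run_id")     # hypothetical argument
    return parser
```

With this layout, `build_parser().parse_args(["approve", "run-42"])` yields a namespace whose `command` is "approve" and whose `run_id` is "run-42".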

WS12: Guardrails and safety

  1. Implement PII filter and output validation hooks.
  2. Implement moderation checks for high-risk content.
  3. Enforce tool risk safeguards with HITL.
  4. Add rules-based protections (blocklists, regex checks).

Deliverables:

  • runtime/guardrails/*
  • tests for guardrail triggers
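
Rules-based protections and a PII filter (WS12) often start as a handful of regular expressions. The sketch below is deliberately simplistic: the email pattern, the blocklist entry, and the function names are illustrative assumptions, not the checks in runtime/guardrails/checks.py.

```python
import re

# Simplified email pattern; real PII detection needs far broader coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical blocklist of substrings that should never reach a tool.
BLOCKLIST = ("rm -rf /",)


def redact_pii(text: str) -> str:
    """Replace anything that looks like an email address."""
    return EMAIL_RE.sub("[REDACTED_EMAIL]", text)


def violates_rules(text: str) -> bool:
    """Return True if the text contains a blocklisted substring."""
    return any(term in text for term in BLOCKLIST)
```

A guardrail hook could call `violates_rules` on tool inputs before execution and `redact_pii` on outputs before they are persisted.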

WS13: Observability

  1. Structured logs for run, step, gate, evidence, validation.
  2. Event emission for external tooling.
  3. Export run report artifacts.

Deliverables:

  • runtime/logging/*
  • run report JSON
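
Structured logs (WS13, step 1) can be produced with the standard logging module and a JSON formatter. This is a generic sketch rather than the emitter in runtime/logging/emitter.py; the `fields` attribute convention and the key names are assumptions.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "event": record.getMessage(),
            # Assumed convention: structured context is passed through
            # logging's `extra` mechanism under the key "fields".
            **getattr(record, "fields", {}),
        }
        return json.dumps(payload, sort_keys=True)
```

A caller would attach the formatter to a handler and log with `logger.info("step.completed", extra={"fields": {"run_id": "r1", "step_id": "s1"}})`, which gives external tooling a line-delimited JSON stream to consume.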

WS14: Test strategy and CI hardening

  1. End-to-end test for at least one BMAD workflow.
  2. Integration tests for TELIS and QUINT gates.
  3. Provider contract tests (mocked).
  4. Coverage gate >= 85 percent.
  5. CI passes on development with submodules.

Deliverables:

  • tests/* integration suite
  • CI green

WS15: Distribution and installer integration

  1. Package a full framework install through the BMAD installer (npx bmad-method@alpha install).
  2. Ensure QUINT and BAQT assets are included alongside BMAD slash commands.
  3. Add an installation verification test or checklist.

Deliverables:

  • Installer integration notes
  • Install validation check

WS16: Release readiness

  1. Final documentation sweep and index update.
  2. Version bump to 1.0.0 and changelog.
  3. Tag release on main after development is green.
  4. Release checklist sign-off.

Deliverables:

  • VERSION updated
  • release notes
  • tags on main

Master Checklist (v1.0)

  • Confirm plan is the source of truth and update README link
  • Complete runtime data model and schemas
  • Implement BMAD workflow parsing and canonical registry
  • Build step execution state machine
  • Implement tool adapters, execution pipeline, and risk policy
  • Implement TELIS LSP, shards, negotiation, cache, and validation gates (LSP adapter done)
  • Implement QUINT evidence engine, WLNK, congruence, decay, and DRR
  • Harden provider routing with retries and circuit breakers
  • Enforce HITL gating with CLI approvals
  • Add artifact indexing and run timeline
  • Expand CLI (list, export, history)
  • Implement guardrails (PII, moderation, rules)
  • Add observability and run reports
  • Integration tests for at least one BMAD workflow
  • CI green with coverage >= 85 percent
  • Package full framework installer (BMAD + QUINT + BAQT)
  • Release docs and tag v1.0.0 on main

Verification and Gap Audit

We will not claim full coverage until the traceability audit is complete. The current audit lives at docs/traceability-audit.md and must be updated after each workstream.

Required checks:

  • Cross-check all BMAD workflows and outputs against the registry.
  • Map TELIS requirements to concrete runtime modules and tests.
  • Map QUINT requirements to evidence, DRR, and audit outputs.
  • Map Practical Guide guardrails to runtime guardrail implementations.

The traceability audit must be marked done before the v1.0 release.