System architecture for contributors and advanced users. For user-facing documentation, see Feature Reference or User Guide.
- System Overview
- Design Principles
- Component Architecture
- Agent Model
- Data Flow
- File System Layout
- Installer Architecture
- Hook System
- CLI Tools Layer
- Runtime Abstraction
GSD Core is a meta-prompting framework that sits between the user and AI coding agents (Claude Code, Gemini CLI, Kimi CLI, OpenCode, Kilo, Codex, Copilot, Antigravity, Trae, Cline, Augment Code). It provides:
- Context engineering: Structured artifacts that give the AI everything it needs per task (see Context engineering)
- Multi-agent orchestration: Thin orchestrators that spawn specialized agents with fresh context windows (see Multi-agent orchestration)
- Spec-driven development: Requirements → research → plans → execution → verification pipeline
- State management: Persistent project memory across sessions and context resets
┌──────────────────────────────────────────────────────┐
│                         USER                         │
│                 /gsd-command [args]                  │
└───────────────────────────┬──────────────────────────┘
                            │
┌───────────────────────────▼──────────────────────────┐
│                    COMMAND LAYER                     │
│  commands/gsd/*.md: Prompt-based command files       │
│  (Claude Code custom commands / Codex skills)        │
└───────────────────────────┬──────────────────────────┘
                            │
┌───────────────────────────▼──────────────────────────┐
│                    WORKFLOW LAYER                    │
│  gsd-core/workflows/*.md: Orchestration logic        │
│  (Reads references, spawns agents, manages state)    │
└───────┬───────────────────┬───────────────────┬──────┘
        │                   │                   │
┌───────▼──────┐    ┌───────▼──────┐    ┌───────▼──────┐
│    AGENT     │    │    AGENT     │    │    AGENT     │
│   (fresh     │    │   (fresh     │    │   (fresh     │
│   context)   │    │   context)   │    │   context)   │
└───────┬──────┘    └───────┬──────┘    └───────┬──────┘
        │                   │                   │
┌───────▼───────────────────▼───────────────────▼──────┐
│                   CLI TOOLS LAYER                    │
│  gsd-tools.cjs command families + domain modules     │
│  command-routing-hub + observability seams           │
└───────────────────────────┬──────────────────────────┘
                            │
┌───────────────────────────▼──────────────────────────┐
│               FILE SYSTEM (.planning/)               │
│  PROJECT.md | REQUIREMENTS.md | ROADMAP.md           │
│  STATE.md | config.json | phases/ | research/        │
└──────────────────────────────────────────────────────┘
Every agent spawned by an orchestrator gets a clean context window (up to 200K tokens). This eliminates context rot, the quality degradation that happens as an AI fills its context window with accumulated conversation.
Workflow files (gsd-core/workflows/*.md) never do heavy lifting. They:
- Load context via `gsd-tools.cjs init <workflow>`
- Spawn specialized agents with focused prompts
- Collect results and route to the next step
- Update state between steps
All state lives in .planning/ as human-readable Markdown and JSON. No database, no server, no external dependencies. This means:
- State survives context resets (`/clear`)
- State is inspectable by both humans and agents
- State can be committed to git for team visibility
Workflow feature flags follow the absent = enabled pattern. If a key is missing from config.json, it defaults to true. Users explicitly disable features; they don't need to enable defaults.
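A minimal sketch of the absent = enabled lookup, assuming flags sit under a named section of the parsed config.json. The helper name `isEnabled` and the `workflow` / `research` names in the usage note are illustrative assumptions, not the actual schema.

```javascript
// Hypothetical helper: a flag is on unless config.json explicitly disables it.
function isEnabled(config, section, key) {
  const value = config?.[section]?.[key];
  // Missing section or missing key: fall back to the default, which is true.
  if (value === undefined) return true;
  return Boolean(value);
}
```

With this shape, `isEnabled({}, 'workflow', 'research')` is true, and only an explicit `{ workflow: { research: false } }` turns the feature off.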
Multiple layers prevent common failure modes:
- Plans are verified before execution (plan-checker agent)
- Execution produces atomic commits per task
- Post-execution verification checks against phase goals
- UAT provides human verification as final gate
User-facing entry points. Each file contains YAML frontmatter (name, description, allowed-tools) and a prompt body that bootstraps the workflow. Commands are installed as:
- Claude Code: Custom slash commands (hyphen form, `/gsd-command-name`)
- OpenCode / Kilo: Slash commands (hyphen form, `/gsd-command-name`)
- Codex: Skills (`$gsd-command-name`)
- Copilot: Slash commands (hyphen form, `/gsd-command-name`)
- Gemini CLI: Slash commands under the `gsd:` namespace (colon form, `/gsd:command-name`). Gemini namespaces all custom commands under their plugin id, so the install path rewrites every body-text reference to colon form
- Kimi CLI: Agent Skills (`/skill:gsd-command-name`) plus an explicit custom agent launch with `kimi --agent-file`
- Antigravity: Skills
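The Gemini body-text rewrite can be pictured as a single regex pass. This is a sketch under assumptions: the function name `toGeminiColonForm` and the exact matching rule are hypothetical, and the real installer may handle more cases.

```javascript
// Hypothetical helper: rewrite hyphen-form command references to colon form.
function toGeminiColonForm(body) {
  // The lookbehind skips "/gsd-" when it is part of a path or longer token
  // (for example "~/.claude/gsd-core" or "commands/gsd-foo").
  return body.replace(/(?<![\w/.-])\/gsd-([a-z0-9][a-z0-9-]*)/g, '/gsd:$1');
}
```

For example, `Run /gsd-plan-phase first` becomes `Run /gsd:plan-phase first`, while `~/.claude/gsd-core/workflows` is left untouched.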
Total commands: see docs/INVENTORY.md for the authoritative count and full roster.
Two-stage hierarchical routing (v1.40, #2792)
To keep the eager skill-listing token cost low, v1.40 introduces six namespace meta-skills (gsd-workflow, gsd-project, gsd-quality, gsd-context, gsd-manage, gsd-ideate; sourced from commands/gsd/ns-*.md, but the invocable name: is the bare form shown here) layered above the concrete sub-skills. On runtimes with non-recursive skill loaders (claude global, cline, qwen, hermes, augment, trae, antigravity) the installer now realizes this fully: it emits only the 6 namespace router bundles as top-level skills and nests the ~61 concrete skills under <router>/skills/<name>/SKILL.md, so the eager listing is ≈6 entries instead of ≈67. The model selects a namespace router, which instructs it to read the nested concrete skill file via a routing table embedded in the router body. On these runtimes concrete skills are not directly invocable by bare name via the Skill tool; they are reachable through the router. Slash commands (/gsd-*, via the separate commands surface) are unaffected where the runtime has one. On runtimes with recursive or unconfirmed skill loaders (cursor, codex, copilot, windsurf, codebuddy, opencode, kilo) the layout remains flat, with all skills emitted at the top level as before.
The router descriptions use pipe-separated keyword tags (≤ 60 chars) per the Tool Attention research showing keyword-dense tags outperform prose for routing at ~40% of the token cost.
The eager skill listing is one of two recurring per-turn token costs. The other is the MCP tool schema injected by every enabled MCP server in .claude/settings.json. Heavyweight MCP servers (browser/playwright, Mac-tools, Windows-tools) can each cost 20k+ tokens per turn, often dwarfing what model_profile tuning saves. The toggle lives in the Claude Code harness (enabledMcpjsonServers / disabledMcpjsonServers in .claude/settings.json) and is not a GSD concern. Together, the two-stage routing layer (#2792) and disciplined MCP enablement are the largest cost levers per turn. See docs/USER-GUIDE.md and references/context-budget.md for the audit checklist.
Orchestration logic that commands reference. Contains the step-by-step process including:
- Context loading via `gsd-tools.cjs init` handlers
- Agent spawn instructions with model resolution
- Gate/checkpoint definitions
- State update patterns
- Error handling and recovery
Total workflows: see docs/INVENTORY.md for the authoritative count and full roster.
Workflow files are loaded verbatim into Claude's context every time the
corresponding /gsd-* command is invoked. The workflow size budget enforced by
tests/workflow-size-budget.test.cjs keeps each file bounded, mirroring the
agent budget from #2361. The budget is measured in bytes (#717), not lines:
line count over-penalizes prose and under-catches token-dense tables and code
blocks, whereas bytes are deterministic and match the unit our vendors bound on:
Codex truncates instruction docs past 32,768 bytes (project_doc_max_bytes).
We adopt that unit, not that exact number: the XL/LARGE ceilings below sit above
32,768 because these are grandfathered top-level orchestrators loaded by Claude,
not Codex AGENTS.md docs.
| Tier | Per-file byte limit |
|---|---|
| XL | 90,000 for top-level orchestrators (execute-phase, plan-phase, new-project) |
| LARGE | 54,000 for multi-step planners and large feature workflows |
| DEFAULT | 38,000 for focused single-purpose workflows (the target tier) |
Ceilings are not fixed forever: under the tighten-only ratchet (#597) each one tracks its tier's current high-water mark within a small grace band, so budgets may only decrease over time.
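A byte-based check against the tier ceilings might look like the sketch below. It is not the contents of tests/workflow-size-budget.test.cjs; the function name and the way a file is assigned to a tier are assumptions, and the ratchet would lower these numbers over time.

```javascript
// Tier ceilings as listed above, in bytes.
const TIER_LIMITS = { XL: 90_000, LARGE: 54_000, DEFAULT: 38_000 };

// Hypothetical helper: measure a workflow body in UTF-8 bytes and compare it
// with the ceiling for its tier.
function checkWorkflowBudget(text, tier = 'DEFAULT') {
  const limit = TIER_LIMITS[tier];
  if (limit === undefined) throw new Error(`unknown tier: ${tier}`);
  const bytes = Buffer.byteLength(text, 'utf8'); // bytes, not characters or lines
  return { bytes, limit, ok: bytes <= limit };
}
```

Measuring with `Buffer.byteLength` rather than `text.length` matters for multi-byte content: 13,000 arrow characters are 39,000 bytes and would exceed the DEFAULT ceiling despite the low character count.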
Why the budget exists. With prompt caching the per-invocation cost of a large workflow is modest (cache reads run ~10% of input). The stronger, caching-independent reason is quality: as context grows, recall and reasoning degrade ("context rot" / attention budget), so leaner, higher-signal instructions produce better plans. The ceiling protects the agent's attention, not just the token bill.
Because the budget measures one file, it is a proxy for the real goal:
bounded loaded context. Extraction only helps when the extracted content is
loaded lazily (Read at the step that needs it). Moving prose into a file
that is still eagerly @-imported shrinks the measured file without shrinking
loaded context, which games the proxy rather than serving the goal.
workflows/discuss-phase.md is held to a stricter <30,000-byte ceiling per
issue #2551 (originally <500 lines; re-based to bytes for #717). When a workflow grows
beyond its tier, extract per-mode bodies into
workflows/<workflow>/modes/<mode>.md, templates into
workflows/<workflow>/templates/, and shared knowledge into
gsd-core/references/. The parent file becomes a thin dispatcher that
Reads only the mode and template files needed for the current invocation.
workflows/discuss-phase/ is the canonical example of this pattern:
parent dispatches, modes/ holds per-flag behavior (power.md, all.md,
auto.md, chain.md, text.md, batch.md, analyze.md, default.md,
advisor.md), and templates/ holds CONTEXT.md, DISCUSSION-LOG.md, and
checkpoint.json schemas that are read only when the corresponding output
file is being written.
workflows/plan-phase.md, workflows/execute-phase.md, and the
gsd-planner / gsd-executor agent definitions apply the same discipline
to their MVP-only reference bodies: planner-mvp-mode.md,
user-story-template.md, skeleton-template.md, and execute-mvp-tdd.md
are referenced for the planner/executor to Read only on MVP,
Walking-Skeleton, or MVP+TDD paths, rather than eagerly @-imported, so
non-MVP runs do not pay their context cost (guards against the "@-import
behind a conditional still loads eagerly" leak; see #720). The dedicated
mvp-phase workflow keeps its eager imports, since it is always MVP.
Specialized agent definitions with frontmatter specifying:
- `name`: Agent identifier
- `description`: Role and purpose
- `tools`: Allowed tool access (Read, Write, Edit, Bash, Grep, Glob, WebSearch, etc.)
- `color`: Terminal output color for visual distinction
Total agents: 33
Shared knowledge documents that workflows and agents @-reference (see docs/INVENTORY.md for the authoritative full roster):
Core references:
- `checkpoints.md`: Checkpoint type definitions and interaction patterns
- `gates.md`: 4 canonical gate types (Confirm, Quality, Safety, Transition) wired into plan-checker and verifier
- `model-profiles.md`: Per-agent model tier assignments
- `model-profile-resolution.md`: Model resolution algorithm documentation
- `verification-patterns.md`: How to verify different artifact types
- `verification-overrides.md`: Per-artifact verification override rules
- `planning-config.md`: Full config schema and behavior
- `git-integration.md`: Git commit, branching, and history patterns
- `git-planning-commit.md`: Planning directory commit conventions
- `questioning.md`: Dream extraction philosophy for project initialization
- `tdd.md`: Test-driven development integration patterns
- `ui-brand.md`: Visual output formatting patterns
- `common-bug-patterns.md`: Common bug patterns for code review and verification
Workflow references:
- `agent-contracts.md`: Formal interface between orchestrators and agents
- `context-budget.md`: Context window budget allocation rules
- `continuation-format.md`: Session continuation/resume format
- `domain-probes.md`: Domain-specific probing questions for discuss-phase
- `gate-prompts.md`: Gate/checkpoint prompt templates
- `revision-loop.md`: Plan revision iteration patterns
- `universal-anti-patterns.md`: Common anti-patterns to detect and avoid
- `artifact-types.md`: Planning artifact type definitions
- `phase-argument-parsing.md`: Phase argument parsing conventions
- `decimal-phase-calculation.md`: Decimal sub-phase numbering rules
- `workstream-flag.md`: Workstream active pointer conventions
- `user-profiling.md`: User behavioral profiling methodology
- `thinking-partner.md`: Conditional thinking partner activation at decision points
Thinking model references:
References for integrating thinking-class models (o3, o4-mini, Gemini 2.5 Pro) into GSD workflows:
- `thinking-models-debug.md`: Thinking model patterns for debugging workflows
- `thinking-models-execution.md`: Thinking model patterns for execution agents
- `thinking-models-planning.md`: Thinking model patterns for planning agents
- `thinking-models-research.md`: Thinking model patterns for research agents
- `thinking-models-verification.md`: Thinking model patterns for verification agents
Modular planner decomposition:
The planner agent (agents/gsd-planner.md) was decomposed from a single monolithic file into a core agent plus reference modules to stay under the 50K character limit imposed by some runtimes:
- `planner-gap-closure.md`: Gap closure mode behavior (reads VERIFICATION.md, targeted replanning)
- `planner-reviews.md`: Cross-AI review integration (reads REVIEWS.md from `/gsd-review`)
- `planner-revision.md`: Plan revision patterns for iterative refinement
Markdown templates for all planning artifacts. Used by gsd-tools.cjs template fill / phase.scaffold (and top-level scaffold) to create pre-structured files:
- `project.md`, `requirements.md`, `roadmap.md`, `state.md`: Core project files
- `phase-prompt.md`: Phase execution prompt template
- `summary.md` (+ `summary-minimal.md`, `summary-standard.md`, `summary-complex.md`): Granularity-aware summary templates
- `DEBUG.md`: Debug session tracking template
- `UI-SPEC.md`, `UAT.md`, `VALIDATION.md`: Specialized verification templates
- `discussion-log.md`: Discussion audit trail template
- `codebase/`: Brownfield mapping templates (stack, architecture, conventions, concerns, structure, testing, integrations)
- `research-project/`: Research output templates (SUMMARY, STACK, FEATURES, ARCHITECTURE, PITFALLS)
Runtime hooks that integrate with the host AI agent:
| Hook | Event | Purpose |
|---|---|---|
| `gsd-statusline.js` | `statusLine` | Displays model, task, directory, and context usage bar |
| `gsd-context-monitor.js` | `PostToolUse` / `AfterTool` | Injects agent-facing context warnings at 35%/25% remaining |
| `gsd-check-update.js` | `SessionStart` | Foreground trigger for the background update check |
| `gsd-ensure-canonical-path.js` | `SessionStart` | For Claude Code plugin installs, symlinks ~/.claude/gsd-core/{bin,contexts,references,templates,workflows} to the plugin's bundled tree so @~/.claude/gsd-core/... includes resolve; runs first in SessionStart, no-op in classic installs, self-heals after claude plugin update (#997) |
| `gsd-check-update-worker.js` | (helper) | Background worker spawned by gsd-check-update.js; no direct event registration |
| `gsd-prompt-guard.js` | `PreToolUse` | Scans .planning/ writes for prompt injection patterns (advisory) |
| `gsd-read-injection-scanner.js` | `PostToolUse` | Scans Read tool output for injected instructions in untrusted content |
| `gsd-workflow-guard.js` | `PreToolUse` | Detects file edits outside GSD workflow context (advisory, opt-in via hooks.workflow_guard) |
| `gsd-read-guard.js` | `PreToolUse` | Advisory guard preventing Edit/Write on files not yet read in the session |
| `gsd-session-state.sh` | `PostToolUse` | Session state tracking for shell-based runtimes |
| `gsd-validate-commit.sh` | `PostToolUse` | Commit validation for conventional commit enforcement |
| `gsd-phase-boundary.sh` | `PostToolUse` | Phase boundary detection for workflow transitions |
See docs/INVENTORY.md for the authoritative hook roster.
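As an illustration of the 35%/25% thresholds used by the context monitor, the sketch below classifies remaining context. The function name and the level labels `warning` and `critical` are assumptions; only the two thresholds come from the table.

```javascript
// Hypothetical classifier for the context monitor's two thresholds.
function contextWarningLevel(usedTokens, windowTokens) {
  const remaining = 1 - usedTokens / windowTokens;
  if (remaining <= 0.25) return 'critical'; // 25% or less remaining
  if (remaining <= 0.35) return 'warning';  // 35% or less remaining
  return null;                              // no warning injected
}
```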
CJS command family routers dispatch through CommandRoutingHub. The hub owns the no-throw pure-result contract (hub.dispatch() catches internal exceptions and returns { ok: false, kind, ...typedPayload }) and the closed runtime error taxonomy (UnknownCommand, InvalidArgs, HandlerRefusal, HandlerFailure). Router adapters remain thin CLI translators: they build the hub, call dispatch, then map the Result to output()/error() calls. The runtime is single-path (no dual-runtime mode selection). See docs/adr/0174-retire-gsd-sdk-package-boundary.md.
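The no-throw contract can be sketched as follows. The class body, constructor shape, and handler signature here are assumptions made for illustration; only the `{ ok, kind }` envelope and the error kind names come from the description above, and HandlerRefusal is left out for brevity.

```javascript
// Illustrative stand-in for the hub, not the actual implementation.
class CommandRoutingHub {
  constructor(handlers) {
    // A Map avoids accidental hits on inherited keys such as "constructor".
    this.handlers = new Map(Object.entries(handlers));
  }

  dispatch(command, args = []) {
    const handler = this.handlers.get(command);
    if (typeof handler !== 'function') {
      return { ok: false, kind: 'UnknownCommand', command };
    }
    if (!Array.isArray(args)) {
      return { ok: false, kind: 'InvalidArgs', command };
    }
    try {
      return { ok: true, value: handler(args) };
    } catch (err) {
      // Internal exceptions never escape; they become a typed failure result.
      return { ok: false, kind: 'HandlerFailure', command, message: String(err && err.message) };
    }
  }
}
```

A router adapter built on this shape only needs to branch on `result.ok` and pass the value or the typed failure to its output and error helpers.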
The Research Module implements an L2-hybrid seam: code owns the cache, provider policy, and package legitimacy verdicts; MCP owns the actual network fetch.
Three compiled modules (generated to gsd-core/bin/lib/*.cjs per ADR-457) are reachable via gsd-tools query research-plan | research-store | package-legitimacy:
- Research Store: content-addressed cache (`sha256(ecosystem+library+version+query+kind)`) with per-source TTL (curated-doc: 30 d, medium: 7 d, web/synthesis: 1 d) and two storage tiers: `~/.gsd/research-cache` for cross-project curated-doc hits, `.planning/research/.cache` for project-local web/synthesis results.
- Research Provider: single `PROVIDER_WATERFALL` (Context7 → Ref → Jina → websearch for docs; Exa → Tavily → Perplexity → Brave → websearch for web; Firecrawl → Jina for scrape-only). `planResearch()` returns cache hits plus a fetch plan; `classifyConfidence()` stamps `HIGH|MEDIUM|LOW` by provider tier.
- Package Legitimacy: registry-API verdicts (npm/PyPI/crates.io injectable adapters) producing `OK|SUS|SLOP` per package. `slopcheck` is an optional escalate-only adapter; absence leaves registry verdicts intact rather than downgrading everything to `[ASSUMED]`.
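The content-addressed key and per-source TTL of the Research Store could be sketched like this. The function names, the entry shape (`source`, `storedAt`), and the NUL field separator are assumptions; the real module may encode the hashed fields differently.

```javascript
const crypto = require('node:crypto');

// TTLs in days, as listed for the Research Store.
const TTL_DAYS = { 'curated-doc': 30, medium: 7, web: 1, synthesis: 1 };

// Assumption: fields are joined with a NUL separator before hashing so that
// ("a", "bc") and ("ab", "c") cannot produce the same key.
function cacheKey({ ecosystem, library, version, query, kind }) {
  return crypto
    .createHash('sha256')
    .update([ecosystem, library, version, query, kind].join('\0'))
    .digest('hex');
}

// Hypothetical freshness check: entries from unknown sources count as stale.
function isFresh(entry, now = Date.now()) {
  const days = TTL_DAYS[entry.source];
  if (days === undefined) return false;
  return now - entry.storedAt < days * 24 * 60 * 60 * 1000;
}
```

Because the key depends only on the request fields, the same lookup from two projects resolves to the same entry, which is what makes the cross-project tier possible.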
Data flow:
agent
  │
  ▼
gsd-tools query research-plan → Research Provider: check cache, build fetch plan
  │
  ├── [cache hits] ──────────────────► RESEARCH.md (digest only, no raw content)
  │
  └── [fetch plan] ──────────────────► MCP fetch (agent calls MCP tools with the plan)
                                           │
                                           ▼
                                   gsd-tools query research-store (put)
                                           │
                                           ▼
                                   RESEARCH.md path returned to orchestrator
Agents always return a RESEARCH.md path, never raw fetched content. Context discipline is enforced through subagent isolation, compact provider output, and fetch-to-disk. See ADR-0656.
Node.js CLI utility (gsd-tools.cjs) with domain modules split across gsd-core/bin/lib/ (see docs/INVENTORY.md for the authoritative roster):
| Module | Responsibility |
|---|---|
| `config-loader.cjs` | Project config loading: defaults merge, legacy-key migration, workstream overlay, unknown-key/profile-override validation, and federated config overlay (ADR-857 phase 3b) (extracted from core.cjs, ADR-857) |
| `federated-config.cjs` | Defensive merge of capability-declared config slices (ADR-857 phase 3b); exports mergeFederatedConfig; live for migrated Capability keys that are absent from the central config schema |
| `core-utils.cjs` | Shared low-level utility primitives: POSIX path normalization, sub-repo/subdirectory scanning, phase file stats, slug/one-liner/plan-id helpers, time-ago (extracted from core.cjs, ADR-857) |
| `core.cjs` | Shared utilities; compatibility re-exports for planning, I/O (io.cjs), and phase-id helpers |
| `io.cjs` | CLI I/O primitives: output/error emission, JSON-error mode, large-payload temp-file spillover |
| `phase-id.cjs` | Pure phase-id parsing/matching helpers: normalize, token match, regex builders (extracted from core.cjs, ADR-857) |
| `phase-locator.cjs` | Phase-directory search and location: active-phase discovery (searchPhaseInDir, findPhaseInternal) and archived-phase-dir enumeration (getArchivedPhaseDirs), matching phase ids/tokens against the filesystem (extracted from core.cjs, ADR-857) |
| `roadmap-parser.cjs` | ROADMAP.md parsing: milestone slicing, current-milestone extraction, phase/milestone lookups, milestone-phase filter (extracted from core.cjs, ADR-857) |
| `planning-workspace.cjs` | Planning seam (planningDir, planningPaths, active workstream routing, .planning/.lock) |
| `state.cjs` | STATE.md parsing, updating, progression, metrics |
| `phase.cjs` | Phase directory operations, decimal numbering, plan indexing |
| `roadmap.cjs` | ROADMAP.md parsing, phase extraction, plan progress |
| `config.cjs` | config.json read/write, section initialization |
| `verify.cjs` | Plan structure, phase completeness, reference, commit validation |
| `template.cjs` | Template selection and filling with variable substitution |
| `frontmatter.cjs` | YAML frontmatter CRUD operations |
| `init.cjs` | Compound context loading for each workflow type |
| `milestone.cjs` | Milestone archival, requirements marking |
| `commands.cjs` | Misc commands (slug, timestamp, todos, scaffolding, stats) |
| `model-profiles.cjs` | Model profile resolution table |
| `model-resolver.cjs` | Model and effort resolution policy: resolves model, tier, granularity, effort, and fast-mode for a given agent from project config and model profiles/catalog (extracted from core.cjs, ADR-857) |
| `security.cjs` | Path traversal prevention, prompt injection detection, safe JSON parsing, shell argument validation |
| `uat.cjs` | UAT file parsing, verification debt tracking, audit-uat support |
| `docs.cjs` | Docs-update workflow init, Markdown scanning, monorepo detection |
| `workstream.cjs` | Workstream CRUD, migration, session-scoped active pointer |
| `schema-detect.cjs` | Schema-drift detection for ORM patterns (Prisma, Drizzle, etc.) |
| `profile-pipeline.cjs` | User behavioral profiling data pipeline, session file scanning |
| `profile-output.cjs` | Profile rendering, USER-PROFILE.md and dev-preferences.md generation |
| `loop-host-contract.cjs` | Generated Loop Host Contract: 12 loop points, per-step agent roles, and core artifacts; emitted by scripts/gen-loop-host-contract.cjs from workflow markers (ADR-894 §3); consumed by gen-capability-registry.cjs |
| `capability-registry.cjs` | Generated central Capability Registry: role-partitioned index of all co-located capability declarations; emitted by scripts/gen-capability-registry.cjs (ADR-894 §5) |
| `loop-resolver.cjs` | Loop Extension Point resolver: ADR-857 phase 3c registry-consuming query; consumes resolved Capability State, filters byLoopPoint by capability enablement plus config activation, renders active hooks as markdown, emits { point, activeHooks, rendered } envelope; gsd-tools loop render-hooks <point> [--config-dir <path>] |
| `capability-state.cjs` | Unified capability-state resolver: ADR-857 phase 4b/6; composes install profile, runtime surface, and config activation into one per-capability view consumed by workflow hook rendering; pure resolveCapabilityState, reusable resolveCapabilityRuntimeState, I/O cmdCapabilityState, and convenience predicate isCapabilityActive(capId, cwd); gsd-tools capability state [--config-dir <path>] emits { runtimeConfigDir, capabilities[] } where each entry carries enabled (installed && surfaced) and active (enabled && configActivation via the capability's activationKey; absent key → active===enabled) |
| `graphify-command-router.cjs` | ADR-959 capability command router: first real capability command cutover (phase 4d-impl-2); extracted from the case 'graphify': arm in gsd-tools.cjs; dispatches build/query/status/diff subcommands; discovered via commandFamilies in the capability registry |
| `audit-command-router.cjs` | ADR-959 capability command router (phase 4d-impl-3); extracted from the case 'audit-uat': and case 'audit-open': arms in gsd-tools.cjs; routeAuditUat → uat.cjs:cmdAuditUat, routeAuditOpen → audit.cjs:{auditOpenArtifacts,formatAuditReport}; discovered via commandFamilies in the capability registry |
| `intel-command-router.cjs` | ADR-959 capability command router (phase 4d-impl-4, last first-party cutover); extracted from the case 'intel': arm in gsd-tools.cjs; routeIntelCommand → all 9 intel subcommands via lazy require('./intel.cjs'); preserves non-raw timeAgo transform on status.files[*].updated_at; discovered via commandFamilies in the capability registry |
| `runtime-hooks-surface.cjs` | Standalone hook-surface writer module (ADR-857 phase 5f-1); owns Cline rules/agents-md/pre-tool-use hook generation, Cursor hooks.json reconciliation, Copilot session-hook config, and Codex hook-block management; extracted verbatim from bin/install.js with no logic change. |
Orchestrator (workflow .md)
    │
    ├── Load context: gsd-tools.cjs init <workflow> <phase>
    │     Returns JSON with: project info, config, state, phase details
    │
    ├── Resolve model: gsd-tools.cjs resolve-model <agent-name>
    │     Returns: opus | sonnet | haiku | inherit
    │
    ├── Spawn Agent (Task/SubAgent call)
    │     ├── Agent prompt (agents/*.md)
    │     ├── Context payload (init JSON)
    │     ├── Model assignment
    │     └── Tool permissions
    │
    ├── Collect result
    │
    └── Update state: gsd-tools.cjs state update / state patch / state advance-plan
Conceptual spawn-pattern taxonomy for the primary agents. For the authoritative agent roster (including the advanced/specialized agents such as gsd-pattern-mapper, gsd-code-reviewer, gsd-code-fixer, gsd-ai-researcher, gsd-domain-researcher, gsd-eval-planner, gsd-eval-auditor, gsd-framework-selector, gsd-debug-session-manager, gsd-intel-updater), see docs/INVENTORY.md.
| Category | Agents | Parallelism |
|---|---|---|
| Researchers | gsd-project-researcher, gsd-phase-researcher, gsd-ui-researcher, gsd-advisor-researcher | 4 parallel (stack, features, architecture, pitfalls); advisor spawns during discuss-phase |
| Synthesizers | gsd-research-synthesizer | Sequential (after researchers complete) |
| Planners | gsd-planner, gsd-roadmapper | Sequential |
| Checkers | gsd-plan-checker, gsd-integration-checker, gsd-ui-checker, gsd-nyquist-auditor | Sequential (verification loop, max 3 iterations) |
| Executors | gsd-executor | Parallel within waves, sequential across waves |
| Verifiers | gsd-verifier | Sequential (after all executors complete) |
| Mappers | gsd-codebase-mapper | 4 parallel (tech, arch, quality, concerns) |
| Debuggers | gsd-debugger | Sequential (interactive) |
| Auditors | gsd-ui-auditor, gsd-security-auditor | Sequential |
| Doc Writers | gsd-doc-writer, gsd-doc-verifier | Sequential (writer then verifier) |
| Profilers | gsd-user-profiler | Sequential |
| Analyzers | gsd-assumptions-analyzer | Sequential (during discuss-phase) |
During execute-phase, plans are grouped into dependency waves:
Wave Analysis:
  Plan 01 (no deps)        ─┐
  Plan 02 (no deps)        ─┤── Wave 1 (parallel)
  Plan 03 (depends: 01)    ─┤── Wave 2 (waits for Wave 1)
  Plan 04 (depends: 02)    ─┘
  Plan 05 (depends: 03,04) ─── Wave 3 (waits for Wave 2)
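The wave grouping shown above amounts to layering a dependency graph. A minimal sketch, assuming each plan is an object with an `id` and a `deps` array (a hypothetical shape, not the actual plan index format):

```javascript
// Group plans into waves: a plan joins the first wave in which every one of
// its dependencies has already been placed in an earlier wave.
function groupIntoWaves(plans) {
  const placed = new Set();
  const waves = [];
  let remaining = plans.slice();
  while (remaining.length > 0) {
    const ready = remaining.filter((p) => p.deps.every((d) => placed.has(d)));
    if (ready.length === 0) {
      // Nothing can make progress: a cycle or a reference to a missing plan.
      throw new Error('dependency cycle or unknown dependency');
    }
    waves.push(ready.map((p) => p.id));
    for (const p of ready) placed.add(p.id);
    remaining = remaining.filter((p) => !placed.has(p.id));
  }
  return waves;
}
```

Fed the five plans from the diagram, this yields three waves: 01 and 02, then 03 and 04, then 05. Plans inside a wave can run in parallel; waves run in order.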
Each executor gets:
- Fresh 200K context window (or up to 1M for models that support it)
- The specific PLAN.md to execute
- Project context (PROJECT.md, STATE.md)
- Phase context (CONTEXT.md, RESEARCH.md if available)
When the context window is 500K+ tokens (1M-class models like Opus 4.6, Sonnet 4.6), subagent prompts are automatically enriched with additional context that would not fit in standard 200K windows:
- Executor agents receive prior wave SUMMARY.md files and the phase CONTEXT.md/RESEARCH.md, enabling cross-plan awareness within a phase
- Verifier agents receive all PLAN.md, SUMMARY.md, CONTEXT.md files plus REQUIREMENTS.md, enabling history-aware verification
The orchestrator reads context_window from config (gsd-tools.cjs config-get context_window) and conditionally includes richer context when the value is >= 500,000. For standard 200K windows, prompts use truncated versions with cache-friendly ordering to maximize context efficiency.
When multiple executors run within the same wave, two mechanisms prevent conflicts:
- `--no-verify` commits: Parallel agents skip pre-commit hooks (which can cause build lock contention, e.g., cargo lock fights in Rust projects). The orchestrator runs `git hook run pre-commit` once after each wave completes.
- STATE.md file locking: All `writeStateMd()` calls use lockfile-based mutual exclusion (`STATE.md.lock` with `O_EXCL` atomic creation). This prevents the read-modify-write race condition where two agents read STATE.md, modify different fields, and the last writer overwrites the other's changes. Includes stale lock detection (10s timeout) and spin-wait with jitter.
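A lockfile with `O_EXCL` creation, stale detection, and jittered retry can be sketched as below. This is not the actual `writeStateMd()` code: the helper names, retry count, and jitter range are assumptions, and only the 10s stale timeout comes from the text.

```javascript
const fs = require('node:fs');

const STALE_MS = 10_000; // stale-lock timeout from the text

// Try once to take the lock. The 'wx' flag maps to O_CREAT | O_EXCL, so the
// open fails with EEXIST if another process already created the lock file.
function tryAcquire(lockPath, now = Date.now()) {
  try {
    fs.closeSync(fs.openSync(lockPath, 'wx'));
    return true;
  } catch (err) {
    if (err.code !== 'EEXIST') throw err;
    let mtimeMs;
    try {
      mtimeMs = fs.statSync(lockPath).mtimeMs;
    } catch (statErr) {
      if (statErr.code === 'ENOENT') return false; // released meanwhile; retry
      throw statErr;
    }
    // Stale detection: a crashed holder leaves an old lock file behind.
    if (now - mtimeMs > STALE_MS) fs.rmSync(lockPath, { force: true });
    return false; // caller retries after a jittered delay
  }
}

// Hypothetical wrapper: run fn while holding the lock, then release it.
function withLock(lockPath, fn, { retries = 50 } = {}) {
  for (let attempt = 0; attempt < retries; attempt++) {
    if (tryAcquire(lockPath)) {
      try {
        return fn();
      } finally {
        fs.rmSync(lockPath, { force: true });
      }
    }
    // Wait 10-29 ms with jitter so competing writers do not retry in lockstep.
    const waitMs = 10 + Math.floor(Math.random() * 20);
    Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, waitMs);
  }
  throw new Error(`could not acquire ${lockPath}`);
}
```

The read-modify-write of STATE.md would go inside `fn`, so two agents updating different fields serialize instead of overwriting each other.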
User input (idea description)
    │
    ▼
Questions (questioning.md philosophy)
    │
    ▼
4x Project Researchers (parallel)
    ├── Stack → STACK.md
    ├── Features → FEATURES.md
    ├── Architecture → ARCHITECTURE.md
    └── Pitfalls → PITFALLS.md
    │
    ▼
Research Synthesizer → SUMMARY.md
    │
    ▼
Requirements extraction → REQUIREMENTS.md
    │
    ▼
Roadmapper → ROADMAP.md
    │
    ▼
User approval → STATE.md initialized
discuss-phase → CONTEXT.md (user preferences)
    │
    ▼
ui-phase → UI-SPEC.md (design contract, optional)
    │
    ▼
plan-phase
    ├── Research gate (blocks if RESEARCH.md has unresolved open questions)
    ├── Phase Researcher → RESEARCH.md
    │     └── Package Legitimacy Gate: slopcheck on every package; [SLOP] removed,
    │         [SUS]/[ASSUMED] flagged; Audit table written to RESEARCH.md
    ├── Planner (with reachability check) → PLAN.md files
    │     └── checkpoint:human-verify injected before [ASSUMED]/[SUS] installs;
    │         T-{phase}-SC STRIDE row added for install-bearing plans
    ├── Plan Checker → Verify loop (max 3x)
    ├── Requirements coverage gate (REQ-IDs → plans)
    └── Decision coverage gate (CONTEXT.md `<decisions>` → plans, BLOCKING, #2492)
    │
    ▼
state planned-phase → STATE.md (Planned/Ready to execute)
    │
    ▼
execute-phase (context reduction: truncated prompts, cache-friendly ordering)
    ├── Wave analysis (dependency grouping)
    ├── Executor per plan → code + atomic commits
    ├── SUMMARY.md per plan
    ├── Verifier → VERIFICATION.md
    └── Decision coverage gate (CONTEXT.md decisions → shipped artifacts, NON-BLOCKING, #2492)
    │
    ▼
verify-work → UAT.md (user acceptance testing)
    │
    ▼
ui-review → UI-REVIEW.md (visual audit, optional)
Each workflow stage produces artifacts that feed into subsequent stages:
PROJECT.md ─────────────────────────────────► All agents
REQUIREMENTS.md ────────────────────────────► Planner, Verifier, Auditor
ROADMAP.md ─────────────────────────────────► Orchestrators
STATE.md ───────────────────────────────────► All agents (decisions, blockers)
CONTEXT.md (per phase) ─────────────────────► Researcher, Planner, Executor
RESEARCH.md (per phase) ────────────────────► Planner, Plan Checker
PLAN.md (per plan) ─────────────────────────► Executor, Plan Checker
SUMMARY.md (per plan) ──────────────────────► Verifier, State tracking
UI-SPEC.md (per phase) ─────────────────────► Executor, UI Auditor
~/.claude/                          # Claude Code (global install)
├── skills/gsd-ns-*/SKILL.md        # Global skills; nesting runtimes: 6 namespace routers (authoritative roster: docs/INVENTORY.md)
│   └── skills/<name>/SKILL.md      # concrete skills nested under each router
│       (flat runtimes: skills/gsd-*/SKILL.md, all ~67 skills at top level)
├── commands/gsd/*.md               # Local Claude installs use slash commands instead of global skills
├── gsd-core/
│   ├── bin/gsd-tools.cjs           # CLI utility
│   ├── bin/lib/*.cjs               # Domain modules (authoritative roster: docs/INVENTORY.md)
│   ├── workflows/*.md              # Workflow definitions (authoritative roster: docs/INVENTORY.md)
│   ├── references/*.md             # Shared reference docs (authoritative roster: docs/INVENTORY.md)
│   └── templates/                  # Planning artifact templates
├── agents/*.md                     # Agent definitions (authoritative roster: docs/INVENTORY.md)
├── hooks/*.js                      # Node.js hooks (statusline, guards, monitors, update check)
├── hooks/*.sh                      # Shell hooks (session state, commit validation, phase boundary)
├── settings.json                   # Hook registrations
└── VERSION                         # Installed version number
Equivalent paths for other runtimes:
- OpenCode: `~/.config/opencode/` global or `./.opencode/` local
- Kilo: `~/.config/kilo/` global or `./.kilo/` local
- Gemini CLI: `~/.gemini/` global or `./.gemini/` local
- Kimi CLI: first-existing generic global root (`~/.config/agents/` recommended, then `~/.agents/` if its `skills/` directory already exists); local install is deferred and guarded
- Codex: `~/.codex/` global or `./.codex/` local
- Copilot: `~/.copilot/` global or `./.github/` local
- Antigravity: auto-detected global root (`~/.gemini/antigravity/`, `~/.gemini/antigravity-ide/`, or `~/.gemini/antigravity-cli/`) or `./.agent/` local
- Cursor: `~/.cursor/` global or `./.cursor/` local
- Windsurf/Devin Desktop: `~/.codeium/windsurf/` global or `./.devin/` local (canonical, #1085); `./.windsurf/` local is still recognized as legacy
- Augment Code: `~/.augment/` global or `./.augment/` local
- Trae: `~/.trae/` global or `./.trae/` local
- Qwen Code: `~/.qwen/` global or `./.qwen/` local
- Hermes Agent: `~/.hermes/` global or `./.hermes/` local
- CodeBuddy: `~/.codebuddy/` global or `./.codebuddy/` local
- Cline: `~/.cline/` global or project-root `.clinerules` local
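The first-existing rule for the Kimi CLI global root can be sketched as a small resolver. This is a minimal sketch under stated assumptions, not the installer's code: `resolveKimiGlobalRoot` and the injected `dirExists` predicate are hypothetical names, and the tie-break (the recommended root wins unless only `~/.agents/skills` exists) follows the rule as described in this document.

```javascript
// Hypothetical sketch of the "first-existing generic root" rule for Kimi CLI.
// `dirExists` is injected so the rule can be exercised without touching disk.
function resolveKimiGlobalRoot(home, dirExists) {
  const recommended = `${home}/.config/agents`;
  const fallback = `${home}/.agents`;
  // Use ~/.agents only when its skills/ directory already exists and the
  // recommended root does not have one yet.
  if (dirExists(`${fallback}/skills`) && !dirExists(`${recommended}/skills`)) {
    return fallback;
  }
  return recommended;
}

const onDisk = new Set(['/home/dev/.agents/skills']);
console.log(resolveKimiGlobalRoot('/home/dev', (p) => onDisk.has(p)));
// prints "/home/dev/.agents"
```

Injecting the existence check keeps the decision pure, so it can be tested without a real home directory.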
.planning/
├── PROJECT.md                  # Project vision, constraints, decisions, evolution rules
├── REQUIREMENTS.md             # Scoped requirements (v1/v2/out-of-scope)
├── ROADMAP.md                  # Phase breakdown with status tracking
├── STATE.md                    # Living memory: position, decisions, blockers, metrics
├── config.json                 # Workflow configuration
├── MILESTONES.md               # Completed milestone archive
├── research/                   # Domain research from /gsd-new-project
│   ├── SUMMARY.md
│   ├── STACK.md
│   ├── FEATURES.md
│   ├── ARCHITECTURE.md
│   └── PITFALLS.md
├── codebase/                   # Brownfield mapping (from /gsd-map-codebase)
│   ├── STACK.md                # YAML frontmatter carries `last_mapped_commit`
│   ├── ARCHITECTURE.md         #   for the post-execute drift gate (#2003)
│   ├── CONVENTIONS.md
│   ├── CONCERNS.md
│   ├── STRUCTURE.md
│   ├── TESTING.md
│   └── INTEGRATIONS.md
├── phases/
│   └── XX-phase-name/
│       ├── XX-CONTEXT.md       # User preferences (from discuss-phase)
│       ├── XX-RESEARCH.md      # Ecosystem research (from plan-phase)
│       ├── XX-YY-PLAN.md       # Execution plans
│       ├── XX-YY-SUMMARY.md    # Execution outcomes
│       ├── XX-VERIFICATION.md  # Post-execution verification
│       ├── XX-VALIDATION.md    # Nyquist test coverage mapping
│       ├── XX-UI-SPEC.md       # UI design contract (from ui-phase)
│       ├── XX-UI-REVIEW.md     # Visual audit scores (from ui-review)
│       └── XX-UAT.md           # User acceptance test results
├── quick/                      # Quick task tracking
│   └── YYMMDD-xxx-slug/
│       ├── PLAN.md
│       └── SUMMARY.md
├── todos/
│   ├── pending/                # Captured ideas
│   └── done/                   # Completed todos
├── threads/                    # Persistent context threads (from /gsd-thread)
├── seeds/                      # Forward-looking ideas (from /gsd-capture --seed)
├── debug/                      # Active debug sessions
│   ├── *.md                    # Active sessions
│   ├── resolved/               # Archived sessions
│   └── knowledge-base.md       # Persistent debug learnings
├── ui-reviews/                 # Screenshots from /gsd-ui-review (gitignored)
└── continue-here.md            # Context handoff (from pause-work)
After the last wave of /gsd-execute-phase commits, the workflow runs a
non-blocking codebase_drift_gate step (between schema_drift_gate and
verify_phase_goal). It compares the diff last_mapped_commit..HEAD
against .planning/codebase/STRUCTURE.md and counts four kinds of
structural elements:
- New directories outside mapped paths
- New barrel exports at `(packages|apps)/<name>/src/index.*`
- New migration files
- New route modules under `routes/` or `api/`
If the count meets workflow.drift_threshold (default 3), the gate either
warns (default) with the suggested /gsd-map-codebase --paths … command,
or auto-remaps (workflow.drift_action = auto-remap) by spawning
gsd-codebase-mapper scoped to the affected paths. Any error in detection
or remap is logged and the phase continues; drift detection cannot fail
verification.
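The counting and threshold logic described above can be sketched roughly as follows. This is an illustrative sketch, not the workflow's actual implementation: `countDrift` and `driftDecision` are hypothetical names, the `migrations/` pattern is an assumed way to recognize migration files, `mappedDirs` stands in for paths parsed from STRUCTURE.md, the default action is assumed to be named `warn`, and "meets" is read as "greater than or equal to".

```javascript
// Hypothetical sketch: count structural elements among newly added files.
function countDrift(addedPaths, mappedDirs) {
  const elements = new Set();
  for (const file of addedPaths) {
    const dir = file.includes('/') ? file.slice(0, file.lastIndexOf('/')) : '.';
    if (/^(packages|apps)\/[^/]+\/src\/index\.[^/]+$/.test(file)) {
      elements.add(`barrel:${file}`);
    } else if (/(^|\/)migrations\//.test(file)) { // assumed migration layout
      elements.add(`migration:${file}`);
    } else if (/(^|\/)(routes|api)\//.test(file)) {
      elements.add(`route:${file}`);
    } else if (!mappedDirs.some((m) => dir === m || dir.startsWith(`${m}/`))) {
      elements.add(`dir:${dir}`); // new directory outside mapped paths
    }
  }
  return elements.size;
}

// Hypothetical sketch: turn a count into the gate's action.
function driftDecision(count, workflow = {}) {
  const threshold = workflow.drift_threshold ?? 3;
  const action = workflow.drift_action ?? 'warn';
  if (count < threshold) return 'none';
  return action === 'auto-remap' ? 'auto-remap' : 'warn';
}

const added = [
  'packages/ui/src/index.ts',
  'db/migrations/001_init.sql',
  'src/routes/users.ts',
  'tools/new/thing.js',
  'tools/new/other.js',
];
const count = countDrift(added, ['src', 'packages', 'db']);
console.log(count, driftDecision(count)); // prints: 4 warn
```

Because the gate is non-blocking, a caller would wrap both functions in a try/catch, log any error, and continue the phase.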
last_mapped_commit lives in YAML frontmatter at the top of each
.planning/codebase/*.md file; bin/lib/drift.cjs provides
readMappedCommit and writeMappedCommit round-trip helpers.
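A round trip of `last_mapped_commit` through YAML frontmatter might look like the sketch below. The function names and string-in/string-out signatures are assumptions chosen for illustration; the real helpers in bin/lib/drift.cjs may take file paths and handle more YAML cases (quoted values, empty frontmatter) than this line-based version does.

```javascript
// Hypothetical sketch: line-based frontmatter handling, not a YAML parser.
const FRONTMATTER = /^---\n([\s\S]*?)\n---\n?/;
const KEY = 'last_mapped_commit:';

function readMappedCommitFromText(text) {
  const match = FRONTMATTER.exec(text);
  if (!match) return null;
  const line = match[1].split('\n').find((l) => l.startsWith(KEY));
  return line ? line.slice(KEY.length).trim() : null;
}

function writeMappedCommitToText(text, sha) {
  const entry = `${KEY} ${sha}`;
  const match = FRONTMATTER.exec(text);
  // No frontmatter yet: prepend a new block.
  if (!match) return `---\n${entry}\n---\n${text}`;
  // Otherwise replace any existing entry and keep the other keys.
  const lines = match[1].split('\n').filter((l) => !l.startsWith(KEY));
  lines.push(entry);
  return `---\n${lines.join('\n')}\n---\n${text.slice(match[0].length)}`;
}

const updated = writeMappedCommitToText('---\ntitle: Stack\n---\n# Stack\n', 'abc123');
console.log(readMappedCommitFromText(updated)); // prints "abc123"
```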
The installer (bin/install.js, ~10,700 lines) handles:
- Runtime detection: Interactive prompt or CLI flags (`--claude`, `--opencode`, `--gemini`, `--kimi`, `--kilo`, `--codex`, `--copilot`, `--antigravity`, `--cursor`, `--windsurf`, `--augment`, `--trae`, `--qwen`, `--hermes`, `--codebuddy`, `--cline`, `--all`)
- Location selection: Global (`--global`) or local (`--local`)
- File deployment: Copies commands, skills, workflows, references, templates, agents, and hooks
- Runtime adaptation: Transforms file content per runtime:
  - Claude Code: Uses as-is
  - OpenCode: Converts commands/agents to OpenCode-compatible flat command + subagent format
  - Kilo: Reuses the OpenCode conversion pipeline with Kilo config paths
  - Codex: Generates TOML config + skills from commands
  - Kimi CLI: Generates Agent Skills under `skills/gsd-*/SKILL.md`, custom agent YAML/prompt files, and explicit `kimi_cli.tools.*` module paths
  - Copilot: Maps tool names (Read→read, Bash→execute, etc.)
  - Gemini: Adjusts hook event names (`AfterTool` instead of `PostToolUse`)
  - Antigravity: Skills-first with Google model equivalents
  - Cursor: Skills-first with Cursor rule references
  - Windsurf: Skills-first with Windsurf rule references
  - Trae: Skills-first install to `~/.trae` / `./.trae` with no `settings.json` or hook integration
  - Qwen Code: Skills-first with Qwen-branded path and prompt rewrites
  - Hermes Agent: Category-based skills under `skills/gsd/`
  - CodeBuddy: Skills-first with CodeBuddy path and prompt rewrites
  - Cline: Writes `.clinerules` for rule-based integration
  - Augment Code: Skills-first with full skill conversion and config management
- Path normalization: Replaces `~/.claude/` paths with runtime-specific paths
- Settings integration: Registers hooks in the runtime's `settings.json`
- Patch backup: Since v1.17, backs up locally modified files to `gsd-local-patches/` for `/gsd-update --reapply`
- Manifest tracking: Writes `gsd-file-manifest.json` for clean uninstall
- Uninstall mode: `--uninstall` removes all GSD files, hooks, and settings
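As an illustration of the path normalization step, the sketch below rewrites `~/.claude/` references for a target runtime. It is a simplified assumption of how such a transform could work: `normalizePaths` and `GLOBAL_ROOTS` are hypothetical names, only a few runtimes are listed, and the real installer (bin/install.js) may also handle local installs and other path spellings.

```javascript
// Hypothetical sketch: global roots taken from the runtime list in this document.
const GLOBAL_ROOTS = {
  opencode: '~/.config/opencode/',
  gemini: '~/.gemini/',
  codex: '~/.codex/',
};

function normalizePaths(content, runtime) {
  const root = GLOBAL_ROOTS[runtime];
  // Claude Code (or an unknown runtime) keeps the content unchanged.
  if (!root) return content;
  return content.replaceAll('~/.claude/', root);
}

console.log(normalizePaths('node ~/.claude/gsd-core/bin/gsd-tools.cjs', 'codex'));
// prints "node ~/.codex/gsd-core/bin/gsd-tools.cjs"
```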
Install-time file moves, stale-artifact cleanup, config rewrites, and user-data preservation are governed by the Installer Migration Module. See Installer Migrations and ADR 0008. The migration module also owns the gated first-time baseline scan for legacy installs, classifying known runtime install surfaces before later migrations remove or rewrite anything.
The plan drift guard (plan_review.source_grounding) β which verifies symbol references in generated plans against live source before execution β is specified in ADR 22.
- Windows: `windowsHide` on child processes, EPERM/EACCES protection on protected directories, path separator normalization
- WSL: Detects Windows Node.js running on WSL and warns about path mismatches
- Docker/CI: Supports `CLAUDE_CONFIG_DIR` env var for custom config directory locations
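A config directory lookup that honors `CLAUDE_CONFIG_DIR` and normalizes separators could be sketched like this. The function name `resolveConfigDir`, the fallback to `<home>/.claude`, and the choice to normalize to forward slashes are assumptions for illustration; the installer's actual precedence rules are not shown in this document.

```javascript
// Hypothetical sketch: env override first, then the default Claude Code root.
function resolveConfigDir(env, homeDir) {
  const override = (env.CLAUDE_CONFIG_DIR || '').trim();
  const dir = override !== '' ? override : `${homeDir}/.claude`;
  // Normalize Windows separators so later comparisons use a single form.
  return dir.replace(/\\/g, '/');
}

console.log(resolveConfigDir({ CLAUDE_CONFIG_DIR: 'C:\\ci\\claude' }, 'C:\\Users\\dev'));
// prints "C:/ci/claude"
```

In a Docker or CI job the override would typically be passed in from `process.env`.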
Runtime Engine (Claude Code / Gemini CLI)
│
├── statusLine event ──► gsd-statusline.js
│     Reads: stdin (session JSON)
│     Writes: stdout (formatted status), /tmp/claude-ctx-{session}.json (bridge)
│
├── PostToolUse/AfterTool event ──► gsd-context-monitor.js
│     Reads: stdin (tool event JSON), /tmp/claude-ctx-{session}.json (bridge)
│     Writes: stdout (hookSpecificOutput with additionalContext warning)
│
└── SessionStart event
      ├──► gsd-ensure-canonical-path.js (runs first)
      │     Reads: ${CLAUDE_PLUGIN_ROOT}/gsd-core/ (plugin installs only)
      │     Writes: ~/.claude/gsd-core/{bin,contexts,references,templates,workflows} symlinks
      │     (no-op in classic installs; preserves user files; self-heals)
      └──► gsd-check-update.js
            Reads: VERSION file
            Writes: ~/.claude/cache/gsd-update-check.json (spawns background process)
| Remaining Context | Level | Agent Behavior |
|---|---|---|
| > 35% | Normal | No warning injected |
| ≤ 35% | WARNING | "Avoid starting new complex work" |
| ≤ 25% | CRITICAL | "Context nearly exhausted, inform user" |
Debounce: 5 tool uses between repeated warnings. Severity escalation (WARNING→CRITICAL) bypasses debounce.
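The thresholds and debounce rule can be modeled as a small state machine. This is a sketch under assumptions, not the code in gsd-context-monitor.js: the names `levelFor` and `nextWarning`, the shape of the state object, and the exact way tool uses are counted between warnings are all hypothetical.

```javascript
const RANK = { NORMAL: 0, WARNING: 1, CRITICAL: 2 };
const DEBOUNCE_TOOL_USES = 5;

// Map remaining context (percent) to a level, per the table above.
function levelFor(remainingPct) {
  if (remainingPct <= 25) return 'CRITICAL';
  if (remainingPct <= 35) return 'WARNING';
  return 'NORMAL';
}

// Hypothetical state: { lastLevel, sinceLast } carried between tool uses.
function nextWarning(state, remainingPct) {
  const level = levelFor(remainingPct);
  if (level === 'NORMAL') {
    return { warn: false, level, state: { lastLevel: level, sinceLast: 0 } };
  }
  const escalated = RANK[level] > RANK[state.lastLevel];
  const sinceLast = state.sinceLast + 1;
  // Escalation bypasses the debounce; otherwise wait for 5 tool uses.
  if (escalated || sinceLast >= DEBOUNCE_TOOL_USES) {
    return { warn: true, level, state: { lastLevel: level, sinceLast: 0 } };
  }
  return { warn: false, level, state: { lastLevel: state.lastLevel, sinceLast } };
}

let step = nextWarning({ lastLevel: 'NORMAL', sinceLast: 0 }, 34);
console.log(step.warn, step.level); // prints: true WARNING
step = nextWarning(step.state, 33);
console.log(step.warn, step.level); // prints: false WARNING
step = nextWarning(step.state, 24);
console.log(step.warn, step.level); // prints: true CRITICAL
```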
- All hooks wrap their logic in try/catch and exit silently on error
- stdin timeout guard (3s) prevents hanging on pipe issues
- Stale metrics (>60s old) are ignored
- Missing bridge files are handled gracefully (subagents, fresh sessions)
- Context monitor is advisory: it never issues imperative commands that override user preferences
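Two of these properties, ignoring stale metrics and failing silently on missing or corrupt bridge data, can be illustrated with a short reader. The function name and the `timestamp` field (epoch milliseconds) are assumptions made for this sketch; the actual bridge file schema is not documented here.

```javascript
const STALE_AFTER_MS = 60_000; // metrics older than 60s are ignored

// Hypothetical reader for bridge file contents passed in as a string.
function readBridgeMetrics(rawJson, nowMs) {
  try {
    const metrics = JSON.parse(rawJson);
    if (typeof metrics.timestamp !== 'number') return null;
    if (nowMs - metrics.timestamp > STALE_AFTER_MS) return null;
    return metrics;
  } catch {
    // Missing or corrupt bridge data: stay silent and report nothing.
    return null;
  }
}

console.log(readBridgeMetrics('{"timestamp": 1000, "remaining": 30}', 2000));
// prints { timestamp: 1000, remaining: 30 }
console.log(readBridgeMetrics(undefined, 2000)); // prints null
```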
The researcher → planner → executor pipeline includes a supply-chain gate against slopsquatting (AI-hallucinated package names pre-registered with malicious post-install scripts).
Threat model: GSD automates the full path from "researcher names a package" to "executor runs npm install". A hallucinated name that passes npm view (proving only registration, not legitimacy) would previously flow through undetected. ~20% of AI-generated package references are hallucinated; ~43% of those names recur consistently across prompts, making pre-registration economically viable for attackers.
Gate layers:
| Layer | Component | Action |
|---|---|---|
| Research | `gsd-phase-researcher` | Runs `slopcheck install <pkgs> --json`; writes `## Package Legitimacy Audit` table to RESEARCH.md; strips `[SLOP]` packages before RESEARCH.md is written |
| Planning | `gsd-planner` | Reads Audit table; inserts `checkpoint:human-verify` before any `[ASSUMED]` or `[SUS]` install task; adds `T-{phase}-SC` STRIDE supply-chain row to `<threat_model>` |
| Execution | `gsd-executor` | RULE 3 excludes package installation from auto-fix scope; failed installs surface as checkpoints, never silent substitutions |
Claim provenance integration: Package names discovered via WebSearch are tagged [ASSUMED] (not [VERIFIED]) regardless of the npm view result. This extends the existing [ASSUMED] / [VERIFIED] / [CITED] provenance system by enforcing the provenance tag as a hard gate at the install boundary: [ASSUMED] always generates a checkpoint:human-verify in PLAN.md.
Ecosystem coverage: The researcher uses registry-specific verification commands, namely npm view (Node), pip index versions (Python), and cargo search (Rust), rather than a single generic check. This catches cross-ecosystem hallucination (~9% rate documented in 2025 USENIX research).
Graceful degradation: If slopcheck is unavailable, every recommended package is tagged [ASSUMED] and gated with a checkpoint. Research and planning proceed; the system never hard-fails on a missing tool dependency.
External dependency: slopcheck (MIT, pip-installable). If abandoned, the [ASSUMED]-gate fallback maintains human-checkpoint coverage.
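The research and planning layers of the gate can be approximated with two pure functions. This is a sketch with hypothetical names and data shapes: `researchAudit`, `planInstallSteps`, and the name-to-tag map are assumptions, and the map does not reflect the real `slopcheck --json` output schema. Package names are placeholders.

```javascript
const NEEDS_HUMAN = new Set(['ASSUMED', 'SUS']);

// Research layer: tag each package and strip [SLOP] entries.
// `tagsByName` is null when slopcheck is unavailable (graceful degradation).
function researchAudit(packages, tagsByName) {
  return packages
    .map((name) => ({ name, tag: tagsByName ? (tagsByName[name] ?? 'ASSUMED') : 'ASSUMED' }))
    .filter((row) => row.tag !== 'SLOP');
}

// Planning layer: put a human checkpoint before risky install tasks.
function planInstallSteps(auditRows) {
  const steps = [];
  for (const row of auditRows) {
    if (NEEDS_HUMAN.has(row.tag)) {
      steps.push({ type: 'checkpoint:human-verify', package: row.name });
    }
    steps.push({ type: 'install', package: row.name });
  }
  return steps;
}

const audit = researchAudit(['pkg-a', 'pkg-b', 'pkg-c'], {
  'pkg-a': 'VERIFIED',
  'pkg-b': 'SLOP',
  'pkg-c': 'SUS',
});
console.log(planInstallSteps(audit).map((s) => `${s.type} ${s.package}`));
// prints [ 'install pkg-a', 'checkpoint:human-verify pkg-c', 'install pkg-c' ]
```

Passing `null` for the tag map models the degraded mode: every package becomes [ASSUMED] and is gated by a checkpoint.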
For a conceptual overview of how the hook and guard layers fit into the broader security approach, see Security model.
Prompt Guard (gsd-prompt-guard.js):
- Triggers on Write/Edit to `.planning/` files
- Scans content for prompt injection patterns (role override, instruction bypass, system tag injection)
- Advisory-only: logs detection, does not block
- Patterns are inlined (subset of `security.cjs`) for hook independence
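An advisory scan of this kind might be structured as below. The three regular expressions are invented examples of the pattern categories named above, not the patterns shipped in gsd-prompt-guard.js or security.cjs, and `scanForInjection` is a hypothetical name.

```javascript
// Hypothetical patterns, one per category; real patterns will differ.
const INJECTION_PATTERNS = [
  { name: 'role-override', regex: /you are now (a|an|the)\b/i },
  { name: 'instruction-bypass', regex: /ignore (all )?(previous|prior) instructions/i },
  { name: 'system-tag', regex: /<\/?system>/i },
];

// Returns the names of matched patterns; never throws and never blocks.
function scanForInjection(filePath, content) {
  if (!filePath.includes('.planning/')) return [];
  return INJECTION_PATTERNS.filter((p) => p.regex.test(content)).map((p) => p.name);
}

const findings = scanForInjection(
  '.planning/STATE.md',
  'Please ignore previous instructions and <system>do X</system>'
);
console.log(findings); // prints [ 'instruction-bypass', 'system-tag' ]
```

The caller would log the findings and let the write proceed, which matches the advisory-only behavior.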
Workflow Guard (gsd-workflow-guard.js):
- Triggers on Write/Edit to non-`.planning/` files
- Detects edits outside GSD workflow context (no active `/gsd-` command or Task subagent)
- Advises using `/gsd-quick` or `/gsd-fast` for state-tracked changes
- Opt-in via `hooks.workflow_guard: true` (default: false)
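The guard's decision can be summarized as a chain of early returns. In this sketch the event fields (`tool`, `filePath`, `activeGsdCommand`, `inTaskSubagent`) and the function name are assumptions made for illustration; only the `hooks.workflow_guard` setting and its default come from the description above.

```javascript
// Hypothetical sketch: returns an advisory string, or null when silent.
function workflowGuardAdvice(config, event) {
  if (!config.hooks?.workflow_guard) return null; // opt-in, default false
  if (!['Write', 'Edit'].includes(event.tool)) return null;
  if (event.filePath.includes('.planning/')) return null;
  if (event.activeGsdCommand || event.inTaskSubagent) return null;
  return 'Consider /gsd-quick or /gsd-fast so this change is state-tracked.';
}

const edit = { tool: 'Edit', filePath: 'src/app.js', activeGsdCommand: false, inTaskSubagent: false };
console.log(workflowGuardAdvice({}, edit)); // prints null
console.log(workflowGuardAdvice({ hooks: { workflow_guard: true } }, edit));
// prints "Consider /gsd-quick or /gsd-fast so this change is state-tracked."
```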
GSD supports multiple AI coding runtimes through a unified command/workflow architecture:
This matrix describes the runtime surfaces the installer materializes today. The migration-specific ownership and source snapshots live in Installer Migrations.
| Runtime | Global root | Local root | Invocation surface | Agent surface | Config and hooks |
|---|---|---|---|---|---|
| Claude Code | `~/.claude` | `./.claude` | Global `skills/gsd-ns-*/SKILL.md` (6 routers) + `skills/gsd-ns-*/skills/<name>/SKILL.md` (nested concretes); local `commands/gsd/*.md` | `agents/gsd-*.md` | `settings.json` hook and `statusLine` entries |
| OpenCode | `~/.config/opencode` | `./.opencode` | `command/gsd-*.md` | `agents/gsd-*.md` | `opencode.json` or `opencode.jsonc`; no GSD hooks |
| Kilo | `~/.config/kilo` | `./.kilo` | `command/gsd-*.md` | `agents/gsd-*.md` | `kilo.json` or `kilo.jsonc`; no GSD hooks |
| Gemini CLI | `~/.gemini` | `./.gemini` | `commands/gsd/*.toml` | `agents/gsd-*.md` | `settings.json` feature flag, hooks, and statusline |
| Kimi CLI | First-existing generic root: `~/.config/agents` recommended, then `~/.agents` when `~/.agents/skills` exists and `~/.config/agents/skills` does not | Deferred and guarded | `skills/gsd-*/SKILL.md` (flat) invoked as `/skill:gsd-*` | `agents/gsd.yaml`, `agents/gsd.md`, and `agents/subagents/gsd-*` YAML/prompt pairs | Explicit `kimi --agent-file <configRoot>/agents/gsd.yaml`; no GSD hooks or statusline |
| Codex | `~/.codex` | `./.codex` | `skills/gsd-*/SKILL.md` (flat) | `agents/` source markdown plus per-agent TOML | `config.toml` `[agents.gsd-*]`, `[features].hooks` (canonical; legacy alias `codex_hooks` is recognized and migrated forward on reinstall, #3566), and hook tables |
| GitHub Copilot | `~/.copilot` | `./.github` | `skills/gsd-*/SKILL.md` (flat), `copilot-instructions.md`, and `AGENTS.md` (repo root, local) | `.agent.md` files | Self-contained `sessionStart` hook (`hooks/gsd-session.json`, inline command type); no statusline |
| Antigravity | auto-detected: `~/.gemini/antigravity`, `~/.gemini/antigravity-ide`, or `~/.gemini/antigravity-cli` | `./.agent` | `skills/gsd-ns-*/SKILL.md` (6 routers) + `skills/gsd-ns-*/skills/<name>/SKILL.md` (nested concretes) | `agents/gsd-*.md` | Gemini-style `settings.json` hook entries when installed by GSD |
| Cursor | `~/.cursor` | `./.cursor` | `skills/gsd-*/SKILL.md` (flat) | `agents/gsd-*.md` | Rule references under `rules/`; `hooks.json` with `sessionStart` context injection and `postToolUse` STATE.md monitor (#777) |
| Windsurf | `~/.codeium/windsurf` | `./.devin` (canonical, #1085); `./.windsurf` legacy recognized | `skills/gsd-*/SKILL.md` (flat) | `agents/gsd-*.md` | Rule references under `rules/`; no GSD hooks |
| Augment Code | `~/.augment` | `./.augment` | `skills/gsd-ns-*/SKILL.md` (6 routers) + `skills/gsd-ns-*/skills/<name>/SKILL.md` (nested concretes) | `agents/gsd-*.md` | No GSD hooks or statusline |
| Trae | `~/.trae` | `./.trae` | `skills/gsd-ns-*/SKILL.md` (6 routers) + `skills/gsd-ns-*/skills/<name>/SKILL.md` (nested concretes) | `agents/gsd-*.md` | Rule references under `rules/`; no GSD hooks |
| Qwen Code | `~/.qwen` | `./.qwen` | `skills/gsd-ns-*/SKILL.md` (6 routers) + `skills/gsd-ns-*/skills/<name>/SKILL.md` (nested concretes) | `agents/gsd-*.md` | Common GSD settings and hook entries where supported |
| Hermes Agent | `~/.hermes` | `./.hermes` | `skills/gsd/ns-*/SKILL.md` (6 routers, `prefix=''`) + `skills/gsd/ns-*/skills/<name>/SKILL.md` (nested concretes) | `agents/gsd-*.md` | Common GSD settings and hook entries where supported |
| CodeBuddy | `~/.codebuddy` | `./.codebuddy` | `skills/gsd-*/SKILL.md` (flat, `user-invocable: false`) | `agents/gsd-*.md` | `/gsd-*` slash commands under `commands/`; common GSD settings and hook entries where supported |
| Cline | `~/.cline` | project root | `skills/gsd-ns-*/SKILL.md` (6 routers) + `skills/gsd-ns-*/skills/<name>/SKILL.md` (nested concretes) + `.clinerules` | Rules only | No GSD hooks or statusline |
Runtime install expectations are checked against primary documentation where available. The current source snapshot is 2026-05-11, with Kimi CLI rechecked on 2026-06-07:
- Claude Code: Anthropic slash commands, settings, hooks, and subagents docs.
- OpenCode and Kilo: OpenCode config docs and Kilo custom subagent docs.
- Gemini CLI and Qwen Code: command/config docs; Qwen command docs were last updated 2026-05-06.
- Kimi CLI: Agent Skills docs for user-level brand roots and first-existing generic roots (`~/.config/agents/skills/` recommended, then `~/.agents/skills/`), plus Agents docs for YAML files, `system_prompt_path`, `kimi_cli.tools.*` module paths, and explicit `kimi --agent-file` launch.
- Codex: OpenAI Codex docs and `config-schema.json`; the installer also carries Codex 0.124.0 compatibility for agent table shape.
- Copilot, Cursor, Cline, Augment, Hermes, and CodeBuddy: vendor docs for custom instructions, rules, skills, or config.
- Antigravity, Windsurf, and Trae: source-limited rows. The installer documents current compatibility shims, and migrations must refresh those sources before rewriting their config.
- Tool name mapping: Each runtime has its own tool names (e.g., Claude's `Bash` → Copilot's `execute`)
- Hook event names: Claude uses `PostToolUse`, Gemini uses `AfterTool`
- Agent frontmatter: Each runtime has its own agent definition format
- Path conventions: Each runtime stores config in different directories
- Model references: The `inherit` profile lets GSD defer to the runtime's model selection
The installer handles all translation at install time. Workflows and agents are written in Claude Code's native format and transformed during deployment.
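Install-time translation of tool and hook event names amounts to table lookups with a pass-through default. The sketch below uses only the mappings named in this document (Read→read and Bash→execute for Copilot, PostToolUse→AfterTool for Gemini); the table layout, the function names, and the pass-through behavior for unmapped names are assumptions.

```javascript
// Hypothetical lookup tables keyed by runtime; entries come from this document.
const TOOL_NAMES = { copilot: { Read: 'read', Bash: 'execute' } };
const HOOK_EVENTS = { gemini: { PostToolUse: 'AfterTool' } };

// Unmapped names pass through unchanged (an assumption of this sketch).
function translate(table, runtime, name) {
  return table[runtime]?.[name] ?? name;
}

function translateTools(tools, runtime) {
  return tools.map((tool) => translate(TOOL_NAMES, runtime, tool));
}

console.log(translateTools(['Read', 'Bash', 'Grep'], 'copilot'));
// prints [ 'read', 'execute', 'Grep' ]
console.log(translate(HOOK_EVENTS, 'gemini', 'PostToolUse')); // prints "AfterTool"
console.log(translate(HOOK_EVENTS, 'claude', 'PostToolUse')); // prints "PostToolUse"
```

Keeping the source in Claude Code's native names means only the target tables change when a new runtime is added.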