A Claude Code plugin marketplace to help you start applying effective agentic coding patterns in your projects.
Agentic coding with Claude Code might seem "trivial" at first: you write a prompt and it all seems to "just work". In reality it is not that simple... let me try to explain why.
The first problem is prompting. Writing good prompts is hard, but it's the first step to getting good results, especially at scale or in non-greenfield projects. What you pass to the LLM matters a lot because it steers the whole conversation: it has a butterfly effect.
The set of plugins offered here aims to reduce the effect of a bad prompt by providing ways to iterate, break work down, and limit how errors compound, all by splitting the development process into smaller steps.
The second problem is context. Leaving CC running for long periods of time leads to high context usage, unwanted compactions, and overall needle-in-a-haystack situations. Breaking development down into smaller steps, as described above, helps a lot to keep that under control.
Also, as a general rule, empirical research has shown that context usage above 40% tends to yield bad results by default (see DEX videos below for more details).
Here's an article that I wrote regarding context engineering.
Finally, one of the keys of the agentic coding patterns proposed in these plugins is to forcefully insert a human in the loop at key points of the development process.
This is important, as some of the patterns described here go "against" one-shotting your solutions and instead take an iterative "gradient descent" approach to reach the desired solution.
From inside Claude Code, run:

    /plugin marketplace add desplega-ai/ai-toolbox

or from the terminal:

    claude plugin marketplace add desplega-ai/ai-toolbox

Then install the plugin inside it with:

    /plugin install desplega@desplega-ai-toolbox

By default, `marketplace add` tracks `main` and auto-updates. To pin to a specific release, append `@<tag>`:

    /plugin marketplace add desplega-ai/ai-toolbox@cc-desplega-1.13.0

Tag scheme: `cc-desplega-<semver>`. Current pinnable tags:

| Tag | Notes |
|---|---|
| cc-desplega-2.0.0 | Current latest. Shape D planning, three Success Criteria buckets, v-skills (DAG plans), step-running. |
| cc-desplega-1.13.0 | Last 1.x release. Pin here if you want pre-v2 behavior. |
v2.0 breaking changes (released; pin to `cc-desplega-1.13.0` if your workflow depends on v1 contracts):
- `planning` skill restructured around "Setup + 10 Rules" (Shape D); old "Process Steps" headings are gone.
- Plan templates use three Success Criteria buckets: Automated Verification + Automated QA + Manual Verification (was two).
- `### QA Spec (optional):` block now links to an external `desplega:qa` doc; inline test scenarios moved out.
- New `desplega:ask-user` skill consolidates `AskUserQuestion` conventions; other skills point at it instead of duplicating boilerplate.
- File-review is always-on by default (was a per-run preference question).
- New skills: `v-planning` / `v-implementing` (DAG plans for parallel execution) + `step-running` (atomic step sub-agent).
- Consumer-skill API changes: `phase-running` reports three buckets; `implementing` handles `QA Doc: <path>` by invoking `desplega:qa`.
Inside you will find:
- commands - Entrypoint commands, the important part
- agents - Sub-agents to be used by the commands
- skills
- hooks - Lifecycle hooks (context-window pressure warnings, `thoughts/` validation, plan-checkbox tracking)
| Command | Description |
|---|---|
| research | Document codebase as-is with thoughts directory |
| create-plan | Create detailed implementation plans through research and iteration |
| create-tdd-plan | Create TDD implementation plans with Red-Green-Commit cycles |
| implement-plan | Execute approved plans phase by phase |
| brainstorm | Interactive Socratic Q&A exploration of ideas |
| question | One-shot question answering using the research process |
| review | Structured critique of research, plan, and brainstorm documents |
| verify-plan | Post-implementation plan verification and audit |
| qa | Functional validation with test evidence and QA reports |
| run-phase | Execute a single plan phase as a background sub-agent |
| commit | Create git commits for session changes |
| continue-handoff | Continue work from a saved handoff file |
| learning | Capture, search, and promote institutional learnings across projects |
| bu-auto-instrument | Auto-instrument Business-Use SDK tracking |
| script-builder | Generate durable validation scripts from testing intent |
| Skill | Description |
|---|---|
| researching | Comprehensive codebase research with parallel sub-agents |
| planning | Interactive plan creation with research and iteration |
| tdd-planning | TDD-focused planning with Red-Green-Commit cycles |
| implementing | Phase-by-phase plan execution with verification |
| brainstorming | Socratic Q&A exploration producing pre-PRD documents |
| questioning | One-shot Q&A using the research process, no document generated |
| reviewing | Structured critique with severity categorization |
| verifying | Post-implementation audit against plan |
| qa | Functional validation capturing evidence into thoughts/*/qa/ |
| phase-running | Atomic phase execution as background sub-agent |
| learning | Compounding knowledge via tiered backends (local/qmd/agent-fs) |
| script-builder | Generate TS/Python/Bash validation scripts with PASS/FAIL + /tmp log convention |
The plugin also registers a few lifecycle hooks (see `hooks/` and the `hooks` block in `.claude-plugin/plugin.json`):
- Context-window pressure warnings (`context_warn.py` on `UserPromptSubmit`, `stop_confirm.py` on `Stop`): the enforcement arm of "Problem 2" above. As the session fills up they nudge you to offload to sub-agents, persist progress to `thoughts/`, and avoid `/compact` (so those files keep their value); at high pressure they pause the session so you can hand off to a fresh one instead of grinding on in a degraded context.
- `thoughts/` validation (`validate-thoughts.py` on `Write|Edit`): keeps thoughts-directory writes well-formed.
- Plan checkbox tracking (`plan_checkbox_*.py`): keeps plan progress in sync during `implement`.
Window detection. The warnings scale to the model's real context window. Large-window models (Opus 4.6+, Sonnet 4.x) are treated as 1M; Haiku, Opus 4.5 and older, and non-Claude models as 200k. Claude Code strips the [1m] variant from everything it writes to disk, so detection layers model family + env signals + observed peak usage (>200k proves a 1M window) — see comments in hooks/context_state.py.
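To make the layering concrete, here is a minimal sketch of how such a detection could be ordered. It is not the code in hooks/context_state.py: the function name, its parameters, the `env_hint_1m` flag, and the order in which the signals are consulted are all assumptions made for illustration.

```python
from typing import Optional, Tuple

WINDOW_1M = 1_000_000
WINDOW_200K = 200_000


def detect_window(
    family: str,
    version: Optional[Tuple[int, int]],
    env_hint_1m: bool = False,
    observed_peak_tokens: int = 0,
) -> int:
    """Guess the context window size from layered signals (hypothetical helper).

    family: lowercase model family such as "opus", "sonnet", "haiku";
            anything else is treated as non-Claude.
    version: (major, minor) if known, else None.
    env_hint_1m: stand-in for whatever environment signal marks a 1M session.
    observed_peak_tokens: highest context usage seen so far in the session.
    """
    # Usage above 200k is only possible on a 1M window, so it settles the question.
    if observed_peak_tokens > WINDOW_200K:
        return WINDOW_1M
    # Assumed: an explicit environment signal is trusted next.
    if env_hint_1m:
        return WINDOW_1M
    # Fall back to the model family rules described above.
    name = family.lower()
    if name == "sonnet" and version is not None and version[0] == 4:
        return WINDOW_1M
    if name == "opus" and version is not None and version >= (4, 6):
        return WINDOW_1M
    # Haiku, Opus 4.5 and older, and non-Claude models.
    return WINDOW_200K
```

For example, `detect_window("haiku", (4, 5), observed_peak_tokens=250_000)` returns `1_000_000` under these assumptions, because observed usage overrides the family default.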
Thresholds — level × window
1M window (Opus 4.6+ / Sonnet 4.x — absolute token cutoffs):
| Level | Tokens | % of 1M |
|---|---|---|
| ok | 0 – 200k | ≤ 20% |
| warn | 200k – 350k | 20 – 35% |
| severe | 350k – 500k | 35 – 50% |
| yolo | 500k+ | > 50% |
200k window (Haiku / Opus 4.5 & older / non-Claude — percentage-based):
| Level | % of window | Tokens |
|---|---|---|
| ok | < 40% | 0 – 80k |
| warn | 40 – 60% | 80k – 120k |
| severe | > 60% | 120k+ |
| yolo | n/a | not reachable |
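The two threshold tables can be read as a single classification function. The sketch below is illustrative only: `pressure_level` is a hypothetical name, and the handling of exact boundary values (for example, whether 200k tokens on a 1M window is still `ok`) is an assumption taken from the `≤`, `<` and `>` signs in the tables, not from the hook source.

```python
from typing import Literal

Level = Literal["ok", "warn", "severe", "yolo"]

WINDOW_1M = 1_000_000


def pressure_level(tokens_used: int, window: int) -> Level:
    """Map context usage to a pressure level (hypothetical helper)."""
    if tokens_used < 0 or window <= 0:
        raise ValueError("tokens_used must be >= 0 and window must be > 0")

    if window >= WINDOW_1M:
        # 1M window: absolute token cutoffs.
        if tokens_used <= 200_000:
            return "ok"
        if tokens_used <= 350_000:
            return "warn"
        if tokens_used <= 500_000:
            return "severe"
        return "yolo"

    # 200k window: percentage-based, using integer math to avoid float rounding.
    if tokens_used * 100 < 40 * window:
        return "ok"
    if tokens_used * 100 <= 60 * window:
        return "warn"
    # "yolo" is not reachable on the smaller window.
    return "severe"
```

With these assumptions, `pressure_level(80_000, 200_000)` is `"warn"` (exactly 40%), while `pressure_level(200_000, 1_000_000)` is still `"ok"` (exactly 20%).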
What each level does
| Level | UserPromptSubmit (nudge) | Stop |
|---|---|---|
| ok | silent | allow |
| warn | heads-up: start offloading (sub-agents for research, persist to thoughts/, avoid /compact) | allow |
| severe | "stop next non-readonly action" + AskUserQuestion: continue / hand off / yolo | blocks until you choose |
| yolo (1M only) | strongly advises handing off to a fresh session | blocks |
Each level fires once per upward crossing (throttled per session). If you explicitly opt into "yolo this session", severe/yolo degrade to a silent one-line usage note — no pause. The 40% warn threshold on the 200k window is deliberately the same "~40% context" line called out in Problem 2.
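The "once per upward crossing" rule can be sketched as a small throttle. This is an assumption-heavy illustration, not the plugin's implementation: `CrossingThrottle` is a hypothetical name, state is kept in memory (real hooks run as separate processes and would need to persist it), and it assumes a level never fires again in a session once it, or a higher level, has fired.

```python
from typing import Dict

# Ordered from lowest to highest pressure.
LEVELS = ("ok", "warn", "severe", "yolo")


class CrossingThrottle:
    """Fire each pressure level at most once per session, on the way up."""

    def __init__(self) -> None:
        # session_id -> rank of the highest level that has already fired.
        self._highest_fired: Dict[str, int] = {}

    def should_fire(self, session_id: str, level: str) -> bool:
        rank = LEVELS.index(level)
        # Default rank 0 is "ok", which is always silent.
        highest = self._highest_fired.get(session_id, 0)
        if rank <= highest:
            return False
        self._highest_fired[session_id] = rank
        return True
```

In this sketch a session that goes `warn`, back to `ok`, then `warn` again is nudged only the first time; each session is tracked independently.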
The complete workflow chain:

    flowchart LR
        B[brainstorm]:::stage
        Q[question]:::aux
        R[research]:::stage
        P["plan<br/><i>or create-tdd-plan</i>"]:::stage
        I[implement]:::stage
        PR[run-phase]:::aux
        V[verify-plan]:::stage
        QA[qa]:::parallel
        SB[script-builder]:::aux
        L[learning]:::aux
        RV[review]:::aux
        B -.->|clear context| R
        R -.->|clear context| P
        P -.->|clear context| I
        I -.->|clear context| V
        B --> Q --> R
        R --> P
        P --> I
        I --> PR --> I
        I --> V
        P -.-> QA
        I -.-> QA
        QA -.->|findings| P
        I --- SB
        V -.->|revise| P
        RV -.->|critique| P
        classDef stage fill:#1f6feb,stroke:#0b3a8f,color:#fff,font-weight:bold;
        classDef aux fill:#eef2f7,stroke:#6b7280,color:#111;
        classDef parallel fill:#f59e0b,stroke:#92400e,color:#111;

Context control (the dotted "clear context" arrows): after each major stage (brainstorm, research, plan, implement), start a new Claude Code session before the next one. The thoughts/ file produced by each stage is the handoff — that's why every stage writes to disk. Empirically, context above ~40% yields noticeably worse results (see "Problem 2" above), so clearing between stages is a load-bearing part of the workflow, not an optimization.
Variants and helpers:
- `create-tdd-plan` is a drop-in variant of `create-plan` with strict Red-Green-Commit cycles; use it when you want TDD discipline baked into the phases.
- `qa` runs in parallel with `plan` and `implement`: start it alongside to produce functional test evidence while planning/implementation is in flight; findings feed back into the plan.
- `script-builder` is invoked inside `implement` (or anywhere you want durable validation) to turn throwaway bash into re-runnable PASS/FAIL scripts.
- `learning` is out-of-band: capture reusable knowledge whenever you notice a pattern worth keeping across runs.
- `review` can be invoked at any stage to critique a document before moving on.
- `question` is an optional one-shot shortcut before committing to full `research`.
Heavily inspired by HumanLayer and its GitHub repository humanlayer/humanlayer. Highly recommend checking it out!
Also check out the videos from DEX; here are some good ones to start with:
- Advanced Context Engineering for Agents
- 12-Factor Agents: Patterns of reliable LLM applications — Dex Horthy, HumanLayer
You should also check out the 12-factor agents repository.
MIT, some commands Apache 2.0 (check each file for details).