Skip to content

open-horizon-labs/skills

Repository files navigation

Open Horizons Skills

Skills that ground AI agents in strategic context: frame the problem before solving it, capture learning that persists, ship outcomes not just outputs.

Full framework docs or just install and start.

Installation

npx skills add open-horizon-labs/skills -g -a claude-code -y

Re-run the install command to pull the latest version.

The Framework

10 commands forming the language of strategic execution:

Setup (When starting or when aims shift)

Skill Description
/teach-oh Project setup. Explore codebase, ask about strategy/aims/reviewers/practices, write context to AGENTS.md

Grounding (Understand the problem)

Skill Description
/aim Clarify the outcome you want—a change in behavior, not a feature
/problem-space Map what we're optimizing and what constraints we treat as real
/problem-statement Define the framing. Change the statement, change the solution space

Execution (Build & ship)

Skill Description
/solution-space Explore candidate solutions. Band-aid, optimize, reframe, or redesign?
/execute Do the work. Pre-flight, build, detect drift, salvage if needed
/ship Deliver to users. Optimize the path from code to working install

Reflection (Capture learning)

Skill Description
/review Check work and detect drift. A second opinion before committing, with code, writing, and learning lenses
/dissent Devil's advocate. Seek contrary evidence before locking in
/salvage Extract learning before restarting. Code is a draft; learning is the asset
/distill Curate accumulated metis across sessions. Surface patterns, promote to guardrails, compact noise

The Loop

Intent (aim, problem-space, problem-statement)
    │
    ▼
Execution (solution-space, execute, ship)
    │
    ▼
Review (review, dissent)
    │
    ▼ (if drift detected)
SALVAGE ──► back to Problem Space with new understanding

Periodically (between sessions):
DISTILL ──► curate accumulated metis, promote patterns to guardrails

Workflow Contracts

The workflow is strongest when each phase preserves one kind of shared meaning and leaves explicit fields the next phase can reuse rather than re-inferring them from prose. In practice, the carry-forward contract is:

  • /teach-oh preserves project-specific reorientation norms across sessions
  • /aim preserves intent: behavior change, why it matters, mechanism-as-hypothesis, assumptions, feedback signal, guardrails, and misunderstanding evidence
  • /problem-space preserves the situation map: terrain, real constraints, hidden assumptions, frame-stress signals, and open questions
  • /problem-statement preserves the frame: plausible working story, constraints, assumptions, pressure test, and invalidation signal
  • /solution-space preserves optionality: materially different candidates, interpretive variety, accepted trade-offs, and adversarial risk-retirement checks
  • /execute preserves execution boundaries under pressure: success criteria, constraints, drift handling, adversarial checks that would fail the tempting local patch, and decisions that need review, dissent, or human judgment
  • /ship preserves contact with reality: target-environment evidence and learning routed back to aim, problem-space, guardrails, or metis
  • /review preserves independent judgment: result vs. aim, contract drift, incomplete risk retirement, and frame collapse routed separately from ordinary implementation drift
  • /dissent preserves contested meaning: contrary evidence, stress-tested story, and a shared decision point
  • /salvage preserves learning from failure: frame shifts, ownership/coordination breakdowns, and durable metis
  • /distill preserves metis quality: patterns promoted by evidence, mythology compacted or discarded

Outcome alignment: strategic grounding and productive friction come from preserving intent, constraints, assumptions, optionality, and dissent before momentum hardens; redesign over recurring patches comes from frame-stress, frame-collapse, and salvage routes that expose broken interpretations instead of patching symptoms; persistent learning across sessions comes from teach-oh, salvage, and distill carrying situated metis, role boundaries, and guardrails forward. Across the tree, each artifact should satisfy necessity, viability, sufficiency, and connectedness: why this exists, why this path can work, why it is enough for now, and how it connects to the level above and below.

Every step should pair a strategy (what and why) with a tactic (how). Tactics that do not ladder back to a parent objective are busy work; objectives without tactics are wishful thinking.

Phase shifts should be treated as commitment decisions, not momentum. Moving from exploration to execution means stating why you're ready to deepen, what would invalidate the current direction, and what should trigger a stop or pivot.

Context should privilege situated metis over generic advice. Persist the non-obvious lessons, constraints, and local patterns your experience adds; foundational-model knowledge that adds no local leverage is noise and pollutes the context space.

Skill prompts should stay hot-path sized. Move conditional detail into JIT references and load it only when it changes the decision or output quality. More context is not automatically more clarity.

Contract shape is executable. When changing skill output templates, session handoff text, or generated agent copies, run:

python3 scripts/validate_skill_contracts.py

The validator guards against the CodeRabbit class of failures: emitted section anchors drifting from persisted session anchors, named obligations disappearing between phases, and generated agent prompts losing the source skill contract.

Behavioral Evidence

We test this workflow against a companion sensemaking benchmark: small implementation tasks where visible tests pass easily, but hidden checks catch the model stopping at the first plausible fix. The point is not to prove universal intelligence. The point is to ask a practical question: does the same model/runner behave better when the harness forces aim, framing, solution comparison, and adversarial risk retirement?

The companion bench is open-horizon-labs/sensemaking-bench; the calibrated comparison used dashboard-cache-trap, notification-third-flag, and timeout-symptom-trap.

On the calibrated baseline-failure set, the answer was yes. In the initial same-runner/same-model comparison, hidden pass rate moved from 1/9 without skills to 7/9 with the current adversarial risk-retirement flow.

Condition Hidden pass What changed
No skills 1/9 Agent usually made visible tests pass and stopped
Risk-routing skills 2/9 Agent named risks, but could still retire them with weak evidence
Adversarial risk-retirement skills 7/9 Agent had to name the tempting wrong patch and add checks that would fail it

Behaviorally, a skilled harness outperformed the unskilled one by changing the stopping condition: not "visible tests pass," but "visible tests pass, model-checkable risks are retired by checks that would fail the tempting shortcut, and remaining risks are explicitly out of model-checkable scope."

The strongest gains came from tasks where the user request pointed at a symptom: duplicate notifications and timeout increases. The remaining weakness was narrower: dashboard payload-shape tasks still failed in some runs when agents removed obvious internal fields but did not derive the exact public field allowlist.

Scope of the claim: the benchmark result is evidence for the high-reasoning model/runner used in the trials. The result JSON did not record a public model label, so this README intentionally does not claim universal performance across every model or harness.

What This Workflow Cannot Do

This workflow improves reasoning discipline: it makes aims explicit, surfaces assumptions, adds friction before commitment, and preserves context across phases. It does not make model reasoning inspectable, force logical soundness, or eliminate the gap between rhetorical and logical completeness. External verification, human judgment, and independent challenge remain necessary. The workflow makes those things easier to practice — it does not replace them.

Adaptive Skills

Each skill works at multiple levels:

  1. Base - Works with just the prompt (no dependencies)
  2. With .oh/ session - Reads/writes .oh/<session>.md for context handoff between skills
  3. With RNA MCP - Semantic code search, graph traversal, outcome-to-code joins, and business context. Skills use RNA tools automatically when available.
  4. With OH MCP - Full integration with Open Horizons organizational graph

Session Persistence

Skills share context via session files. Provide a name explicitly or accept the suggested one (derived from git branch):

/aim auth-refactor           # Creates .oh/auth-refactor.md
/problem-statement auth-refactor  # Reads aim, writes problem statement
/solution-space auth-refactor     # Reads aim + problem statement, writes solution

Phase-Aware Hook (OMP)

For oh-my-pi users, an optional hook detects where you are in the development cycle and suggests the right skill.

How it works:

  1. Reads your .oh/ session files to detect completed phases (state — primary signal)
  2. Scans your prompt for intent signals (enrichment — used when state is ambiguous)
  3. Injects a <oh-phase-context> block recommending the next skill
  4. Deduplicates — only injects when the recommendation changes

Install manually:

mkdir -p .omp/hooks
cp hooks-omp/oh-skills-phase.ts .omp/hooks/

Or via /teach-oh: offered during project setup, including .oh/skills-config.json customization.

Configuration (.oh/skills-config.json, optional):

{
  "projectSkills": ["aim", "solution-space", "execute", "review", "dissent"],
  "disabledSkills": [],
  "phaseOverrides": {
    "execute": ["dissent"]
  }
}
  • projectSkills — limit which skills get suggested (default: all)
  • disabledSkills — skills to never suggest
  • phaseOverrides — extra skills to suggest during specific phases

Config is loaded once at session start. Changes require restarting OMP.

The hook is entirely optional. Skills work the same with or without it.

Note: This hook requires the OMP hook API (@oh-my-pi/pi-coding-agent). It is a TypeScript module that OMP loads at runtime — it cannot be compiled or tested independently within this repo. It lives here alongside the skills it serves, but its runtime home is .omp/hooks/.

Phase Agents (OMP)

Pre-built agent wrappers in agents-omp/ give each phase its own isolated context and scoped tools. Install via /teach-oh or manually copy to .omp/agents/:

mkdir -p .omp/agents
cp agents-omp/oh-*.md .omp/agents/

When agents are installed, the phase-aware hook automatically switches from suggesting /skill commands to suggesting agent dispatch.

Learn More

License

MIT

About

Agent skills for the Open Horizons strategic execution framework

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors