Threat model: why coily exists

coily is a CLI security boundary for privileged ops, escape-hatch-resistant and with an audit trail.

That's the goal in one sentence. The rest of this document is the rationale behind each of those three properties and the design guardrails that preserve them.

Reporting a vulnerability

Found a hole in the boundary? That is exactly the thing this project most wants to hear about.

Public - open an issue on this repo.
Private - use GitHub's private vulnerability reporting on this repo (the Security tab, "Report a vulnerability").

coily is a personal project with a single maintainer, so there is no formal SLA. Confirmed findings get written up in the "Known boundary holes" section below.

coily does not accept external pull requests, including fixes for security issues - see CONTRIBUTING.md. Report the finding; the maintainer applies the fix. A security boundary that takes patches from outside the boundary is not one.

Framing

The unlock for lights-out engineering is not the cleverness of the agent. It is the clarity of the boundaries inside which the agent can work. Every wrapped tool validates argv unconditionally. Every denial names the recovery command. Every audit row carries a session id. The agent feels slower for two days, then suddenly the failure modes are all things humans can read and route, instead of opaque retries.

Mike Piccolo, "The Harness Is the Backend" (iii, 2026-04-28), puts the same idea from the opposite direction:

The model isn't the product. The infrastructure is.

coily is the infrastructure on the privileged-ops side of that line. The boundary is the product.

This document was written during a Claude Code session on 2026-04-21 after a real incident. A manually-edited ClusterSecretStore had drifted from source, silently broke ExternalSecret syncing for the entire cluster, and was only noticed when pods started hitting CreateContainerConfigError. The incident itself was benign drift, but it highlighted a broader question. What stops an AI agent (or an attacker via prompt injection against an AI agent) from doing something genuinely destructive?

Threat model

The agent in question is Claude Code running locally on Kai's laptop, acting against:

The k3s cluster on kai-server (via kubectl over tailscale)
AWS (via aws CLI, credentials in ~/.aws/)
Tailscale, GitHub, and whatever else shells out through gh/git/ssh

The agent is not assumed hostile. It is assumed to be potentially jailbreakable via prompt injection. Any file it reads, any tool output it processes, any issue body it fetches could contain instructions that steer it into destructive actions. The defensive goal is not to stop a dedicated operator who has direct shell access. It is to bound the blast radius of a confused or prompt-injected agent so that the worst realistic outcome is "session got stuck" rather than "namespace got deleted."

Why Claude Code's built-in permissions aren't enough

Claude Code supports allow / deny rules on tool invocations, matched by command prefix. Kai's user settings currently deny kubectl delete:*, kubectl apply:*, and the rest of the write-family kubectl verbs. See ~/.claude/settings.json. This is a useful guardrail but not a moat, for four reasons.

1. Prefix matching is easy to smuggle past

A deny: "Bash(kubectl delete:*)" rule only matches commands whose Bash invocation starts with kubectl delete. All of the following sneak past:

sh -c "kubectl delete pod foo"
bash -c "kubectl delete pod foo"
echo "kubectl delete pod foo" | sh
sh <<'EOF'
kubectl delete pod foo
EOF
env kubectl delete pod foo
xargs -I{} sh -c "{}" <<< "kubectl delete pod foo"
find . -maxdepth 0 -exec kubectl delete pod foo \;
python -c "import subprocess; subprocess.run(['kubectl','delete','pod','foo'])"
ruby -e "%x(kubectl delete pod foo)"
perl -e 'system("kubectl","delete","pod","foo")'
go run ./throwaway.go   # where throwaway.go shells out
make delete             # where the Makefile target shells out

Plus any Node or Deno one-liner that spawns a subprocess. Every one of these starts with a different prefix. Adding more deny rules chases symptoms. Denylists are structurally unwinnable against a flexible execution environment.

2. Edits have consequences

The Edit/Write tools aren't covered by bash permission rules at all. An agent that can write files can still cause mutation by:

Editing a file that cron / launchd / systemd / a file watcher executes
Dropping a post-receive or pre-commit git hook
Modifying a Makefile or deploy script that a human later runs
Writing into .claude/, CLAUDE.md, or a plugin source, steering future agent sessions
Editing coily itself, if it lives in a writable checkout

Permissions don't sandbox the consequences of edits, only the edits themselves.

3. MCP tools bypass Bash entirely

Loaded MCP servers expose tools that call remote APIs directly without going through the shell. In this environment today, the aws-eks MCP server exposes manage_k8s_resource, apply_yaml, and add_inline_policy, each capable of mutating a cluster without invoking kubectl. Any Bash(kubectl ...) deny rule is irrelevant to those tools. They need their own deniedMcpServers or mcp__aws-eks__* rules, and those have to be maintained separately as MCP servers come and go.

4. Subagents have independent permission evaluation

When the main agent calls the Agent tool, a new Claude process starts. It reads the same settings.json files, so the existing deny rules do apply to it. But:

Subagents have independent context windows. The user only sees the final summary. Individual tool calls aren't in the main transcript.
Subagents run against prompts written by the main agent, not directly by the user. A prompt injection in a file the main agent delegates to a subagent for analysis ("read /tmp/report.md and summarize") now runs with less human oversight than direct tool calls.
Background subagents can run while the user is away from the terminal.

Denylist rules still bound the damage at the Bash layer, but the agent has more attempts and less scrutiny.

The allowlist inversion

The structural fix is to invert the list. Instead of enumerating every way to run a dangerous primitive (impossible), enumerate the small set of safe operations and block everything else.

Deny broadly. Every execution-family CLI. The scripting interpreters (python, ruby, perl, deno, node), go run, the shells (sh, bash, zsh, plus Windows cmd / powershell / pwsh), Windows scripting hosts and LOLBAS binaries (wscript, cscript, mshta, rundll32, regsvr32), build runners (make, package manager scripts), and kubectl, helm, terraform, aws, gcloud, gh, docker wholesale. Also deny broad ssh except to named hosts. The non-Bash PowerShell tool that Claude Code exposes on Windows is denied wholesale, the same way MCP tools are, since it bypasses the Bash prefix matcher entirely.
Allow narrowly. coily *, plus read-only utilities (ls, grep, cat, git log/diff/status), plus editing within explicit directories. The previous design enumerated read-verb allows for aws, kubectl, and gh (~170 lines) because Claude Code's Bash(prefix:*) syntax cannot pattern-match aws * describe-*. The current design inverts that: bare invocation of those binaries is denied entirely, and every call (read or write) routes through coily ops <bin>. The audit log becomes the system-of-record for the privileged-op surface, and adding a new safe shape never requires editing the lockdown file.

Then coily becomes the kernel boundary. Because it's a Go binary the agent cannot edit at runtime (installed from a separate build, not run from source), and because it takes structured flag arguments rather than raw strings, it can:

Validate inputs against a whitelist of operations
Reject shell metacharacters ($, backticks, ;, &&, ||, |, >, <, $(, ${, \) in any string argument that might reach a shell
Append every invocation to an audit log (append-only, outside the working tree)
Re-check permissions against its own config file, loaded from a root-owned path the agent cannot modify

This isn't bulletproof. A dedicated attacker with shell access still has shell access. A bug in coily itself could widen the boundary. But it raises the cost of a successful prompt-injection attack from "know that kubectl delete is dangerous" to "know Kai's specific coily command surface and find a path through it." That's a meaningful shift in attacker capability required.

What about confirmation tokens?

An earlier draft of coily required a short-lived HMAC token (issued via coily auth issue, consumed by destructive verbs) on every mutating op. The idea was to force a human-initiated issuance step before any destructive work. That design was removed on 2026-04-24 because it added ritual without adding security:

The HMAC key lived under ~/.coily/, readable by the coily process user. An agent could forge tokens directly, never mind calling auth issue.
coily auth issue itself was on the allowlist (it had to be, so Kai could invoke it). An agent could call it and consume the output in the same session.
The token scope check ran inside the same binary the agent was invoking. No out-of-band signer, no separate process.

So the token was a fourth fence made of paper, sitting behind three real ones: the Claude Code deny list, the coily allowlist, and the audit log. Agents could self-authorize trivially, so what the token actually gated was "Kai remembers to type a command first", which is not a security property.

If a genuinely out-of-band confirmation gate is needed later - a yubikey touch, a phone push, a signer running on kai-server that the laptop-agent can't reach - that would have teeth. The previous HMAC design did not, and the ritual was friction without benefit. The allowlist, argv validation, audit log, and Claude Code deny rules are the real fences.

Design guardrails for coily

Principles to preserve as features get added.

Installed as a built binary, not invoked from a writable checkout. The binary lives somewhere root-owned (e.g. /usr/local/bin/coily on unix, C:\Program Files\coily\coily.exe on Windows - admin-write-only by default ACL, same boundary property as a root-owned unix path). The source checkout is just for development.
Dev builds use a distinct binary name (coily-dev). Produced only inside the source checkout via make dev. The agent's allowlist trusts coily, not coily-dev. The actual security value is narrow: ./bin/coily-dev is not on $PATH, so the agent never finds it anyway, and the workspace deny list already blocks the go run path that would invoke the dev source tree. The rename catches one specific footgun (Kai accidentally cp ./bin/coily-dev /usr/local/bin/coily) and gives make dev something to do that doesn't shadow the installed binary on $PATH. Production builds use -tags prod which compiles out any dev-mode conveniences.
Config defaults are baked into the Go binary, with optional overlays at ~/.coily/config.yaml (global) and ./.coily/config.yaml (per-repo). Earlier coily embedded config.yaml via //go:embed and pitched that as a security boundary; that claim did not survive scrutiny. An attacker with write to /etc/coily/ already has root and could replace the binary outright, so embed-vs-disk does not raise the bar. The user-writable overlays are already loaded for the parts of config that matter, so the "agent edits config" path is open for the values that actually shape behavior. Non-secret defaults (tailscale hostnames, audit rotation knobs) live in cli-guard/config/config.go. Secrets come from existing credential stores (~/.aws/ profile, ~/.kube/config), not from any config file.
Sub-tool binaries (aws, kubectl, gh, tailscale, etc.) are resolved via $PATH. An earlier version of coily pinned each by sha256 in an embedded manifest, fetched them on first use from a GitHub Release, and bypassed $PATH entirely. That machinery was removed: the threat it addressed (an attacker with write to a $PATH directory but not to $HOME, where the cache and the coily binary itself live) was a narrow slice that did not justify the release pipeline + manifest + per-tool refresh cadence. The actual safety boundary is argv validation (no shell metacharacters reach the subprocess), the audit log (every invocation is recorded), and the lockdown deny list (raw aws / kubectl / gh are denied at the Bash-tool layer so the agent can only reach them via coily). Binary authenticity below that is the host's problem - the same problem brew and apt and every other package manager have. ssh is still wired through cli-guard/ssh (golang.org/x/crypto/ssh, host keys verified against ~/.ssh/known_hosts) rather than the ssh binary, because the SDK is cheap and avoids the argv-to-remote-shell concern entirely.
Structured args only. No subcommand takes a free-form string that is later passed to a shell. If a shell-out is absolutely necessary, the Go code uses an explicit argv list, never a composed shell string.
Allowlist at the verb level. coily k8s restart <deployment> exists. coily k8s exec does not. If a new verb is needed, it's a code change in coily, reviewed, committed, built, installed. That review step is the human gate.
Append-only audit log outside the working tree (e.g. /var/log/coily/audit.jsonl), writable by the coily process user only. Every invocation logged with timestamp, argv, effective verb, exit code.
No coily shell / coily run escape hatch, ever. The moment one exists, the whole boundary collapses. The remote-shell story shifted in coily#187: the per-verb coily ssh systemctl / coily ssh deploy / etc. wrappers got deleted in step 8, replaced by a single free-form passthrough (coily ssh <alias> -- <coily args>). The boundary that holds is that the remote coily's lockdown gates the action, not the local wrapper. The local passthrough does no verb filtering. There is no coily ops kubectl exec pass-through either: the lockdown deny-list catches the exec form before coily sees it.
Repo verbs may opt out of the shell-metacharacter validator per verb. A .coily/coily.yaml command can declare allow_metacharacters: true, which skips policy.ValidateArg on the YAML argv tokens at load time and sets verb.Spec.SkipPolicy so user-supplied extras also bypass the check at invocation. The opt-in is declarative (lives in the committed yaml, visible in PR diffs), bounded per verb (no global flag), and the audit row stamps policy_skipped: true so forensics can see that the row ran under the relaxed policy. The validator stays correct by default for built-in verbs whose argv can reach a remote shell. User exec verbs invoke a binary directly via execve, so the metacharacter check was overcautious for that surface unless the wrapped binary itself shells out (in which case the repo author leaves the opt-in off). Runtime verification: TestBuildChildRepoCommand_AllowMetacharactersStampsAudit in cmd/coily/ops_repo_test.go drives both the bypass and the audit-row stamp. Scoping rationale: coilysiren/coily#283 / coilysiren/cli-guard#81.
.coily/coily.yaml repo verbs require the declaring file committed and the branch synced. A repo verb refuses when the declaring .coily/coily.yaml is among the dirty paths, when HEAD is detached, when the branch has no upstream, or when the branch is behind upstream (ahead is allowed). Without this gate, a repo verb's argv could be whatever a local edit said at that moment, and the audit row would not be reconstructable from git log alone. Working-tree dirt outside coily.yaml does not refuse - the committed coily.yaml plus HEAD still reconstruct the verb invocation - but the porcelain status is still captured on the audit row for forensics. The --audit-override-dirty flag bypasses the gate for genuine emergencies and tags the audit row with audit_override: true. Built-in verbs are unaffected: their behavior is baked into the homebrew-released binary and reproducible from the version trailer in the audit row. Runtime verification: TestSecurityClaim_repo_verbs_require_clean_tree in cmd/coily/security_claims_test.go drives the gate end-to-end. Scoping rationale: #211.
Every audit row binds to a real repo, no opt-out. The --commit-scope flag (or $COILY_COMMIT_SCOPE) is required and cannot be set to -, none, or off (ErrOptOutRejected). Default auto resolves to git rev-parse --show-toplevel of cwd. Verbs that genuinely have no repo to bind to set verb.Spec.SkipScope = true at the definition site, so the choice is auditable from source rather than papered over by a per-call flag. The provenance contract is what makes the Audit-log: commit trailers a chain of trust instead of a soft hint.

Open questions

How does coily distinguish "Kai at the keyboard" from "Claude in auto mode"? Probably it doesn't need to. Log review happens after the fact regardless, and the allowlist already bounds what either can do.
How do we handle the aws-eks MCP server and other direct-API MCP tools? Resolved 2026-04-21. Kai doesn't use aws-eks and is removing it from ~/.claude.json. A removal script is checked in at scripts/remove-aws-eks-mcp.sh. Long-term, expose the operations Kai actually needs through coily and keep the MCP removed.
Interaction with subagents. If coily logs include the session ID, we can at least correlate destructive invocations back to specific agent runs for forensics.
The Agent permission rule is currently in allow. That's fine. Subagents inherit the same deny list, so allowing the tool itself doesn't widen the attack surface. Worth revisiting if specific subagents turn out to bypass rules.
An out-of-band confirmation gate for destructive ops (yubikey, phone push, remote signer on kai-server) would be a real fence rather than the ritual the old HMAC design provided. Not built; revisit if the allowlist surface grows beyond what's comfortable.

TODO: adversarially-reviewed CI self-update (v2)

The v1 distribution story is "laptop builds, scps to kai-server, sudo installs." That's safe but a big usability hit. The biggest usability hit of the whole design. The v2 plan brings back coily self-update without reopening the trust boundary.

Shape:

Write new verb in the coily repo on the laptop. Commit, push.
Adversarial review at the commit gate before anything merges to main. A second-opinion agent (Gemini via gemini CLI) reads the diff cold, with no access to the writer's context window, and either approves or blocks. Only approved diffs flow to CI. The reviewer has no ability to modify code, only approve or reject. A prompt injection that steered the writer does not steer the reviewer because the reviewer's context is independent.
CI (GitHub Actions) builds + tests + cross-compiles (darwin-arm64, linux-arm64). CI signs the binaries with a private key held only in GitHub Secrets.
Signed binaries published to a known location (GH Releases or a private S3 bucket).
coily self-update fetches latest, verifies the signature against a public key compiled into the running binary, and swaps itself in. Install fails if the signature doesn't verify. Cosign is the usual tool for this.

The adversarial reviewer is the piece that actually defends against prompt injection. Binary signing closes "malicious binary at the distribution URL" even if the storage bucket is compromised, but does nothing against "bad diff written and approved by the same writer." The two mechanisms compose.

Status as of 2026-05-19: Kai wants this but is not setting up gemini today. Revisit once the v1 command surface is useful day-to-day.

Reference: current Claude Code deny rules

The canonical list lives in cli-guard/lockdown/defaults.yaml and is embedded into the binary at build time. coily lockdown --apply writes it into a repo's .claude/settings.json.

Allow (top of the list): coily *, plus the read-only utilities (ls, cat, head, tail, wc, file, stat, grep, rg, find, tree), plus the read-only git shapes (status, log, diff, show, blame, branch, remote, ls-files, rev-parse, config --get).

Deny (privileged-op binaries, wholesale): aws, kubectl, gh, docker, tailscale, helm, terraform, gcloud, tflint, tfsec, plus broad ssh / scp / rsync. Every call routes through coily ops <bin> (for aws / kubectl / gh) or the matching coily verb. Plus the execution-family blocks (scripting interpreters, shells, Windows scripting hosts, build runners, package managers, the non-Bash PowerShell tool).

The first fence is the ~/.claude/settings.json deny list. coily is the second fence: argv validation + audit + (for kubectl) context routing. The goal is defense in depth, not either/or.

Known boundary holes

Real findings that bound where the boundary actually enforces. New entries land here when the prose-vs-runtime gap is discovered.

The user-level coily-binary-gate hook fires for every Bash PreToolUse event, including cron-spawned local-agent sessions. The hook entry written by coily setup registers an unconditional matcher: "Bash" PreToolUse hook with no transcript_path or time-of-day skip. The companion UserPromptSubmit hook bypass (added 2026-05-08 to unblock cron-fired routines) does not extend to this gate by design. Runtime verification: TestSecurityClaim_UserBinaryGateUnconditional in cmd/coily/security_claims_test.go asserts the settings.json entry has no conditional skip and that RenderUserHookScript itself does not early-return on a transcript_path shape. Issue #66.
Claude Desktop on Windows does not enforce the Bash deny list. Verified 2026-04-23 on Claude Code v2.1.119. Identical repo and .claude/settings.json produce different behavior depending on host. The CLI (claude in Git Bash) honors Bash(python:*) denies; Claude Code inside Claude Desktop (MSIX-packaged agent mode) shows the deny rule loaded under /permissions but runs the Bash tool without consulting it. PowerShell denies still fire in both hosts because they use a different matcher. Operational implication: lockdown is CLI-only enforcement for Bash rules. Agent sessions running from Claude Desktop on Windows effectively run with Bash permissions wide open. Prefer the CLI for any agent work that depends on lockdown for safety.

Anti-signals

Phrases that survived previous design rounds because nobody tested them. Codified here so the next round is faster.

"It's a security boundary because it's plumbed through the gate" is false unless the gate itself is verified. Plumbing-through is a property of users of the gate, not a property of the gate. A new feature that calls policy.ValidateArg does not become part of the boundary just by virtue of the call.
"X is an off-host shadow" is false unless X carries the full record. A summary stream (verb + counts + exit code) is a detection signal, not a shadow. The two have different forensic properties: a shadow lets you reconstruct what happened, a detection signal lets you know that something happened. Don't conflate them in prose.
"Drop the feature, then build the replacement" inverts the right order. If the dropped feature was on the boundary, the boundary degrades during the gap. Build (or accept the loss of) the replacement first, then drop. Otherwise the security claim regresses for every day of the gap.
Doc claims must match runtime artifacts. The security prose in this file describes runtime properties. When prose and runtime drift (a feature shipping less than the prose says), the boundary description silently overstates what's enforced. The TestSecurityClaims test in test/security_claims_test.go walks each load-bearing claim in this file and asserts it against the actual runtime, so prose-runtime drift surfaces as a test failure rather than as a forensic surprise.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Threat model: why coily exists

Reporting a vulnerability

Framing

Threat model

Why Claude Code's built-in permissions aren't enough

1. Prefix matching is easy to smuggle past

2. Edits have consequences

3. MCP tools bypass Bash entirely

4. Subagents have independent permission evaluation

The allowlist inversion

What about confirmation tokens?

Design guardrails for coily

Open questions

TODO: adversarially-reviewed CI self-update (v2)

Reference: current Claude Code deny rules

Known boundary holes

Anti-signals

FilesExpand file tree

SECURITY.md

Latest commit

History

SECURITY.md

File metadata and controls

Threat model: why coily exists

Reporting a vulnerability

Framing

Threat model

Why Claude Code's built-in permissions aren't enough

1. Prefix matching is easy to smuggle past

2. Edits have consequences

3. MCP tools bypass Bash entirely

4. Subagents have independent permission evaluation

The allowlist inversion

What about confirmation tokens?

Design guardrails for coily

Open questions

TODO: adversarially-reviewed CI self-update (v2)

Reference: current Claude Code deny rules

Known boundary holes

Anti-signals