Skip to content

refactor(mediation): remove keychain_access flag in favour of explicit fs_read_file paths#39

Open
christine-at-datadog wants to merge 19 commits into
kipz:developfrom
christine-at-datadog:christine.le/remove-keychain-access-flag
Open

refactor(mediation): remove keychain_access flag in favour of explicit fs_read_file paths#39
christine-at-datadog wants to merge 19 commits into
kipz:developfrom
christine-at-datadog:christine.le/remove-keychain-access-flag

Conversation

@christine-at-datadog

@christine-at-datadog christine-at-datadog commented May 14, 2026

Copy link
Copy Markdown
Collaborator

Summary

Removes the keychain_access: bool shorthand from CommandSandbox.

It was added to avoid repeating the Keychain DB paths in every profile, but having them explicitly in fs_read_file is easier to read alongside the rest of the allow list — and the behaviour is identical. has_explicit_keychain_db_access() in sandbox/macos.rs already detects the paths from the capability set directly, so no Seatbelt logic changes.

Profiles using keychain_access: true should switch to:

"fs_read_file": [
  "~/Library/Keychains/login.keychain-db",
  "~/Library/Keychains/metadata.keychain-db"
]

Related

Companion profile update: DataDog/shadowfax#223

kipz and others added 19 commits May 13, 2026 15:22
Introduces a mediation framework that intercepts command execution
within sandboxed sessions, requiring admin approval for sensitive
operations. Includes audit trail logging, Unix socket control server,
and per-command sandbox shims.

Signed-off-by: James Carnegie <me@kipz.org>
Adds a macOS menu bar application that discovers active nono sessions
and provides a UI for reviewing and approving/denying mediated commands.

Signed-off-by: James Carnegie <me@kipz.org>
Signed-off-by: James Carnegie <me@kipz.org>
Commands like `gh` authenticate via macOS Keychain using mach-lookup IPC
to securityd. The per-command Seatbelt sandbox blocks these mach-lookups
by default, causing 401 auth failures when gh tries to retrieve its
stored GitHub token.

Add `keychain_access: bool` to CommandSandbox. When true, the per-command
sandbox grants read access to keychain DB files (login.keychain-db,
metadata.keychain-db), which triggers the existing mach-lookup deny skip
in the Seatbelt profile generator. This allows commands to authenticate
via keychain while keeping all other sandbox restrictions (filesystem,
network allowed_hosts) intact.

Also documents that the Approve action intentionally runs without a
per-command sandbox (None) because Seatbelt blocks keychain mach-lookup
even with keychain file grants insufficient for all auth paths.

Signed-off-by: Christine Le <christine.le@datadoghq.com>

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
After the explicit pass, `profile::expand_vars` now resolves any remaining
`$VAR` / `${VAR}` tokens (uppercase+underscore+digit names) from the
process environment. Unset vars are left literal, matching the existing
`$XDG_RUNTIME_DIR` fallback so downstream `add_sandbox_*` helpers log
"does not exist, skipping" rather than failing the session.

Also switches per-command mediation sandbox paths from `expand_home`
to `expand_vars` so `$WORKDIR`, `$HOME`, XDG vars, and any launch-time
env var (e.g. a caller-provided `$GIT_ROOT`) resolve consistently in
both the main and per-command sandboxes. `SessionCtx` gains a `workdir`
field threaded from `execution_runtime` through `session::setup` and
`server::run` so per-command expansion uses the same workdir as the
rest of nono.
`expand_vars` already resolved $VAR / ${VAR} tokens in top-level
filesystem paths, policy.* paths, command_args, and per-command
sandbox paths. Extend the same expansion to the remaining
user-authored string fields in the profile:

- `mediation.commands[].intercept[].args_prefix` — the intercept
  matcher compared each entry literally against the incoming argv,
  so `"$USER"` used to be a dead string. Expanding at profile-load
  time lets authors write session-aware matchers (e.g. the macOS
  Keychain `security find-generic-password <user> ...` rule) without
  install-time `sed` substitutions over the profile file.
- `mediation.commands[].binary_path` — consumed verbatim by `PathBuf::from`,
  now expanded so profiles can point at user-specific binaries like
  `$HOME/.local/bin/tool`.
- `network.custom_credentials[].{tls_ca,tls_client_cert,tls_client_key}` —
  previously used the narrower `policy::expand_path` which only handled
  `~`, `$HOME`, `$TMPDIR`. Switched to the full `expand_str` so any
  configured env var (e.g. `$XDG_CONFIG_HOME`) resolves.

Introduces `profile::expand_str` as the string-returning core of the
expansion pipeline. `expand_vars` now delegates to it. Threads the
session `workdir` through `resolve_credentials`, `build_proxy_config_from_flags`,
`start_proxy_runtime`, and `resolve_command` so `$WORKDIR` resolves
consistently across all expansion sites.
Stamps each audit.jsonl entry with session_id, session_name, nono_pid,
sandboxed_pid, and command_pid so operators can correlate commands across
a session and trace the full process hierarchy.

Process hierarchy per log entry:
  nono_pid       — the nono supervisor (unsandboxed parent)
  sandboxed_pid  — the direct child process nono sandboxed (e.g. claude, codex)
  command_pid    — the shim process that ran the specific command (e.g. echo, git)

The session_id/session_name are pre-generated in execution_runtime before
mediation setup so audit.jsonl and the session record share the same values.
sandboxed_pid is resolved after fork via an Arc<OnceLock<u32>> latch shared
between the mediation server and the on_fork callback.

ShimRequest gains a pid field so the mediated path can record command_pid.
The shim's AuditEvent also gains command_pid for the audit-only datagram path.
All new AuditEvent fields use #[serde(default)] for backward compatibility
with older shims that do not send them.

Signed-off-by: Christine Le <christine.le@datadoghq.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Mediated passthrough previously buffered stdio: the shim read stdin with
a 50ms timeout into a UTF-8 String, the server spawned the real binary
with Stdio::piped() + wait_with_output(), and the parent's ChildStdin
dropped before the child could read. ssh/git over a binary pipe hit
SIGPIPE; every long-running mediated command (gh, kubectl, ...) silently
buffered output until exit.

The shim now sends its stdin/stdout/stderr fds via SCM_RIGHTS after the
JSON request. The server passes them straight to the real binary via
Stdio::from(...) for no-intercept passthrough (and the allow_commands
sub-branch and admin_passthrough), then wait()s. Capture/Respond/Approve
drop the passed fds and keep the existing buffered behaviour so they can
still inspect or transform the output.

The shim's stdin field is removed from ShimRequest (so was the
read_stdin_nonblocking helper); shim and server are versioned together,
no wire compatibility is needed.

Tests: added a streaming socketpair harness, a binary-roundtrip test
covering 0xFF bytes through stdin/stdout, and a Respond-path test that
verifies the dropped fds let the test side see EOF.
… command

Adds an optional CallerPolicy on each CommandEntry:

- agent_allowed: bool (default true) — whether the primary sandbox
  (no NONO_SANDBOX_CONTEXT) may invoke this command.
- allowed_parents: Option<Vec<String>> — restrict which mediated parents
  may invoke. None (field absent) accepts any parent (existing
  behaviour). Some([]) blocks every mediated parent. Some(["git"])
  permits only the listed parents.

The gate fires at the top of apply(), before the existing
allow_commands skip-intercepts branch, so a parent allowed by
allowed_parents still flows through unchanged. Rejected calls
return exit 126 with action_type "denied".

Defaults preserve full backward compatibility: a profile with no
caller_policy field on a command keeps current behaviour (agent +
any parent allowed).

Use case: ssh / ssh-keygen in shadowfax — set agent_allowed=false,
allowed_parents=["git"] so a malicious prompt cannot invoke them
directly to authenticate to attacker-controlled hosts (or sign
arbitrary data) using the user's keys via the per-command sandbox.
Signed-off-by: James Carnegie <me@kipz.org>
Sending three separate sendmsg(SCM_RIGHTS) calls for stdin/stdout/stderr
fails with EMSGSIZE (os error 40) on macOS when the socket receive buffer
already holds a large JSON request — as happens when git is invoked from
within gh's execution sandbox, which adds proxy and session env vars.

Replace the three individual sends with a single sendmsg carrying all
three fds in one SCM_RIGHTS control message, and update recv_three_fds
to receive them in a matching single recvmsg call.

This is the standard idiom for passing multiple fds and avoids the
macOS control-buffer constraint entirely.

Fixes: DataDog/shadowfax#95
The shim now sends its own cwd in `ShimRequest.cwd`, and the server
sets it on the spawned binary via `Command::current_dir`.

Without this, the spawned binary inherited the mediation server's
launch cwd. Tools that resolve config from cwd silently operated on
the wrong target — `git` in a worktree being the canonical case:
discovery would walk up from the server's cwd, find the wrong `.git`,
and report the wrong branch and toplevel.

The new field is `Option<String>` with `#[serde(default)]`:

- old shim → new server: missing field → legacy behaviour (server cwd)
- new shim → old server: extra field is ignored
- unreadable cwd: shim sends None → legacy behaviour
- non-directory cwd: server logs a warning and falls back to its own cwd

Adds `test_passthrough_uses_request_cwd`: drives `apply()` end-to-end
with a real `/bin/pwd` and asserts the spawned process prints the
caller's cwd, not the server's.
The mediation/* and nono-shim formatting drift accumulated across the 13-
commit mediation series. Folding the fmt fixes into the originating commits
caused conflicts because subsequent mediation commits re-touched the same
lines, so capture the cumulative fmt result here instead.

Signed-off-by: James Carnegie <me@kipz.org>
When nono creates an audit shim for a binary at session start, the shim
later runs `resolve_real_binary` which re-walks PATH to find the real
target. Intermediate shells (e.g. husky pre-commit hooks, lint-staged
workers) often munge PATH between session start and shim invocation,
stripping user toolchain dirs that contained the real binary. The walk
then returns nothing and the shim reports `nono-shim: <name>: command
not found` even though the real binary is still on disk at the path
nono saw at session start.

This change makes audit shim resolution deterministic by recording the
resolved absolute path at session start and consulting it first at exec
time:

- A new sibling dir `<session_dir>/shim-sources/` holds one sidecar
  per shim — `<name>` containing the absolute path of the binary
  the shim was created for. Both mediated commands and universal
  audit shims write a sidecar.
- `SessionHandle` exposes `shim_sources_dir` and the path is forwarded
  to mediated subprocesses via a new `NONO_SHIM_SOURCES_DIR` env var
  (alongside `NONO_SHIM_DIR`). The session-level dir is reused for
  filtered per-command sandboxes — sidecars are created once and
  shared, since the recorded paths do not change.
- `nono-shim::resolve_real_binary` now consults
  `NONO_SHIM_SOURCES_DIR/<name>` first and only falls back to the
  existing PATH walk if the sidecar is missing or its recorded path
  is no longer an executable file.

Tests:
- 8 new unit tests in nono-shim cover sidecar hits, missing dirs,
  trimming, deleted/non-executable targets, and the PATH-walk
  fallback.
- 2 new unit tests in mediation::session cover sidecar writes and
  overwrites.
- The existing `test_allow_commands_sets_nono_shim_dir_to_filtered_dir`
  test is extended to assert `NONO_SHIM_SOURCES_DIR` is forwarded
  unchanged into the per-command sandbox env.
…main

When a profile sets `network.allow_domain`, nono switches to
`NetworkMode::ProxyOnly`, which emits `(deny network*)` plus narrow
exceptions for the proxy port and DNS. Seatbelt classifies AF_UNIX
`connect(2)` as `network-outbound`, so the base deny blocks the
audit/mediation shim's connect to `<session_dir>/{mediation,control,
audit}.sock`.

`emit_unix_socket_rules` only emits exceptions for explicit
`UnixSocketCapability` entries; the `FsCapability` directory grant
the runtime adds for `handle.session_dir` covers file ops but not
socket connect. Under `NetworkMode::AllowAll` (default with no
`allow_domain`) the bug was masked by `(allow network-outbound)`.

Add a directory-scoped `UnixSocketCapability::new_dir(session_dir,
Connect)` alongside the existing FsCapability so the regex carve-out
covers all three direct-child sockets without leaking access deeper
under the session dir. Existing rule-emission tests in sandbox/macos.rs
already cover the directory-grant + ProxyOnly path
(test_generate_profile_unix_socket_dir_emits_non_recursive_regex).

Fixes kipz#33.
Replace whole-replacement of `mediation` in merge_profiles with a
per-field merge. The legacy behaviour fully replaced base mediation
when child declared any mediation block, which silently dropped every
mediated command the base set up — a footgun when downstream profiles
add a single command via `extends`.

Merge rules:
- commands: keyed by `name`. Same-name collisions get per-field merge
  (binary_path child-wins, intercept rules dedup by args_prefix with
  child first under first-match-wins, sandbox recursive-merges,
  caller_policy applies restrictive-wins). Distinct names append: base
  first, then names new to the child.
- env.block: dedup_append.
- CommandSandbox: network.block and keychain_access OR; allowed_hosts
  and fs_*/allow_commands union via dedup_append.
- CallerPolicy.agent_allowed: AND (managed deny survives a permissive
  child). allowed_parents: None==any, Some==restriction, intersection
  of two Somes preserves the strictest.

Adds 14 unit tests in mediation::merge::tests covering each rule
(name append/collide, intercept dedup, AND/intersection,
network.block OR, allowed_hosts/fs union, env block union,
empty-base/empty-child, is_active invariant).
Previously, nonce promotion fired only when an exec'd mediated command's
argv element or sandbox-env value *began* with `nono_`. That breaks any
caller that embeds the nonce inside a wider string — the canonical case
being an HTTP header built up by a shell script:

    token=$(some-mediated-cmd auth)
    curl -H "X-Token: $token" https://example/

Here `-H` is followed by `X-Token: nono_<64-hex>`, which fails the
`starts_with("nono_")` check, so curl receives the literal nonce instead
of the real token. The only workaround was to demand callers separate
the credential from its envelope ("export TOKEN=…; curl -H \"X-Token: \$TOKEN\""
won't help — curl's argv still contains the prefixed nonce).

This change replaces the prefix check with a substring rewrite. Any
`nono_` followed by 64 lowercase hex characters is resolved against the
broker and substituted in place. Unmatched (unknown) nonces are left
verbatim, matching the existing argv behaviour and avoiding a probing
oracle. The same rewrite is applied inside env values (e.g.
`AUTH_HEADER="Authorization: Bearer nono_…"`).

Behaviour notes:
- The prefix `nono_` is ASCII, so the byte-level scan never lands
  inside a multi-byte UTF-8 character.
- 64 lowercase hex chars is the exact format `TokenBroker::issue` emits;
  uppercase hex and shorter sequences are deliberately not matched so
  noisy input strings cannot accidentally trigger a resolve.
- One existing env test (`test_build_exec_env_discards_unknown_nonce`)
  encoded the old "discard the whole var on unknown nonce" behaviour;
  it is renamed and asserts the new "leave the literal text in place"
  contract instead, matching the argv path.

Added unit tests cover: pure-nonce arg, embedded nonce, multiple nonces
in one arg, malformed `nono_` prefix without 64 hex chars, malformed
prefix with uppercase hex, unknown but well-formed nonce, empty input,
no-match passthrough, and the env-value substring path including the
`DANGEROUS_ENV_VAR_NAMES` block.
…t fs_read_file paths

The `keychain_access: bool` shorthand on `CommandSandbox` granted read
access to `~/Library/Keychains/login.keychain-db` and
`~/Library/Keychains/metadata.keychain-db`, which caused the Seatbelt
profile generator to skip its mach-lookup denies for security daemons
(SecurityServer, securityd, keychaind, secd).

The detection in `has_explicit_keychain_db_access()` (sandbox/macos.rs)
already works by inspecting the capability set for those file paths —
it does not consult the `keychain_access` flag. Listing the paths
directly in `fs_read_file` produces identical Seatbelt output, making
the grant explicit and auditable in the profile JSON.

Remove the field, its policy.rs application block, the merge.rs OR rule,
and all associated tests. Profiles that need Keychain access now list
the paths in `fs_read_file` instead.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@christine-at-datadog christine-at-datadog force-pushed the christine.le/remove-keychain-access-flag branch from ce4fc2d to cde2929 Compare May 14, 2026 04:00
@christine-at-datadog christine-at-datadog marked this pull request as ready for review May 14, 2026 04:02
@christine-at-datadog christine-at-datadog requested a review from kipz May 14, 2026 04:02
@kipz kipz force-pushed the develop branch 4 times, most recently from 86b464a to 4f89e43 Compare June 4, 2026 17:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants