feat(linux): linux BPF-LSM mediation filter (closes exec + read bypasses) by drewmchugh · Pull Request #26 · kipz/nono

drewmchugh · 2026-04-30T14:58:24Z

Summary

Replaces the seccomp-unotify exec filter from #20 with a BPF-LSM mediation filter. The mediation decision moves into the kernel, which closes the TOCTOU race that a sibling thread sharing the trapped task's memory could exploit. A new file_open LSM hook also prevents the agent from reading the mediated binary's bytes at all, closing the copy-the-binary, ld-linux trick, unprivileged-tmpfs, and shellcode bypasses in addition to the direct-path exec bypass that #20 was scoped to.

This is a fork-and-replace of #20 rather than a follow-on commit: the seccomp-unotify filter, its supervisor handler, and the shebang-chain walker are deleted; BPF-LSM is the sole enforcement path for mediated commands. Sessions where mediation is active hard-fail at startup if BPF-LSM isn't installable, surfacing the specific reason (missing kernel cmdline / caps).

Update (commits a5b6a85, 00063f4, 378f12c). The same BPF-LSM infrastructure now also enforces filesystem subtree deny-within-allow on Linux — matching macOS Seatbelt's (deny file-write* (subpath ...)) semantics for policy.add_deny_access and the required policy-group denies. The kernel side adds eight inode_* LSM hooks plus a dentry parent-chain walk on file_open; the userspace side routes the existing policy.add_deny_access and required policy-group denies into a new protected_roots BPF map. See the new Filesystem subtree deny section below.

How the kernel side decides

Two BPF-LSM hooks share a single (dev, ino) deny map and a cgroup-ancestor scope check:

bprm_check_security — fires after the kernel has resolved the binary the call will actually load (bprm->file); a -EACCES return atomically aborts the exec syscall, with no race against any user-memory pointer the agent controlled.
file_open — fires inside do_filp_open for every successful path resolution that yields a file descriptor; denying opens of mediated inodes prevents the agent from reading the binary's bytes through any path (cp / cat / mmap / splice / dynamic linker).
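The `(dev, ino)` identity that both hooks key on can be illustrated from userspace. The following is a minimal sketch, not the PR's loader: `InodeKey`, `inode_key`, and `build_deny_set` are hypothetical names, and it assumes the key is simply what `stat(2)` reports. A real loader may need to translate the device number into whatever encoding the BPF program reads in the kernel.

```rust
use std::collections::HashSet;
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Hypothetical key type: the device and inode numbers reported by stat(2).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct InodeKey {
    pub dev: u64,
    pub ino: u64,
}

/// Stat a path (following symlinks) and return its (dev, ino) identity.
pub fn inode_key(path: &Path) -> io::Result<InodeKey> {
    let meta = fs::metadata(path)?;
    Ok(InodeKey {
        dev: meta.dev(),
        ino: meta.ino(),
    })
}

/// Build a deny set from a list of paths, skipping any that cannot be stat'd.
pub fn build_deny_set<P: AsRef<Path>>(paths: &[P]) -> HashSet<InodeKey> {
    paths
        .iter()
        .filter_map(|p| inode_key(p.as_ref()).ok())
        .collect()
}
```

Because a hardlink shares its target's inode, a lookup keyed this way matches the file regardless of which path name was used to reach it.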

Per-session scoping by cgroup ancestry: the broker creates a per-session cgroup at session start, the agent joins it as its first post-fork action, and the BPF program walks the calling task's cgroup ancestor chain looking for the session cgroup id. Cgroup membership inherits on fork() and is unaffected by reparenting, so daemonized agent descendants stay scoped. Ancestry (not equality) closes a sub-cgroup-escape where an agent with CAP_SYS_ADMIN could mkdir a child cgroup of the session and move into it.
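The ancestry-versus-equality distinction can be modelled in userspace with cgroup paths. This is only an analogy: the in-kernel program compares cgroup ids along the ancestor chain, whereas this sketch compares the cgroup v2 path from `/proc/<pid>/cgroup` against a session path. The function names and the example paths are hypothetical.

```rust
use std::path::Path;

/// Extract the cgroup v2 path from the contents of /proc/<pid>/cgroup.
/// The unified-hierarchy entry has the form "0::/some/path".
pub fn cgroup_v2_path(proc_cgroup: &str) -> Option<&str> {
    proc_cgroup.lines().find_map(|line| line.strip_prefix("0::"))
}

/// True if `task_cgroup` is the session cgroup or any descendant of it.
/// `Path::starts_with` compares whole components, so "/nono/s1" is not
/// treated as an ancestor of "/nono/s10".
pub fn in_session_scope(task_cgroup: &str, session_cgroup: &str) -> bool {
    Path::new(task_cgroup).starts_with(Path::new(session_cgroup))
}
```

With an equality check, a task that moved into `/nono/s1/child` would fall out of scope; with the ancestry check it stays in scope.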

Bypass classes closed

| # | Bypass | Closed by |
| --- | --- | --- |
| 1 | Direct path: `/usr/bin/gh` | `bprm_check_security` denies the exec |
| 2 | TOCTOU on user-memory paths (incl. vfork-bomb) | decision is in-kernel against `bprm->file`, not user-memory pointers |
| 3 | Copy the binary then exec the copy | `file_open` denies the read of mediated bytes |
| 4 | Indirect load: `/lib/ld-linux-x86-64.so.2 /usr/bin/gh` | `file_open` denies the linker's `open()` |
| 5 | Unprivileged tmpfs + copy + exec | `file_open` denies the read at the source |
| 6 | Shellcode (read bytes, mmap PROT_EXEC, jump) | `file_open` denies the source read |
| 7 | Hardlink at non-deny-set path | identity is `(dev, ino)`, hardlinks share the inode |
| 8 | Daemonize / reparent to break parent-chain identity | scope is cgroup membership, not parent chain |
| 9 | Sub-cgroup escape (privileged agent only) | scope check walks ancestor chain |
| 10 | Read of a path inside a protected subtree | extended `file_open` walks dentry parents against `protected_roots` |
| 11 | Write to a file inside a protected subtree | same `file_open` path (open-for-write) |
| 12 | mmap / splice of a protected file | same `file_open` (mmap setup opens the file) |
| 13 | unlink / rmdir / rename of a protected child | new `inode_*` hooks return `-EACCES` |
| 14 | Create file/dir/symlink/link inside protected | `inode_create` / `inode_mkdir` / `inode_symlink` / `inode_link` |
| 15 | chmod / chown / truncate / utimes on protected | `inode_setattr` |
| 16 | Access via a bind mount of a non-protected source over a protected target | userspace populator inserts both directly-listed inodes AND bind-mount source roots into `protected_roots` |

The vfork-bomb residual that #20 left open went from 49/600 bypasses (measured under the seccomp-unotify filter on this branch, with cgroup scoping but no BPF) to 0/600 under BPF-LSM. The pthread variant remains at 0/600.

Deployment requirements

| Requirement | How |
| --- | --- |
| Active LSM stack | `lsm=...,bpf` in kernel cmdline. The host kernel needs `CONFIG_BPF_LSM=y`; verify via `grep bpf_lsm_bprm_check_security /proc/kallsyms`. |
| Broker capabilities | `setcap cap_bpf,cap_sys_admin,cap_dac_override+ep /usr/bin/nono`. CAP_BPF for the BPF program load; CAP_SYS_ADMIN for the cgroup-namespace privilege check on `mkdir`; CAP_DAC_OVERRIDE because cgroup v2 `mkdir` runs the VFS DAC check before the cgroup-namespace privilege check, and CAP_SYS_ADMIN does not subsume DAC. |
| Agent invariants | `PR_SET_NO_NEW_PRIVS=1` set before agent execve (existing nono behaviour) so file caps don't apply. The broker also verifies `CapEff: 0` on the agent post-execve and emits an explicit invariants log line at session start. |

If any requirement is unmet, the broker fails session start with an explicit error pointing at the specific fix (kernel cmdline / caps / config). There is no silent partial enforcement.
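One of these preconditions, the active LSM stack, can be probed by reading `/sys/kernel/security/lsm`, which lists the active LSMs as a comma-separated string. The sketch below is an assumption about how such a check could look, not the broker's actual startup code; `lsm_list_has_bpf` and `bpf_lsm_active` are hypothetical names.

```rust
use std::fs;
use std::io;

/// Returns true if a comma-separated LSM list (the format of
/// /sys/kernel/security/lsm) contains an entry exactly equal to "bpf".
pub fn lsm_list_has_bpf(lsm_list: &str) -> bool {
    lsm_list.trim().split(',').any(|name| name.trim() == "bpf")
}

/// Read the live LSM list from securityfs and check it for "bpf".
/// An Err here usually means securityfs is not mounted or not readable.
pub fn bpf_lsm_active() -> io::Result<bool> {
    let contents = fs::read_to_string("/sys/kernel/security/lsm")?;
    Ok(lsm_list_has_bpf(&contents))
}
```

Matching whole entries rather than substrings avoids a false positive on any other LSM whose name merely contains `bpf`.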

Filesystem subtree deny (extension)

The first BPF-LSM hook pair gates binaries by inode. The same infrastructure now extends to gating filesystem subtrees — the same problem macOS Seatbelt solves with (deny file-write* (subpath ...)) rules.

Why this exists

Three pre-existing Linux gaps motivated this:

validate_caps_against_protected_roots hard-rejected on Linux when a profile granted a parent of ~/.nono (e.g. --workdir=$HOME), because Landlock can't express deny-within-allow. macOS handled the same case via Seatbelt deny rules. The user-visible symptom (Jira WRK-2585): claude from $HOME on a workspace fails sandbox init with nono: Sandbox initialization failed: Refusing to grant '/home/bits' (source: Profile) because it overlaps protected nono state root '/home/bits/.nono'.
policy.add_deny_access (crates/nono-cli/src/profile/mod.rs:98) was a no-op on Linux — add_deny_access_rules short-circuited at if cfg!(target_os = "macos"). shadowfax's Linux profile declares add_deny_access: ["$HOME/.config/password-store/gpg"] and the deny was silently dropped.
validate_deny_overlaps rejected any session whose allow set covered a required policy-group deny (deny_credentials, deny_keychains_linux, …), forcing Linux profiles to enumerate their allows narrowly enough to dodge the overlap.

All three are now resolved by routing the deny-within-allow enforcement through BPF-LSM at the kernel level.

Kernel side (commit `a5b6a85`)

A new protected_roots BPF_MAP_TYPE_HASH (capacity 64, keyed by (dev, ino)). dentry_in_protected_subtree walks up to MAX_DENTRY_DEPTH=16 levels via BPF_CORE_READ on d_parent, verifier-friendly with the same #pragma unroll shape as the existing cgroup walker. check_file_open extended to also consult the new map; eight new SEC blocks — inode_unlink, inode_rmdir, inode_rename, inode_create, inode_mkdir, inode_symlink, inode_link, inode_setattr — cover the structural mutations that don't go through file_open. Two new audit reason codes (protected_open_deny, protected_mutate_deny) carry through the existing ringbuf and JSONL pipeline.
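The bounded parent walk can be modelled in plain Rust. This is a userspace sketch of the idea only: the real walk runs in BPF C over `d_parent`, and details such as whether the starting dentry itself counts toward the depth budget are assumptions here. `in_protected_subtree` and the `parent` map are hypothetical.

```rust
use std::collections::{HashMap, HashSet};

/// Depth bound taken from the description above.
pub const MAX_DENTRY_DEPTH: usize = 16;

/// (dev, ino) identity of a directory entry's inode.
pub type Key = (u64, u64);

/// Walk from `start` towards the filesystem root, checking each node against
/// `protected_roots`. `parent` maps a node to its parent; the root maps to
/// itself, mirroring how a root dentry's d_parent points at itself.
pub fn in_protected_subtree(
    start: Key,
    parent: &HashMap<Key, Key>,
    protected_roots: &HashSet<Key>,
) -> bool {
    let mut cur = start;
    // Fixed iteration count, analogous to an unrolled loop in BPF.
    for _ in 0..MAX_DENTRY_DEPTH {
        if protected_roots.contains(&cur) {
            return true;
        }
        match parent.get(&cur) {
            Some(&p) if p != cur => cur = p,
            _ => return false, // reached the root or an unknown node
        }
    }
    false
}
```

In this model a node nested deeper than the bound below a protected root is not matched; how the actual program treats such paths is not stated in this description.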

bpf_d_path was considered and rejected: it returns mount-namespace-aware paths (the wrong granularity for the bind-mount case below), and several of the new hooks operate on dentry directly with no struct path to hand to it.

Userspace side (commit `00063f4`)

protected_paths.rs: target_os = "macos" gate dropped from validate_requested_path_against_protected_roots. With allow_parent_of_protected: true set on the profile, both platforms admit the parent grant; the OS sandbox layer enforces the subtree at runtime (Seatbelt rules on macOS, BPF-LSM hooks on Linux). New bpf_lsm_protected_roots_for_session(state_roots, add_deny_access, policy_group_denies) merges all three deny sources, runs add_deny_access strings through policy::expand_path so $HOME/... and ~/... placeholders become concrete paths the populator can stat (without the expansion the literal string would never make it into the kernel map — bug caught during end-to-end validation).
policy.rs: validate_deny_overlaps no-ops on Linux when BPF-LSM is available — the kernel hooks return -EACCES for any in-subtree access regardless of what Landlock allows, so the silent-drop motivation no longer holds. Hosts without BPF-LSM (no lsm=...,bpf in cmdline, no cap_bpf) still hit the existing rejection.
bpf_lsm.rs: install_mediation_filter takes a protected_paths: &[PathBuf] slice. The populator stats each canonical path, inserts (dev, ino), then parses /proc/self/mountinfo to enumerate bind-mount source roots mounted at-or-under any protected path — needed because the dentry parent walk follows the source filesystem tree, not the mount tree (e.g. shadowfax's ~/.nono/sessions bind-mount). Both the directly-listed inode and the bind-mount source inode go into the same map; the BPF walker doesn't care which.
exec_strategy.rs: BPF-LSM now installs when EITHER the exec deny set OR the protected_roots set is non-empty (a profile with no mediation.commands but with add_deny_access or any policy-group deny still gets enforcement).
Plumbing: protected_paths flows through PreparedSandbox → ExecutionFlags → SupervisedRuntimeContext → SupervisorConfig → install_mediation_filter. Audit log dir falls back to ~/.nono/sessions when only protected_paths is active (no exec mediation session).
12 new integration tests in crates/nono-cli/tests/bpf_lsm_protected_subtree.rs covering read / write / mmap / unlink / rmdir / rename / create / bind-mount / regression / ~/.nono / add_deny_access / opt-in session start.
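The mountinfo step of the populator can be sketched as follows. This is an illustrative reading of the `/proc/self/mountinfo` format, not the PR's `bind_mount_sources_under` or `decode_mountinfo_octal`; the names below are hypothetical. It assumes a directory bind mount shows up as an entry whose root field is not `/`, so a bind mount of an entire filesystem root would not be detected by this simplified filter.

```rust
use std::path::Path;

/// Decode the \NNN octal escapes that mountinfo uses for space (\040),
/// tab (\011), newline (\012) and backslash (\134) in path fields.
pub fn decode_octal_escapes(field: &str) -> String {
    let bytes = field.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'\\' && i + 3 < bytes.len() {
            let digits = &bytes[i + 1..i + 4];
            if digits.iter().all(|d| (b'0'..=b'7').contains(d)) {
                let value = digits
                    .iter()
                    .fold(0u32, |acc, d| acc * 8 + u32::from(d - b'0'));
                if value <= 0xFF {
                    out.push(value as u8);
                    i += 4;
                    continue;
                }
            }
        }
        out.push(bytes[i]);
        i += 1;
    }
    String::from_utf8_lossy(&out).into_owned()
}

#[derive(Debug, PartialEq, Eq)]
pub struct MountEntry {
    /// Path inside the source filesystem that forms the root of this mount.
    pub root: String,
    /// Where the mount is visible in the current mount namespace.
    pub mount_point: String,
}

/// Parse fields 4 (root) and 5 (mount point) of one mountinfo line.
pub fn parse_mountinfo_line(line: &str) -> Option<MountEntry> {
    let mut fields = line.split(' ');
    let _mount_id = fields.next()?;
    let _parent_id = fields.next()?;
    let _major_minor = fields.next()?;
    let root = decode_octal_escapes(fields.next()?);
    let mount_point = decode_octal_escapes(fields.next()?);
    Some(MountEntry { root, mount_point })
}

/// Entries mounted at or under `protected` whose root is a subdirectory.
pub fn bind_mounts_under(mountinfo: &str, protected: &Path) -> Vec<MountEntry> {
    mountinfo
        .lines()
        .filter_map(parse_mountinfo_line)
        .filter(|m| m.root != "/" && Path::new(&m.mount_point).starts_with(protected))
        .collect()
}
```

A populator built along these lines would then stat each returned source directory and insert its `(dev, ino)` next to the directly-listed roots.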

Documentation (commit `378f12c`)

docs/linux-bpf-lsm-mediation.md: new ## Filesystem subtree deny section between Performance and Deployment requirements, with hook coverage table, identity model, bind-mount handling, audit, performance numbers, and profile schema. Summary section linked.
qa-profiles/04-allow-parent-of-protected.json: sample profile granting $HOME with the opt-in plus add_deny_access for $HOME/.config/qa-secret. Used for manual validation.

Performance (microbench)

200-iteration cat /etc/passwd inside the sandbox: ~1879 ms (release nono).
Same loop native: ~254 ms.
Subtracting ~1.5 s nono one-time setup: ~125 µs added per cat. Each cat does ~50 file opens (loader resolving libs, /etc/passwd itself), so ~2-3 µs per BPF-checked file_open in the busy path.
5-run median of claude -p hello: native 1.85 s, sandboxed 4.6 s. The 2.75 s overhead is dominated by the existing nono setup costs that PR #26 already incurred; the new BPF code is in the noise.
Imperceptible for interactive agent workloads. Potentially measurable on tight build loops (e.g. make -j with thousands of forks per second), where the existing nono machinery already dominates the budget.

What changed vs #20

Deleted:

install_seccomp_exec_filter and its BPF program builder from sandbox/linux.rs
handle_exec_notification, classify_exec_path, ExecDecision, read_path_at, read_execve_argv_at, count_threads from supervisor_linux.rs
mediation/shebang.rs (the shebang-chain walker)
mediation/filter_audit.rs (the seccomp-era audit emitter — replaced by a library-side reader for the BPF ringbuf)
The seccomp poll branch and exec_notify_fd SCM_RIGHTS plumbing in exec_strategy.rs

Added (original BPF-LSM work):

crates/nono/src/bpf/mediation.bpf.c — two LSM programs sharing a deny map + scope check, plus a BPF_MAP_TYPE_RINGBUF for audit records
crates/nono/src/sandbox/bpf_lsm.rs — loader (install_mediation_filter, MediationFilterHandle, SessionCgroup)
crates/nono/src/sandbox/bpf_audit.rs — userspace ring-buffer reader that appends JSONL events to ~/.nono/sessions/audit.jsonl
crates/nono/tests/bpf_lsm_smoke.rs — load + attach unit tests
crates/nono-cli/tests/bpf_lsm_integration.rs — 16 end-to-end tests organised into exec mediation / read mediation / audit / composition groups, with self-skip if the test binary lacks the BPF caps
make test-integration — build → setcap → cargo-test target so the integration suite actually runs in CI

Added (filesystem subtree deny extension):

crates/nono/src/bpf/mediation.bpf.c extended — protected_roots map, dentry_in_protected_subtree, file_in_protected_subtree, extended check_file_open, eight new inode_* SEC blocks, two new audit reason codes
crates/nono/src/sandbox/bpf_lsm.rs extended — MAX_PROTECTED_ROOTS=64, eight new Link fields on MediationFilterHandle, attached_program_count(), protected_roots_map(), collect_protected_root_entries, bind_mount_sources_under, decode_mountinfo_octal, two new BpfLsmError variants
crates/nono-cli/src/protected_paths.rs extended — bpf_lsm_protected_roots_for_session, target_os = "macos" gate dropped
crates/nono-cli/src/policy.rs — validate_deny_overlaps no-ops on Linux when BPF-LSM is available
crates/nono-cli/src/sandbox_prepare.rs + the ExecutionFlags/SupervisedRuntimeContext/SupervisorConfig chain — thread protected_paths end-to-end
crates/nono-cli/tests/bpf_lsm_protected_subtree.rs — 12 new integration tests
qa-profiles/04-allow-parent-of-protected.json — sample profile

Schema change (audit.jsonl entries from BPF):

action_type: "allow_unmediated" / "deny" (no exec_filter_ prefix)
reason (only on deny): "exec_deny" / "open_deny" / "protected_open_deny" / "protected_mutate_deny"
interpreter_chain: dropped (kernel resolves shebangs internally; BPF fires for the actually-loaded binary)
exit_code: 126 on deny, absent on allow

The shim's own mediation::AuditEvent shape is unchanged; both event shapes interleave in the same JSONL file.
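To make the BPF-side schema concrete, here is a minimal sketch that renders only the fields listed above. The real records presumably carry further fields (paths, session and process identifiers) that are not modelled here, and `DenyReason` and `audit_line` are hypothetical names rather than the PR's types.

```rust
/// The four deny reasons named in the schema.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DenyReason {
    ExecDeny,
    OpenDeny,
    ProtectedOpenDeny,
    ProtectedMutateDeny,
}

impl DenyReason {
    pub fn as_str(self) -> &'static str {
        match self {
            DenyReason::ExecDeny => "exec_deny",
            DenyReason::OpenDeny => "open_deny",
            DenyReason::ProtectedOpenDeny => "protected_open_deny",
            DenyReason::ProtectedMutateDeny => "protected_mutate_deny",
        }
    }
}

/// Render one JSONL line. `None` models an allowed, unmediated event, which
/// has no `reason` and no `exit_code`; `Some(reason)` models a deny.
pub fn audit_line(reason: Option<DenyReason>) -> String {
    match reason {
        None => r#"{"action_type":"allow_unmediated"}"#.to_string(),
        Some(r) => format!(
            r#"{{"action_type":"deny","reason":"{}","exit_code":126}}"#,
            r.as_str()
        ),
    }
}
```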

Validation

Performed end-to-end on the BPF-LSM workspace AMI (am/bpf-lsm-workspace-ami on dd-source — adds lsm=...,bpf to the kernel cmdline). Each check was re-run after every phase:

Documentation

Design doc: docs/linux-bpf-lsm-mediation.md
Implementation decisions log (autonomous-session picks, with rationale): docs/linux-exec-filter-bpf-lsm-impl-decisions.md
Iteration log (vfork residual + the userspace ptrace experiments that didn't land): docs/linux-exec-filter-vfork-decisions.md
Old seccomp-era plan: archived at docs/archive/linux-exec-filter-plan-seccomp-era.md

Test plan

Reviewer boots a workspace from am/bpf-lsm-workspace-ami and confirms cat /sys/kernel/security/lsm includes bpf
make build-release && sudo setcap cap_bpf,cap_sys_admin,cap_dac_override+ep target/release/nono
sudo -E cargo test -p nono --test bpf_lsm_smoke — expect 7 pass (5 original + 2 new)
make test-integration — expect 16 (original) + 12 (new bpf_lsm_protected_subtree) = 28 pass
Run a mediation-active session and confirm ~/.nono/sessions/audit.jsonl receives allow_unmediated and deny/open_deny entries with the expected schema
Run a session against qa-profiles/04-allow-parent-of-protected.json, verify protected_open_deny and protected_mutate_deny audit events fire when the agent reads/writes $HOME/.config/qa-secret
Run the kipz POCs (vfork + pthread, 600 attempts each); expect 0 bypasses

🤖 Generated with Claude Code

Introduces a mediation framework that intercepts command execution within sandboxed sessions, requiring admin approval for sensitive operations. Includes audit trail logging, Unix socket control server, and per-command sandbox shims. Signed-off-by: James Carnegie <me@kipz.org>

Adds a macOS menu bar application that discovers active nono sessions and provides a UI for reviewing and approving/denying mediated commands. Signed-off-by: James Carnegie <me@kipz.org>

Signed-off-by: James Carnegie <me@kipz.org>

Commands like `gh` authenticate via macOS Keychain using mach-lookup IPC to securityd. The per-command Seatbelt sandbox blocks these mach-lookups by default, causing 401 auth failures when gh tries to retrieve its stored GitHub token. Add `keychain_access: bool` to CommandSandbox. When true, the per-command sandbox grants read access to keychain DB files (login.keychain-db, metadata.keychain-db), which triggers the existing mach-lookup deny skip in the Seatbelt profile generator. This allows commands to authenticate via keychain while keeping all other sandbox restrictions (filesystem, network allowed_hosts) intact. Also documents that the Approve action intentionally runs without a per-command sandbox (None) because Seatbelt blocks keychain mach-lookup even with keychain file grants insufficient for all auth paths. Signed-off-by: Christine Le <christine.le@datadoghq.com> Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

After the explicit pass, `profile::expand_vars` now resolves any remaining `$VAR` / `${VAR}` tokens (uppercase+underscore+digit names) from the process environment. Unset vars are left literal, matching the existing `$XDG_RUNTIME_DIR` fallback so downstream `add_sandbox_*` helpers log "does not exist, skipping" rather than failing the session. Also switches per-command mediation sandbox paths from `expand_home` to `expand_vars` so `$WORKDIR`, `$HOME`, XDG vars, and any launch-time env var (e.g. a caller-provided `$GIT_ROOT`) resolve consistently in both the main and per-command sandboxes. `SessionCtx` gains a `workdir` field threaded from `execution_runtime` through `session::setup` and `server::run` so per-command expansion uses the same workdir as the rest of nono.

`expand_vars` already resolved $VAR / ${VAR} tokens in top-level filesystem paths, policy.* paths, command_args, and per-command sandbox paths. Extend the same expansion to the remaining user-authored string fields in the profile: - `mediation.commands[].intercept[].args_prefix` — the intercept matcher compared each entry literally against the incoming argv, so `"$USER"` used to be a dead string. Expanding at profile-load time lets authors write session-aware matchers (e.g. the macOS Keychain `security find-generic-password <user> ...` rule) without install-time `sed` substitutions over the profile file. - `mediation.commands[].binary_path` — consumed verbatim by `PathBuf::from`, now expanded so profiles can point at user-specific binaries like `$HOME/.local/bin/tool`. - `network.custom_credentials[].{tls_ca,tls_client_cert,tls_client_key}` — previously used the narrower `policy::expand_path` which only handled `~`, `$HOME`, `$TMPDIR`. Switched to the full `expand_str` so any configured env var (e.g. `$XDG_CONFIG_HOME`) resolves. Introduces `profile::expand_str` as the string-returning core of the expansion pipeline. `expand_vars` now delegates to it. Threads the session `workdir` through `resolve_credentials`, `build_proxy_config_from_flags`, `start_proxy_runtime`, and `resolve_command` so `$WORKDIR` resolves consistently across all expansion sites.
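The token rules described in these two commits (`$VAR` and `${VAR}`, names made of uppercase letters, digits, and underscores, unset variables left literal) can be sketched as below. This models only the generic environment pass with an injectable lookup; it is not the actual `profile::expand_str`, and `expand_str_sketch` is a hypothetical name.

```rust
/// Characters permitted in a variable name under the rules above.
fn is_name_char(c: char) -> bool {
    c.is_ascii_uppercase() || c.is_ascii_digit() || c == '_'
}

/// Expand `$VAR` and `${VAR}` tokens using `lookup`. Tokens whose variable is
/// unset, and `$` signs not followed by a valid name, are copied through
/// unchanged.
pub fn expand_str_sketch<F>(input: &str, lookup: F) -> String
where
    F: Fn(&str) -> Option<String>,
{
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(pos) = rest.find('$') {
        out.push_str(&rest[..pos]);
        let after = &rest[pos + 1..];
        // `consumed` is the byte length of the token text following the '$'.
        let (name, consumed) = if let Some(braced) = after.strip_prefix('{') {
            match braced.find('}') {
                Some(end) if end > 0 && braced[..end].chars().all(is_name_char) => {
                    (&braced[..end], end + 2)
                }
                _ => ("", 0),
            }
        } else {
            let len = after
                .find(|c: char| !is_name_char(c))
                .unwrap_or(after.len());
            (&after[..len], len)
        };
        let value = if name.is_empty() { None } else { lookup(name) };
        match value {
            Some(v) => out.push_str(&v),
            None => {
                // Leave the token literal so later stages can report it.
                out.push('$');
                out.push_str(&after[..consumed]);
            }
        }
        rest = &after[consumed..];
    }
    out.push_str(rest);
    out
}
```

Passing `|k| std::env::var(k).ok()` as the lookup would resolve tokens from the process environment, while a fixed closure keeps the function easy to test.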

Stamps each audit.jsonl entry with session_id, session_name, nono_pid, sandboxed_pid, and command_pid so operators can correlate commands across a session and trace the full process hierarchy. Process hierarchy per log entry: nono_pid — the nono supervisor (unsandboxed parent) sandboxed_pid — the direct child process nono sandboxed (e.g. claude, codex) command_pid — the shim process that ran the specific command (e.g. echo, git) The session_id/session_name are pre-generated in execution_runtime before mediation setup so audit.jsonl and the session record share the same values. sandboxed_pid is resolved after fork via an Arc<OnceLock<u32>> latch shared between the mediation server and the on_fork callback. ShimRequest gains a pid field so the mediated path can record command_pid. The shim's AuditEvent also gains command_pid for the audit-only datagram path. All new AuditEvent fields use #[serde(default)] for backward compatibility with older shims that do not send them. Signed-off-by: Christine Le <christine.le@datadoghq.com> Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Mediated passthrough previously buffered stdio: the shim read stdin with a 50ms timeout into a UTF-8 String, the server spawned the real binary with Stdio::piped() + wait_with_output(), and the parent's ChildStdin dropped before the child could read. ssh/git over a binary pipe hit SIGPIPE; every long-running mediated command (gh, kubectl, ...) silently buffered output until exit. The shim now sends its stdin/stdout/stderr fds via SCM_RIGHTS after the JSON request. The server passes them straight to the real binary via Stdio::from(...) for no-intercept passthrough (and the allow_commands sub-branch and admin_passthrough), then wait()s. Capture/Respond/Approve drop the passed fds and keep the existing buffered behaviour so they can still inspect or transform the output. The shim's stdin field is removed from ShimRequest (so was the read_stdin_nonblocking helper); shim and server are versioned together, no wire compatibility is needed. Tests: added a streaming socketpair harness, a binary-roundtrip test covering 0xFF bytes through stdin/stdout, and a Respond-path test that verifies the dropped fds let the test side see EOF.

… command Adds an optional CallerPolicy on each CommandEntry: - agent_allowed: bool (default true) — whether the primary sandbox (no NONO_SANDBOX_CONTEXT) may invoke this command. - allowed_parents: Option<Vec<String>> — restrict which mediated parents may invoke. None (field absent) accepts any parent (existing behaviour). Some([]) blocks every mediated parent. Some(["git"]) permits only the listed parents. The gate fires at the top of apply(), before the existing allow_commands skip-intercepts branch, so a parent allowed by allowed_parents still flows through unchanged. Rejected calls return exit 126 with action_type "denied". Defaults preserve full backward compatibility: a profile with no caller_policy field on a command keeps current behaviour (agent + any parent allowed). Use case: ssh / ssh-keygen in shadowfax — set agent_allowed=false, allowed_parents=["git"] so a malicious prompt cannot invoke them directly to authenticate to attacker-controlled hosts (or sign arbitrary data) using the user's keys via the per-command sandbox.
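The three `allowed_parents` cases are easy to confuse, so here is a small sketch of the gate as described. The field names follow the commit message; the `permits` method and the convention that `None` stands for the primary sandbox are assumptions made for illustration.

```rust
#[derive(Debug, Clone)]
pub struct CallerPolicy {
    /// Whether the primary sandbox (the agent itself) may invoke the command.
    pub agent_allowed: bool,
    /// None: any mediated parent. Some(list): only the listed parents.
    pub allowed_parents: Option<Vec<String>>,
}

impl Default for CallerPolicy {
    /// Defaults preserve the previous behaviour: agent and any parent allowed.
    fn default() -> Self {
        Self {
            agent_allowed: true,
            allowed_parents: None,
        }
    }
}

impl CallerPolicy {
    /// `parent` is None when the call comes from the primary sandbox and
    /// Some(name) when it comes from inside a mediated parent command.
    pub fn permits(&self, parent: Option<&str>) -> bool {
        match (parent, &self.allowed_parents) {
            (None, _) => self.agent_allowed,
            (Some(_), None) => true,
            (Some(p), Some(list)) => list.iter().any(|allowed| allowed.as_str() == p),
        }
    }
}
```

For the ssh use case, `agent_allowed: false` with `allowed_parents: Some(vec!["git".into()])` rejects a direct call from the agent but admits ssh when git spawns it.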

Signed-off-by: James Carnegie <me@kipz.org>

Sending three separate sendmsg(SCM_RIGHTS) calls for stdin/stdout/stderr fails with EMSGSIZE (os error 40) on macOS when the socket receive buffer already holds a large JSON request — as happens when git is invoked from within gh's execution sandbox, which adds proxy and session env vars. Replace the three individual sends with a single sendmsg carrying all three fds in one SCM_RIGHTS control message, and update recv_three_fds to receive them in a matching single recvmsg call. This is the standard idiom for passing multiple fds and avoids the macOS control-buffer constraint entirely. Fixes: DataDog/shadowfax#95

The shim now sends its own cwd in `ShimRequest.cwd`, and the server sets it on the spawned binary via `Command::current_dir`. Without this, the spawned binary inherited the mediation server's launch cwd. Tools that resolve config from cwd silently operated on the wrong target — `git` in a worktree being the canonical case: discovery would walk up from the server's cwd, find the wrong `.git`, and report the wrong branch and toplevel. The new field is `Option<String>` with `#[serde(default)]`: - old shim → new server: missing field → legacy behaviour (server cwd) - new shim → old server: extra field is ignored - unreadable cwd: shim sends None → legacy behaviour - non-directory cwd: server logs a warning and falls back to its own cwd Adds `test_passthrough_uses_request_cwd`: drives `apply()` end-to-end with a real `/bin/pwd` and asserts the spawned process prints the caller's cwd, not the server's.

The mediation/* and nono-shim formatting drift accumulated across the 13- commit mediation series. Folding the fmt fixes into the originating commits caused conflicts because subsequent mediation commits re-touched the same lines, so capture the cumulative fmt result here instead. Signed-off-by: James Carnegie <me@kipz.org>

When nono creates an audit shim for a binary at session start, the shim later runs `resolve_real_binary` which re-walks PATH to find the real target. Intermediate shells (e.g. husky pre-commit hooks, lint-staged workers) often munge PATH between session start and shim invocation, stripping user toolchain dirs that contained the real binary. The walk then returns nothing and the shim reports `nono-shim: <name>: command not found` even though the real binary is still on disk at the path nono saw at session start. This change makes audit shim resolution deterministic by recording the resolved absolute path at session start and consulting it first at exec time: - A new sibling dir `<session_dir>/shim-sources/` holds one sidecar per shim — `<name>` containing the absolute path of the binary the shim was created for. Both mediated commands and universal audit shims write a sidecar. - `SessionHandle` exposes `shim_sources_dir` and the path is forwarded to mediated subprocesses via a new `NONO_SHIM_SOURCES_DIR` env var (alongside `NONO_SHIM_DIR`). The session-level dir is reused for filtered per-command sandboxes — sidecars are created once and shared, since the recorded paths do not change. - `nono-shim::resolve_real_binary` now consults `NONO_SHIM_SOURCES_DIR/<name>` first and only falls back to the existing PATH walk if the sidecar is missing or its recorded path is no longer an executable file. Tests: - 8 new unit tests in nono-shim cover sidecar hits, missing dirs, trimming, deleted/non-executable targets, and the PATH-walk fallback. - 2 new unit tests in mediation::session cover sidecar writes and overwrites. - The existing `test_allow_commands_sets_nono_shim_dir_to_filtered_dir` test is extended to assert `NONO_SHIM_SOURCES_DIR` is forwarded unchanged into the per-command sandbox env.
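The sidecar-first lookup order can be sketched as follows. This is a simplified model and not the shim's real `resolve_real_binary`: the function name and signature are hypothetical, the PATH value is passed in explicitly rather than read from the environment, and any logic the real shim uses to skip its own shim directory during the PATH walk is omitted.

```rust
use std::fs;
use std::os::unix::fs::PermissionsExt;
use std::path::{Path, PathBuf};

/// True if `path` is a regular file with at least one execute bit set.
fn is_executable_file(path: &Path) -> bool {
    fs::metadata(path)
        .map(|m| m.is_file() && (m.permissions().mode() & 0o111) != 0)
        .unwrap_or(false)
}

/// Resolve `name` by consulting the sidecar `<sources_dir>/<name>` first and
/// falling back to a walk over the colon-separated `path_var`.
pub fn resolve_real_binary_sketch(
    name: &str,
    sources_dir: Option<&Path>,
    path_var: &str,
) -> Option<PathBuf> {
    if let Some(dir) = sources_dir {
        if let Ok(recorded) = fs::read_to_string(dir.join(name)) {
            // The sidecar holds one absolute path; trim trailing whitespace.
            let candidate = PathBuf::from(recorded.trim());
            if is_executable_file(&candidate) {
                return Some(candidate);
            }
        }
    }
    // Sidecar missing or stale: fall back to the PATH walk.
    path_var
        .split(':')
        .filter(|dir| !dir.is_empty())
        .map(|dir| Path::new(dir).join(name))
        .find(|candidate| is_executable_file(candidate))
}
```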

fix(mediation): record audit-shim source paths to survive PATH munging

…ediation foundation Transplants the BPF-LSM mediation and protected-subtree work from drewmchugh:am/linux-exec-filter-bpf-lsm onto kipz/develop (v0.47.0). Key changes: - Add crates/nono/src/bpf/ (mediation.bpf.c, vmlinux.h) and the libbpf-rs loader (bpf_lsm.rs) + audit ring-buffer reader (bpf_audit.rs) - Update nono/Cargo.toml + build.rs to compile the BPF program at build time - Update sandbox/mod.rs to expose bpf_lsm + bpf_audit modules - Add nono-cli bpf-lsm feature that propagates nono/bpf-lsm - Wire protected_paths field through PreparedSandbox → ExecutionFlags → SupervisedRuntimeContext → exec_strategy::SupervisorConfig - Add mediation_filter_state() helper in execution_runtime to extract deny_set + shim/audit dirs from SessionHandle - Fix mediation/policy.rs Sandbox::apply() return type mismatch (pre-existing develop bug: Linux returns SeccompNetFallback not ()) - Add bpf_lsm BPF-LSM skip in policy::validate_deny_overlaps - Add bpf_lsm_protected_roots_for_session() in protected_paths (reads filesystem.deny from new schema) - Add integration tests: bpf_lsm_integration, bpf_lsm_protected_subtree - Add smoke tests: bpf_lsm_smoke - Add qa-profiles/04-allow-parent-of-protected.json sample profile - Add docs/linux-bpf-lsm-mediation.md design doc Signed-off-by: Andrew McHugh <andrew.mchugh@datadoghq.com>

…t wiring - Add protected_paths: Vec::new() to PreparedSandbox test initializers in main.rs - Fix clippy unwrap_used lint in mediation/mod.rs tests (unwrap → expect) - Fix clippy loop-index lint in mediation/server.rs (for i in 0..n → enumerate) - Run cargo fmt to fix whitespace in bpf_lsm.rs and mediation/policy.rs Signed-off-by: Andrew McHugh <andrew.mchugh@datadoghq.com>

BPF audit records for non-mediated execs had empty command and path fields. The audit_record struct carried only (dev, ino); userspace resolved those to a path via inode_to_path, which is built from the mediation deny set. Non-deny-set binaries were not in that table so both fields were left empty. Fix: add char filename[256] to struct audit_record and populate it in check_exec via bpf_probe_read_kernel_str(bprm->filename). On the Rust side, decode the field as a PathBuf fallback in handle_record when the inode_to_path lookup returns nothing. The deny-set lookup still wins when present (canonical path). As a side effect, the shim-suppression check now correctly fires for shim-routed commands whose path_str was previously always empty. Signed-off-by: Andrew McHugh <andrew.mchugh@datadoghq.com> Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Andrew McHugh <andrew.mchugh@datadoghq.com>

The Approve action previously dropped the shim's stdin/stdout/stderr fds and ran the real binary (or script) with null stdin and piped stdout/stderr buffered into ShimResponse. This worked for the original use cases — ddtool auth gitlab login, ddtool auth token X — because none of them read stdin and their stdout is just session-info prints that the shim relays to its caller's stdout transparently. It breaks for any binary that follows the Docker credential-helper protocol, where the caller writes the registry hostname to stdin and expects JSON creds back on stdout. docker-credential-ddtool launched through Approve was getting empty stdin and returning "no credentials server URL", which surfaces in Bazel's rules_oci_bootstrap as: failed to run credential helper, stdout: no credentials server URL This was first hit on Datadog Linux workspaces where the Docker auth flow is multi-process (docker-credential-ddtool -> pass -> gpg) and can't be cleanly held inside a per-command sandbox, forcing the choice of Approve to escape the sandbox — exposing the gap. Switch Approve to streaming mode: pass the shim's stdio fds into exec_passthrough/exec_script via Stdio::from(...), call wait() instead of wait_with_output(), return an empty ShimResponse stdout/ stderr (only exit code matters; the binary's output streamed directly to the caller). Buffered mode is preserved for Respond and Capture, which still need to inspect stdout (Capture issues a nonce in place of it). Behavior change for existing Approve usages: unchanged in practice. None of them read stdin, so connecting stdin doesn't break anything. They write to stdout, which previously was buffered into ShimResponse and relayed back; now it streams directly with the same end result for the user. The only observable delta is that nono's audit log entries for Approve actions no longer carry stdout/stderr content. Add stdio_fds parameter to exec_script. 
Capture's script path passes None (preserves buffered behaviour for nonce issuance); Approve's script path passes Some(stdio_fds) for streaming. Verified: bzl build of an internal Datadog target inside a sandboxed session now completes through the docker-credential-helper path. Signed-off-by: Andrew McHugh <andrew.mchugh@datadoghq.com>

drewmchugh force-pushed the am/linux-exec-filter-bpf-lsm branch 2 times, most recently from eddfce2 to 1c0993f Compare April 30, 2026 16:38

kipz and others added 14 commits May 6, 2026 12:37

feat: add Swift menu bar app for nono privilege control

6bfb71e

Adds a macOS menu bar application that discovers active nono sessions and provides a UI for reviewing and approving/denying mediated commands. Signed-off-by: James Carnegie <me@kipz.org>

docs: document command mediation

ab9dbb9

Signed-off-by: James Carnegie <me@kipz.org>

ignore missing commands when resolving mediation policy

b8f40d7

chore: untrack Swift build artifacts and add to .gitignore

f288be2

Signed-off-by: James Carnegie <me@kipz.org>

drewmchugh mentioned this pull request May 6, 2026

feat(linux): seccomp exec filter to close the full-path mediation bypass #20

Closed

3 tasks

kipz force-pushed the develop branch from 422fed4 to a77431f Compare May 6, 2026 16:59

kipz and others added 4 commits May 6, 2026 19:24

Merge pull request kipz#30 from kipz/kipz/audit-shim-source-paths

7ffd37a

fix(mediation): record audit-shim source paths to survive PATH munging

drewmchugh force-pushed the am/linux-exec-filter-bpf-lsm branch 2 times, most recently from af1ff8b to 18ac8cb Compare May 7, 2026 14:01

drewmchugh changed the title ~~feat(linux): BPF-LSM mediation filter (closes exec + read bypasses)~~ feat(linux): linux BPF-LSM mediation filter (closes exec + read bypasses) May 7, 2026

drewmchugh and others added 2 commits May 7, 2026 17:05

drewmchugh force-pushed the am/linux-exec-filter-bpf-lsm branch from 0edfd36 to 7ccf135 Compare May 8, 2026 18:32

kipz force-pushed the develop branch from a4a6fd0 to 7bd6c2a Compare May 13, 2026 14:47

kipz force-pushed the develop branch from babb9ac to aa62580 Compare May 29, 2026 08:40

kipz force-pushed the develop branch 3 times, most recently from 86b464a to 4f89e43 Compare June 4, 2026 17:07

kipz force-pushed the develop branch from 4f89e43 to e3a8fd9 Compare June 15, 2026 09:22

drewmchugh wants to merge 20 commits into kipz:develop from drewmchugh:am/linux-exec-filter-bpf-lsm
