feat(linux): linux BPF-LSM mediation filter (closes exec + read bypasses) #26
Open
drewmchugh wants to merge 20 commits into
Introduces a mediation framework that intercepts command execution within sandboxed sessions, requiring admin approval for sensitive operations. Includes audit trail logging, Unix socket control server, and per-command sandbox shims. Signed-off-by: James Carnegie <me@kipz.org>
Adds a macOS menu bar application that discovers active nono sessions and provides a UI for reviewing and approving/denying mediated commands. Signed-off-by: James Carnegie <me@kipz.org>
Signed-off-by: James Carnegie <me@kipz.org>
Commands like `gh` authenticate via macOS Keychain using mach-lookup IPC to securityd. The per-command Seatbelt sandbox blocks these mach-lookups by default, causing 401 auth failures when gh tries to retrieve its stored GitHub token. Add `keychain_access: bool` to CommandSandbox. When true, the per-command sandbox grants read access to keychain DB files (login.keychain-db, metadata.keychain-db), which triggers the existing mach-lookup deny skip in the Seatbelt profile generator. This allows commands to authenticate via keychain while keeping all other sandbox restrictions (filesystem, network allowed_hosts) intact. Also documents that the Approve action intentionally runs without a per-command sandbox (None) because Seatbelt blocks keychain mach-lookup even with keychain file grants insufficient for all auth paths. Signed-off-by: Christine Le <christine.le@datadoghq.com> Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
After the explicit pass, `profile::expand_vars` now resolves any remaining
`$VAR` / `${VAR}` tokens (uppercase+underscore+digit names) from the
process environment. Unset vars are left literal, matching the existing
`$XDG_RUNTIME_DIR` fallback so downstream `add_sandbox_*` helpers log
"does not exist, skipping" rather than failing the session.
Also switches per-command mediation sandbox paths from `expand_home`
to `expand_vars` so `$WORKDIR`, `$HOME`, XDG vars, and any launch-time
env var (e.g. a caller-provided `$GIT_ROOT`) resolve consistently in
both the main and per-command sandboxes. `SessionCtx` gains a `workdir`
field threaded from `execution_runtime` through `session::setup` and
`server::run` so per-command expansion uses the same workdir as the
rest of nono.
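A minimal sketch of the token expansion described above, assuming a hypothetical `expand_tokens` helper with an injectable lookup (the real `profile::expand_vars` may be structured differently). It resolves `$VAR` and `${VAR}` for names made of uppercase letters, digits, and underscores, and leaves unresolved tokens literal:

```rust
/// Expand `$VAR` and `${VAR}` tokens whose names consist of ASCII uppercase
/// letters, digits, and underscores. Tokens the lookup cannot resolve are
/// copied through unchanged. `expand_tokens` is a hypothetical name.
pub fn expand_tokens(input: &str, lookup: impl Fn(&str) -> Option<String>) -> String {
    fn is_name_char(c: char) -> bool {
        c.is_ascii_uppercase() || c.is_ascii_digit() || c == '_'
    }

    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(pos) = rest.find('$') {
        out.push_str(&rest[..pos]);
        let after = &rest[pos + 1..];
        let braced = after.starts_with('{');
        let body = if braced { &after[1..] } else { after };
        let name_len = body.find(|c: char| !is_name_char(c)).unwrap_or(body.len());
        let name = &body[..name_len];
        let closed = !braced || body[name_len..].starts_with('}');
        if name.is_empty() || !closed {
            // Not a well-formed token: keep the '$' and keep scanning.
            out.push('$');
            rest = after;
            continue;
        }
        // Bytes consumed after the '$': the name plus optional braces.
        let token_len = name_len + if braced { 2 } else { 0 };
        match lookup(name) {
            Some(value) => out.push_str(&value),
            None => {
                // Unset variable: emit the token exactly as written.
                out.push('$');
                out.push_str(&after[..token_len]);
            }
        }
        rest = &after[token_len..];
    }
    out.push_str(rest);
    out
}

/// Convenience wrapper that resolves names from the process environment.
pub fn expand_from_env(input: &str) -> String {
    expand_tokens(input, |name| std::env::var(name).ok())
}
```

Passing the lookup as a closure keeps the expansion testable without mutating the process environment; `expand_from_env` is the variant that matches the environment-backed behaviour the commit describes.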
`expand_vars` already resolved $VAR / ${VAR} tokens in top-level
filesystem paths, policy.* paths, command_args, and per-command
sandbox paths. Extend the same expansion to the remaining
user-authored string fields in the profile:
- `mediation.commands[].intercept[].args_prefix` — the intercept
matcher compared each entry literally against the incoming argv,
so `"$USER"` used to be a dead string. Expanding at profile-load
time lets authors write session-aware matchers (e.g. the macOS
Keychain `security find-generic-password <user> ...` rule) without
install-time `sed` substitutions over the profile file.
- `mediation.commands[].binary_path` — consumed verbatim by `PathBuf::from`,
now expanded so profiles can point at user-specific binaries like
`$HOME/.local/bin/tool`.
- `network.custom_credentials[].{tls_ca,tls_client_cert,tls_client_key}` —
previously used the narrower `policy::expand_path` which only handled
`~`, `$HOME`, `$TMPDIR`. Switched to the full `expand_str` so any
configured env var (e.g. `$XDG_CONFIG_HOME`) resolves.
Introduces `profile::expand_str` as the string-returning core of the
expansion pipeline. `expand_vars` now delegates to it. Threads the
session `workdir` through `resolve_credentials`, `build_proxy_config_from_flags`,
`start_proxy_runtime`, and `resolve_command` so `$WORKDIR` resolves
consistently across all expansion sites.
Stamps each audit.jsonl entry with session_id, session_name, nono_pid, sandboxed_pid, and command_pid so operators can correlate commands across a session and trace the full process hierarchy.
Process hierarchy per log entry:
- nono_pid: the nono supervisor (unsandboxed parent)
- sandboxed_pid: the direct child process nono sandboxed (e.g. claude, codex)
- command_pid: the shim process that ran the specific command (e.g. echo, git)
The session_id/session_name are pre-generated in execution_runtime before mediation setup so audit.jsonl and the session record share the same values. sandboxed_pid is resolved after fork via an Arc<OnceLock<u32>> latch shared between the mediation server and the on_fork callback.
ShimRequest gains a pid field so the mediated path can record command_pid. The shim's AuditEvent also gains command_pid for the audit-only datagram path. All new AuditEvent fields use #[serde(default)] for backward compatibility with older shims that do not send them.
Signed-off-by: Christine Le <christine.le@datadoghq.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
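The `Arc<OnceLock<u32>>` latch mentioned above can be pictured with the following sketch. `AuditStamp`, `stamp`, and `demo_latch` are hypothetical names, and a thread stands in for the on_fork callback; only the latch pattern itself comes from the commit message:

```rust
use std::sync::{Arc, OnceLock};
use std::thread;

/// The three PIDs stamped on an audit entry (hypothetical struct).
#[derive(Debug)]
pub struct AuditStamp {
    pub nono_pid: u32,
    /// None until the fork callback has published the child PID.
    pub sandboxed_pid: Option<u32>,
    pub command_pid: u32,
}

/// Build a stamp, reading the latch without blocking.
pub fn stamp(latch: &OnceLock<u32>, command_pid: u32) -> AuditStamp {
    AuditStamp {
        nono_pid: std::process::id(),
        sandboxed_pid: latch.get().copied(),
        command_pid,
    }
}

/// Shows the write-once handoff: the value is absent before the callback
/// runs and visible to every holder of the Arc afterwards.
pub fn demo_latch() -> (Option<u32>, Option<u32>) {
    let latch: Arc<OnceLock<u32>> = Arc::new(OnceLock::new());
    let before = latch.get().copied();
    let for_callback = Arc::clone(&latch);
    thread::spawn(move || {
        // A second set() would return Err; the first value wins.
        let _ = for_callback.set(4242);
    })
    .join()
    .expect("callback thread panicked");
    (before, latch.get().copied())
}
```

A write-once cell fits here because the server starts before the fork happens and the PID never changes once known, so no lock is needed on the read path.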
Mediated passthrough previously buffered stdio: the shim read stdin with a 50ms timeout into a UTF-8 String, the server spawned the real binary with Stdio::piped() + wait_with_output(), and the parent's ChildStdin dropped before the child could read. ssh/git over a binary pipe hit SIGPIPE; every long-running mediated command (gh, kubectl, ...) silently buffered output until exit. The shim now sends its stdin/stdout/stderr fds via SCM_RIGHTS after the JSON request. The server passes them straight to the real binary via Stdio::from(...) for no-intercept passthrough (and the allow_commands sub-branch and admin_passthrough), then wait()s. Capture/Respond/Approve drop the passed fds and keep the existing buffered behaviour so they can still inspect or transform the output. The shim's stdin field is removed from ShimRequest (so was the read_stdin_nonblocking helper); shim and server are versioned together, no wire compatibility is needed. Tests: added a streaming socketpair harness, a binary-roundtrip test covering 0xFF bytes through stdin/stdout, and a Respond-path test that verifies the dropped fds let the test side see EOF.
… command
Adds an optional CallerPolicy on each CommandEntry:
- agent_allowed: bool (default true). Whether the primary sandbox (no NONO_SANDBOX_CONTEXT) may invoke this command.
- allowed_parents: Option<Vec<String>>. Restricts which mediated parents may invoke the command. None (field absent) accepts any parent (existing behaviour). Some([]) blocks every mediated parent. Some(["git"]) permits only the listed parents.
The gate fires at the top of apply(), before the existing allow_commands skip-intercepts branch, so a parent allowed by allowed_parents still flows through unchanged. Rejected calls return exit 126 with action_type "denied".
Defaults preserve full backward compatibility: a profile with no caller_policy field on a command keeps current behaviour (agent + any parent allowed).
Use case: ssh / ssh-keygen in shadowfax. Set agent_allowed=false, allowed_parents=["git"] so a malicious prompt cannot invoke them directly to authenticate to attacker-controlled hosts (or sign arbitrary data) using the user's keys via the per-command sandbox.
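The gate semantics can be sketched as below. The field names and defaults follow the commit message; `caller_permitted` and the way the calling parent is represented (`None` for the primary sandbox, `Some(name)` for a mediated parent) are assumptions made for illustration:

```rust
/// Per-command caller restrictions, mirroring the fields described above.
#[derive(Debug, Clone)]
pub struct CallerPolicy {
    pub agent_allowed: bool,
    pub allowed_parents: Option<Vec<String>>,
}

impl Default for CallerPolicy {
    /// Absent policy means agent and any parent are allowed.
    fn default() -> Self {
        Self { agent_allowed: true, allowed_parents: None }
    }
}

/// Exit code returned to the shim for rejected calls.
pub const EXIT_DENIED: i32 = 126;

/// `parent` is None when the call comes from the primary sandbox and
/// Some(name) when it comes from inside a mediated command (hypothetical
/// representation of the NONO_SANDBOX_CONTEXT check).
pub fn caller_permitted(policy: &CallerPolicy, parent: Option<&str>) -> bool {
    match parent {
        None => policy.agent_allowed,
        Some(name) => match &policy.allowed_parents {
            None => true, // field absent: any mediated parent
            Some(list) => list.iter().any(|p| p == name),
        },
    }
}
```

For the ssh use case, a policy of `agent_allowed: false` with `allowed_parents: Some(vec!["git".into()])` rejects a direct agent call and a call from `gh`, while a call from `git` passes.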
Signed-off-by: James Carnegie <me@kipz.org>
Sending three separate sendmsg(SCM_RIGHTS) calls for stdin/stdout/stderr fails with EMSGSIZE (os error 40) on macOS when the socket receive buffer already holds a large JSON request — as happens when git is invoked from within gh's execution sandbox, which adds proxy and session env vars. Replace the three individual sends with a single sendmsg carrying all three fds in one SCM_RIGHTS control message, and update recv_three_fds to receive them in a matching single recvmsg call. This is the standard idiom for passing multiple fds and avoids the macOS control-buffer constraint entirely. Fixes: DataDog/shadowfax#95
The shim now sends its own cwd in `ShimRequest.cwd`, and the server sets it on the spawned binary via `Command::current_dir`. Without this, the spawned binary inherited the mediation server's launch cwd. Tools that resolve config from cwd silently operated on the wrong target. `git` in a worktree is the canonical case: discovery would walk up from the server's cwd, find the wrong `.git`, and report the wrong branch and toplevel.
The new field is `Option<String>` with `#[serde(default)]`:
- old shim → new server: missing field → legacy behaviour (server cwd)
- new shim → old server: extra field is ignored
- unreadable cwd: shim sends None → legacy behaviour
- non-directory cwd: server logs a warning and falls back to its own cwd
Adds `test_passthrough_uses_request_cwd`: drives `apply()` end-to-end with a real `/bin/pwd` and asserts the spawned process prints the caller's cwd, not the server's.
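The fallback rules in the list above reduce to a small decision function. This is a sketch under the assumption of a hypothetical `resolve_spawn_cwd` helper; the server's actual code path and log format are not shown in the commit:

```rust
use std::path::{Path, PathBuf};

/// Pick the working directory for the spawned binary.
/// - Some(dir) that is a directory: use the caller's cwd.
/// - Some(dir) that is not a directory: warn and use the server's cwd.
/// - None (old shim or unreadable cwd): use the server's cwd.
pub fn resolve_spawn_cwd(requested: Option<&str>, server_cwd: &Path) -> PathBuf {
    match requested.map(Path::new) {
        Some(dir) if dir.is_dir() => dir.to_path_buf(),
        Some(dir) => {
            eprintln!(
                "warning: requested cwd {} is not a directory; using server cwd",
                dir.display()
            );
            server_cwd.to_path_buf()
        }
        None => server_cwd.to_path_buf(),
    }
}
```

The result would then be handed to `Command::current_dir` before spawning, which is the call the commit names.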
The mediation/* and nono-shim formatting drift accumulated across the 13-commit mediation series. Folding the fmt fixes into the originating commits caused conflicts because subsequent mediation commits re-touched the same lines, so capture the cumulative fmt result here instead.
Signed-off-by: James Carnegie <me@kipz.org>
When nono creates an audit shim for a binary at session start, the shim later runs `resolve_real_binary` which re-walks PATH to find the real target. Intermediate shells (e.g. husky pre-commit hooks, lint-staged workers) often munge PATH between session start and shim invocation, stripping user toolchain dirs that contained the real binary. The walk then returns nothing and the shim reports `nono-shim: <name>: command not found` even though the real binary is still on disk at the path nono saw at session start.
This change makes audit shim resolution deterministic by recording the resolved absolute path at session start and consulting it first at exec time:
- A new sibling dir `<session_dir>/shim-sources/` holds one sidecar per shim: a file named `<name>` containing the absolute path of the binary the shim was created for. Both mediated commands and universal audit shims write a sidecar.
- `SessionHandle` exposes `shim_sources_dir` and the path is forwarded to mediated subprocesses via a new `NONO_SHIM_SOURCES_DIR` env var (alongside `NONO_SHIM_DIR`). The session-level dir is reused for filtered per-command sandboxes; sidecars are created once and shared, since the recorded paths do not change.
- `nono-shim::resolve_real_binary` now consults `NONO_SHIM_SOURCES_DIR/<name>` first and only falls back to the existing PATH walk if the sidecar is missing or its recorded path is no longer an executable file.
Tests:
- 8 new unit tests in nono-shim cover sidecar hits, missing dirs, trimming, deleted/non-executable targets, and the PATH-walk fallback.
- 2 new unit tests in mediation::session cover sidecar writes and overwrites.
- The existing `test_allow_commands_sets_nono_shim_dir_to_filtered_dir` test is extended to assert `NONO_SHIM_SOURCES_DIR` is forwarded unchanged into the per-command sandbox env.
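The sidecar-first lookup can be sketched as follows. The function signature is an assumption (the real `resolve_real_binary` reads `NONO_SHIM_SOURCES_DIR` and `PATH` from the environment, and presumably skips the shim directory during the walk, which this sketch omits):

```rust
use std::fs;
use std::os::unix::fs::PermissionsExt;
use std::path::{Path, PathBuf};

/// A regular file with at least one execute bit set.
fn is_executable_file(path: &Path) -> bool {
    fs::metadata(path)
        .map(|m| m.is_file() && m.permissions().mode() & 0o111 != 0)
        .unwrap_or(false)
}

/// Resolve `name` to the real binary: sidecar first, PATH walk second.
pub fn resolve_real_binary(
    name: &str,
    sources_dir: Option<&Path>,
    path_var: &str,
) -> Option<PathBuf> {
    if let Some(dir) = sources_dir {
        // The sidecar holds one absolute path, possibly newline-terminated.
        if let Ok(recorded) = fs::read_to_string(dir.join(name)) {
            let candidate = PathBuf::from(recorded.trim());
            if candidate.is_absolute() && is_executable_file(&candidate) {
                return Some(candidate);
            }
        }
    }
    // Fallback: the first executable match on the (possibly munged) PATH.
    std::env::split_paths(path_var)
        .map(|dir| dir.join(name))
        .find(|candidate| is_executable_file(candidate))
}
```

Because the sidecar is validated at exec time rather than trusted blindly, a binary deleted mid-session degrades to the old PATH-walk behaviour instead of producing a broken exec.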
fix(mediation): record audit-shim source paths to survive PATH munging
…ediation foundation
Transplants the BPF-LSM mediation and protected-subtree work from drewmchugh:am/linux-exec-filter-bpf-lsm onto kipz/develop (v0.47.0).
Key changes:
- Add crates/nono/src/bpf/ (mediation.bpf.c, vmlinux.h) and the libbpf-rs loader (bpf_lsm.rs) + audit ring-buffer reader (bpf_audit.rs)
- Update nono/Cargo.toml + build.rs to compile the BPF program at build time
- Update sandbox/mod.rs to expose bpf_lsm + bpf_audit modules
- Add nono-cli bpf-lsm feature that propagates nono/bpf-lsm
- Wire protected_paths field through PreparedSandbox → ExecutionFlags → SupervisedRuntimeContext → exec_strategy::SupervisorConfig
- Add mediation_filter_state() helper in execution_runtime to extract deny_set + shim/audit dirs from SessionHandle
- Fix mediation/policy.rs Sandbox::apply() return type mismatch (pre-existing develop bug: Linux returns SeccompNetFallback not ())
- Add bpf_lsm BPF-LSM skip in policy::validate_deny_overlaps
- Add bpf_lsm_protected_roots_for_session() in protected_paths (reads filesystem.deny from new schema)
- Add integration tests: bpf_lsm_integration, bpf_lsm_protected_subtree
- Add smoke tests: bpf_lsm_smoke
- Add qa-profiles/04-allow-parent-of-protected.json sample profile
- Add docs/linux-bpf-lsm-mediation.md design doc
Signed-off-by: Andrew McHugh <andrew.mchugh@datadoghq.com>
…t wiring
- Add protected_paths: Vec::new() to PreparedSandbox test initializers in main.rs
- Fix clippy unwrap_used lint in mediation/mod.rs tests (unwrap → expect)
- Fix clippy loop-index lint in mediation/server.rs (for i in 0..n → enumerate)
- Run cargo fmt to fix whitespace in bpf_lsm.rs and mediation/policy.rs
Signed-off-by: Andrew McHugh <andrew.mchugh@datadoghq.com>
BPF audit records for non-mediated execs had empty command and path fields. The audit_record struct carried only (dev, ino); userspace resolved those to a path via inode_to_path, which is built from the mediation deny set. Non-deny-set binaries were not in that table so both fields were left empty.
Fix: add char filename[256] to struct audit_record and populate it in check_exec via bpf_probe_read_kernel_str(bprm->filename). On the Rust side, decode the field as a PathBuf fallback in handle_record when the inode_to_path lookup returns nothing. The deny-set lookup still wins when present (canonical path).
As a side effect, the shim-suppression check now correctly fires for shim-routed commands whose path_str was previously always empty.
Signed-off-by: Andrew McHugh <andrew.mchugh@datadoghq.com>
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
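A sketch of the Rust-side fallback described above, assuming hypothetical helper names (`filename_from_record`, `resolve_exec_path`). It decodes the NUL-terminated `filename[256]` buffer only when the deny-set lookup produced nothing:

```rust
use std::ffi::OsStr;
use std::os::unix::ffi::OsStrExt;
use std::path::PathBuf;

/// Decode a NUL-terminated path from the fixed-size record field.
/// Returns None when the kernel left the field empty.
pub fn filename_from_record(raw: &[u8; 256]) -> Option<PathBuf> {
    // An unterminated buffer is treated as a full 256-byte name.
    let len = raw.iter().position(|&b| b == 0).unwrap_or(raw.len());
    if len == 0 {
        return None;
    }
    // Paths are bytes on Linux; avoid a lossy UTF-8 conversion.
    Some(PathBuf::from(OsStr::from_bytes(&raw[..len])))
}

/// The deny-set (canonical) path wins; the record field is the fallback.
pub fn resolve_exec_path(deny_set_path: Option<PathBuf>, raw: &[u8; 256]) -> Option<PathBuf> {
    deny_set_path.or_else(|| filename_from_record(raw))
}
```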
The Approve action previously dropped the shim's stdin/stdout/stderr
fds and ran the real binary (or script) with null stdin and piped
stdout/stderr buffered into ShimResponse. This worked for the original
use cases — ddtool auth gitlab login, ddtool auth token X — because
none of them read stdin and their stdout is just session-info prints
that the shim relays to its caller's stdout transparently.
It breaks for any binary that follows the Docker credential-helper
protocol, where the caller writes the registry hostname to stdin and
expects JSON creds back on stdout. docker-credential-ddtool launched
through Approve was getting empty stdin and returning "no credentials
server URL", which surfaces in Bazel's rules_oci_bootstrap as:
failed to run credential helper, stdout: no credentials server URL
This was first hit on Datadog Linux workspaces where the Docker
auth flow is multi-process (docker-credential-ddtool -> pass -> gpg)
and can't be cleanly held inside a per-command sandbox, forcing the
choice of Approve to escape the sandbox — exposing the gap.
Switch Approve to streaming mode: pass the shim's stdio fds into
exec_passthrough/exec_script via Stdio::from(...), call wait()
instead of wait_with_output(), return an empty ShimResponse stdout/
stderr (only exit code matters; the binary's output streamed
directly to the caller). Buffered mode is preserved for Respond
and Capture, which still need to inspect stdout (Capture issues a
nonce in place of it).
Behavior change for existing Approve usages: unchanged in practice.
None of them read stdin, so connecting stdin doesn't break anything.
They write to stdout, which previously was buffered into ShimResponse
and relayed back; now it streams directly with the same end result
for the user. The only observable delta is that nono's audit log
entries for Approve actions no longer carry stdout/stderr content.
Add stdio_fds parameter to exec_script. Capture's script path passes
None (preserves buffered behaviour for nonce issuance); Approve's
script path passes Some(stdio_fds) for streaming.
Verified: bzl build of an internal Datadog target inside a sandboxed
session now completes through the docker-credential-helper path.
Signed-off-by: Andrew McHugh <andrew.mchugh@datadoghq.com>
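The streaming mode described in this commit message comes down to handing the child real file descriptors and calling `wait()` instead of `wait_with_output()`. A minimal sketch, with plain `File`s standing in for the fds received from the shim (`run_streaming` is a hypothetical name, not the PR's `exec_passthrough`):

```rust
use std::fs::File;
use std::io;
use std::process::{Command, ExitStatus, Stdio};

/// Run `program` with its stdio wired directly to the given files.
/// Nothing is buffered in the parent; only the exit status comes back.
pub fn run_streaming(
    program: &str,
    args: &[&str],
    stdin: File,
    stdout: File,
    stderr: File,
) -> io::Result<ExitStatus> {
    Command::new(program)
        .args(args)
        .stdin(Stdio::from(stdin))
        .stdout(Stdio::from(stdout))
        .stderr(Stdio::from(stderr))
        .spawn()?
        // wait() rather than wait_with_output(): the child owns the fds,
        // so binary data and long-running output pass through untouched.
        .wait()
}
```

This is also why the audit entries for Approve no longer carry stdout/stderr content: the parent never sees the bytes.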
Summary
Replaces the seccomp-unotify exec filter from #20 with a BPF-LSM mediation filter that moves the mediation decision into the kernel (closing the TOCTOU race a sibling thread sharing the trapped task's memory could exploit), and adds a `file_open` LSM hook so the agent cannot read the mediated binary's bytes at all, closing copy-the-binary, ld-linux trick, unprivileged-tmpfs, and shellcode bypasses in addition to the original direct-path exec bypass #20 was scoped to.
This is a fork-and-replace of #20 rather than a follow-on commit: the seccomp-unotify filter, its supervisor handler, and the shebang-chain walker are deleted; BPF-LSM is the sole enforcement path for mediated commands. Sessions where mediation is active hard-fail at startup if BPF-LSM isn't installable, surfacing the specific reason (missing kernel cmdline / caps).
How the kernel side decides
Two BPF-LSM hooks share a single `(dev, ino)` deny map and a cgroup-ancestor scope check:
- `bprm_check_security`: fires after the kernel has resolved the binary the call will actually load (`bprm->file`); a `-EACCES` return atomically aborts the exec syscall, with no race against any user-memory pointer the agent controlled.
- `file_open`: fires inside `do_filp_open` for every successful path resolution that yields a file descriptor; denying opens of mediated inodes prevents the agent from reading the binary's bytes through any path (cp / cat / mmap / splice / dynamic linker).
Per-session scoping by cgroup ancestry: the broker creates a per-session cgroup at session start, the agent joins it as its first post-fork action, and the BPF program walks the calling task's cgroup ancestor chain looking for the session cgroup id. Cgroup membership inherits on `fork()` and is unaffected by reparenting, so daemonized agent descendants stay scoped. Ancestry (not equality) closes a sub-cgroup-escape where an agent with `CAP_SYS_ADMIN` could `mkdir` a child cgroup of the session and move into it.
Bypass classes closed
- `/usr/bin/gh`: `bprm_check_security` denies the exec
- `bprm->file`, not user-memory pointers
- `file_open` denies the read of mediated bytes
- `/lib/ld-linux-x86-64.so.2 /usr/bin/gh`: `file_open` denies the linker's `open()`
- `file_open` denies the read at the source
- `file_open` denies the source read
- `(dev, ino)`, hardlinks share the inode
- `file_open` walks dentry parents against `protected_roots`
- `file_open` path (open-for-write)
- `file_open` (mmap setup opens the file)
- `inode_*` hooks return `-EACCES`
- `inode_create` / `inode_mkdir` / `inode_symlink` / `inode_link`
- `inode_setattr`
- `protected_roots`
The vfork-bomb residual #20 left open empirically went from 49/600 bypasses (under the seccomp-unotify filter on this branch with cgroup scoping but no BPF) to 0/600 under BPF-LSM. The pthread variant remains 0/600.
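The ancestry-versus-equality point from the cgroup scoping paragraph can be modelled in userspace. This is only an illustration: the real check runs inside the BPF program against kernel cgroup structures, and the parent map, `MAX_WALK` bound, and function name here are assumptions:

```rust
use std::collections::HashMap;

/// Hypothetical bound on the walk, standing in for the fixed loop bound a
/// BPF verifier would require.
const MAX_WALK: usize = 32;

/// `parents` maps a cgroup id to its parent id; the root has no entry.
/// Returns true when `task_cgroup` is the session cgroup or any descendant.
pub fn in_session_scope(
    parents: &HashMap<u64, u64>,
    task_cgroup: u64,
    session_cgroup: u64,
) -> bool {
    let mut current = task_cgroup;
    for _ in 0..MAX_WALK {
        if current == session_cgroup {
            return true;
        }
        match parents.get(&current) {
            Some(&parent) => current = parent,
            None => return false, // reached the root without a match
        }
    }
    false
}
```

An equality test would report a task in a child cgroup of the session as out of scope, which is exactly the sub-cgroup escape the ancestor walk closes.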
Deployment requirements
- `lsm=...,bpf` in kernel cmdline. The host kernel needs `CONFIG_BPF_LSM=y`; verify via `grep bpf_lsm_bprm_check_security /proc/kallsyms`.
- `setcap cap_bpf,cap_sys_admin,cap_dac_override+ep /usr/bin/nono`. CAP_BPF for the BPF program load; CAP_SYS_ADMIN for the cgroup-namespace privilege check on `mkdir`; CAP_DAC_OVERRIDE because cgroup v2 `mkdir` runs the VFS DAC check before the cgroup-namespace privilege check, and CAP_SYS_ADMIN does not subsume DAC.
- `PR_SET_NO_NEW_PRIVS=1` set before agent execve (existing nono behaviour) so file caps don't apply. The broker also verifies `CapEff: 0` on the agent post-execve and emits an explicit invariants log line at session start.
If any requirement is unmet, the broker fails session start with an explicit error pointing at the specific fix (kernel cmdline / caps / config). There is no silent partial enforcement.
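Two of these preconditions can be checked from text the kernel already exposes. The sketch below parses the contents of `/sys/kernel/security/lsm` and the `CapEff:` line of `/proc/<pid>/status`; both helper names are hypothetical, and the broker's actual checks may differ:

```rust
/// True when the comma-separated active-LSM list names `bpf` exactly
/// (a substring test would wrongly accept other names containing "bpf").
pub fn lsm_list_includes_bpf(lsm_list: &str) -> bool {
    lsm_list.trim().split(',').any(|name| name.trim() == "bpf")
}

/// Parse the `CapEff:` line from /proc/<pid>/status text.
/// Some(true): effective capability mask is zero.
/// Some(false): at least one capability is effective.
/// None: the line is missing or not valid hex.
pub fn cap_eff_is_zero(status: &str) -> Option<bool> {
    let line = status.lines().find(|l| l.starts_with("CapEff:"))?;
    let hex = line["CapEff:".len()..].trim();
    u64::from_str_radix(hex, 16).ok().map(|mask| mask == 0)
}
```

A caller would feed these with `std::fs::read_to_string("/sys/kernel/security/lsm")` and the agent's status file, and treat `None` as a failed invariant rather than a pass.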
Filesystem subtree deny (extension)
The first BPF-LSM hook pair gates binaries by inode. The same infrastructure now extends to gating filesystem subtrees, the same problem macOS Seatbelt solves with `(deny file-write* (subpath ...))` rules.
Why this exists
Three pre-existing Linux gaps motivated this:
- `validate_caps_against_protected_roots` hard-rejected on Linux when a profile granted a parent of `~/.nono` (e.g. `--workdir=$HOME`), because Landlock can't express deny-within-allow. macOS handled the same case via Seatbelt deny rules. The user-visible symptom (Jira WRK-2585): `claude` from `$HOME` on a workspace fails sandbox init with `nono: Sandbox initialization failed: Refusing to grant '/home/bits' (source: Profile) because it overlaps protected nono state root '/home/bits/.nono'.`
- `policy.add_deny_access` (`crates/nono-cli/src/profile/mod.rs:98`) was a no-op on Linux: `add_deny_access_rules` short-circuited at `if cfg!(target_os = "macos")`. shadowfax's Linux profile declares `add_deny_access: ["$HOME/.config/password-store/gpg"]` and the deny was silently dropped.
- `validate_deny_overlaps` rejected any session whose allow set covered a required policy-group deny (`deny_credentials`, `deny_keychains_linux`, …), forcing Linux profiles to enumerate their allows narrowly enough to dodge the overlap.
All three are now resolved by routing the deny-within-allow enforcement through BPF-LSM at the kernel level.
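To make "deny-within-allow" concrete, here is a path-based model of the decision: a deny root nested inside an allow root takes precedence. This is an illustration only. The PR enforces the rule in the kernel on `(dev, ino)` identities rather than on path strings, and `decide` and `Verdict` are hypothetical names:

```rust
use std::path::Path;

#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Allow,
    Deny,
}

/// Deny roots are consulted first, so a protected subtree stays closed even
/// when one of its ancestors is granted. Anything outside every allow root
/// is denied by default.
pub fn decide(path: &Path, allow_roots: &[&Path], deny_roots: &[&Path]) -> Verdict {
    // Path::starts_with compares whole components, so `/home/bits/.nono`
    // does not match `/home/bits/.nono-other`.
    if deny_roots.iter().any(|root| path.starts_with(root)) {
        return Verdict::Deny;
    }
    if allow_roots.iter().any(|root| path.starts_with(root)) {
        Verdict::Allow
    } else {
        Verdict::Deny
    }
}
```

With `/home/bits` allowed and `/home/bits/.nono` denied, the session from the Jira symptom above can start while the state root stays unreadable.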
Kernel side (commit a5b6a85)
- A new `protected_roots` `BPF_MAP_TYPE_HASH` (capacity 64, keyed by `(dev, ino)`).
- `dentry_in_protected_subtree` walks up to `MAX_DENTRY_DEPTH=16` levels via `BPF_CORE_READ` on `d_parent`, verifier-friendly with the same `#pragma unroll` shape as the existing cgroup walker.
- `check_file_open` extended to also consult the new map; eight new SEC blocks (`inode_unlink`, `inode_rmdir`, `inode_rename`, `inode_create`, `inode_mkdir`, `inode_symlink`, `inode_link`, `inode_setattr`) cover the structural mutations that don't go through `file_open`.
- Two new audit reason codes (`protected_open_deny`, `protected_mutate_deny`) carry through the existing ringbuf and JSONL pipeline.
- `bpf_d_path` was considered and rejected: it returns mount-namespace-aware paths (the wrong granularity for the bind-mount case below), and several of the new hooks operate on `dentry` directly with no `struct path` to hand to it.
Userspace side (commit 00063f4)
- `protected_paths.rs`: `target_os = "macos"` gate dropped from `validate_requested_path_against_protected_roots`. With `allow_parent_of_protected: true` set on the profile, both platforms admit the parent grant; the OS sandbox layer enforces the subtree at runtime (Seatbelt rules on macOS, BPF-LSM hooks on Linux). New `bpf_lsm_protected_roots_for_session(state_roots, add_deny_access, policy_group_denies)` merges all three deny sources and runs `add_deny_access` strings through `policy::expand_path` so `$HOME/...` and `~/...` placeholders become concrete paths the populator can stat (without the expansion the literal string would never make it into the kernel map; this bug was caught during end-to-end validation).
- `policy.rs`: `validate_deny_overlaps` no-ops on Linux when BPF-LSM is available. The kernel hooks return `-EACCES` for any in-subtree access regardless of what Landlock allows, so the silent-drop motivation no longer holds. Hosts without BPF-LSM (no `lsm=...,bpf` in cmdline, no `cap_bpf`) still hit the existing rejection.
- `bpf_lsm.rs`: `install_mediation_filter` takes a `protected_paths: &[PathBuf]` slice. The populator stats each canonical path, inserts `(dev, ino)`, then parses `/proc/self/mountinfo` to enumerate bind-mount source roots mounted at-or-under any protected path. This is needed because the dentry parent walk follows the source filesystem tree, not the mount tree (e.g. shadowfax's `~/.nono/sessions` bind-mount). Both the directly-listed inode and the bind-mount source inode go into the same map; the BPF walker doesn't care which.
- `exec_strategy.rs`: BPF-LSM now installs when EITHER the exec deny set OR the protected_roots set is non-empty (a profile with no `mediation.commands` but with `add_deny_access` or any policy-group deny still gets enforcement). `protected_paths` flows through `PreparedSandbox` → `ExecutionFlags` → `SupervisedRuntimeContext` → `SupervisorConfig` → `install_mediation_filter`.
- Audit log dir falls back to `~/.nono/sessions` when only protected_paths is active (no exec mediation session).
- `crates/nono-cli/tests/bpf_lsm_protected_subtree.rs` covering read / write / mmap / unlink / rmdir / rename / create / bind-mount / regression / `~/.nono` / `add_deny_access` / opt-in session start.
Documentation (commit 378f12c)
- `docs/linux-bpf-lsm-mediation.md`: new `## Filesystem subtree deny` section between Performance and Deployment requirements, with hook coverage table, identity model, bind-mount handling, audit, performance numbers, and profile schema. Summary section linked.
- `qa-profiles/04-allow-parent-of-protected.json`: sample profile granting `$HOME` with the opt-in plus `add_deny_access` for `$HOME/.config/qa-secret`. Used for manual validation.
Performance (microbench)
- `cat /etc/passwd` inside the sandbox: ~1879 ms (release nono).
- `cat`. Each `cat` does ~50 file opens (loader resolving libs, `/etc/passwd` itself), so ~2-3 µs per BPF-checked `file_open` in the busy path.
- `claude -p hello`: native 1.85 s, sandboxed 4.6 s. The 2.75 s overhead is dominated by the existing nono setup costs that PR #26 already incurred; the new BPF code is in the noise.
- `make -j` with thousands of forks per second), where the existing nono machinery already dominates the budget.
What changed vs #20
Deleted:
- `install_seccomp_exec_filter` and its BPF program builder from `sandbox/linux.rs`
- `handle_exec_notification`, `classify_exec_path`, `ExecDecision`, `read_path_at`, `read_execve_argv_at`, `count_threads` from `supervisor_linux.rs`
- `mediation/shebang.rs` (the shebang-chain walker)
- `mediation/filter_audit.rs` (the seccomp-era audit emitter, replaced by a library-side reader for the BPF ringbuf)
- `exec_notify_fd` SCM_RIGHTS plumbing in `exec_strategy.rs`
Added (original BPF-LSM work):
- `crates/nono/src/bpf/mediation.bpf.c`: two LSM programs sharing a deny map + scope check, plus a `BPF_MAP_TYPE_RINGBUF` for audit records
- `crates/nono/src/sandbox/bpf_lsm.rs`: loader (`install_mediation_filter`, `MediationFilterHandle`, `SessionCgroup`)
- `crates/nono/src/sandbox/bpf_audit.rs`: userspace ring-buffer reader that appends JSONL events to `~/.nono/sessions/audit.jsonl`
- `crates/nono/tests/bpf_lsm_smoke.rs`: load + attach unit tests
- `crates/nono-cli/tests/bpf_lsm_integration.rs`: 16 end-to-end tests organised into exec mediation / read mediation / audit / composition groups, with self-skip if the test binary lacks the BPF caps
- `make test-integration`: build → setcap → cargo-test target so the integration suite actually runs in CI
Added (filesystem subtree deny extension):
- `crates/nono/src/bpf/mediation.bpf.c` extended: `protected_roots` map, `dentry_in_protected_subtree`, `file_in_protected_subtree`, extended `check_file_open`, eight new `inode_*` SEC blocks, two new audit reason codes
- `crates/nono/src/sandbox/bpf_lsm.rs` extended: `MAX_PROTECTED_ROOTS=64`, eight new `Link` fields on `MediationFilterHandle`, `attached_program_count()`, `protected_roots_map()`, `collect_protected_root_entries`, `bind_mount_sources_under`, `decode_mountinfo_octal`, two new `BpfLsmError` variants
- `crates/nono-cli/src/protected_paths.rs` extended: `bpf_lsm_protected_roots_for_session`, `target_os = "macos"` gate dropped
- `crates/nono-cli/src/policy.rs`: `validate_deny_overlaps` no-ops on Linux when BPF-LSM is available
- `crates/nono-cli/src/sandbox_prepare.rs` + the `ExecutionFlags` / `SupervisedRuntimeContext` / `SupervisorConfig` chain: thread `protected_paths` end-to-end
- `crates/nono-cli/tests/bpf_lsm_protected_subtree.rs`: 12 new integration tests
- `qa-profiles/04-allow-parent-of-protected.json`: sample profile
Schema change (`audit.jsonl` entries from BPF):
- `action_type`: `"allow_unmediated"` / `"deny"` (no `exec_filter_` prefix)
- `reason` (only on deny): `"exec_deny"` / `"open_deny"` / `"protected_open_deny"` / `"protected_mutate_deny"`
- `interpreter_chain`: dropped (kernel resolves shebangs internally; BPF fires for the actually-loaded binary)
- `exit_code: 126` on deny, absent on allow
The shim's own `mediation::AuditEvent` shape is unchanged; both event shapes interleave in the same JSONL file.
Validation
Performed end-to-end on the BPF-LSM workspace AMI (`am/bpf-lsm-workspace-ami` on dd-source, which adds `lsm=...,bpf` to the kernel cmdline). Each check was re-run after every phase:
- `cargo clippy --workspace --all-targets --all-features -- -D warnings -D clippy::unwrap_used` clean
- `cargo fmt --all -- --check` clean
- `cargo test -p nono --lib`: 628/628 pass
- `sudo -E cargo test -p nono --test bpf_lsm_smoke`: 7/7 pass (5 original + 2 new for protected_roots)
- `make test-integration`: 16/16 (original) pass, plus 12/12 new in `bpf_lsm_protected_subtree`
- `cargo test -p nono-cli --test bpf_lsm_integration` (no caps): self-skip cleanly with informative messages
- `cargo test -p nono-cli --bins protected_paths`: 9/9 unit tests pass (including the rewritten Linux opt-in cases)
- `claude` end-to-end against `qa-profiles/04-allow-parent-of-protected.json` from `$HOME`: session starts cleanly, all deny scenarios fire, audit records `protected_open_deny` and `protected_mutate_deny` events
- `claude` end-to-end against the shadowfax production Linux profile (`DataDog/shadowfax/deployments/linux/profiles/claude.json`) extended with `allow_parent_of_protected: true`: session starts cleanly, `claude -p "what is 2+2?"` returns "4" (no productivity regression), all deny scenarios fire including `~/.config/password-store/gpg/key` from `add_deny_access`
Documentation
- `docs/linux-bpf-lsm-mediation.md`
- `docs/linux-exec-filter-bpf-lsm-impl-decisions.md`
- `docs/linux-exec-filter-vfork-decisions.md`
- `docs/archive/linux-exec-filter-plan-seccomp-era.md`
Test plan
- `am/bpf-lsm-workspace-ami` and confirms `cat /sys/kernel/security/lsm` includes `bpf`
- `make build-release && sudo setcap cap_bpf,cap_sys_admin,cap_dac_override+ep target/release/nono`
- `sudo -E cargo test -p nono --test bpf_lsm_smoke`: expect 7 pass (5 original + 2 new)
- `make test-integration`: expect 16 (original) + 12 (new `bpf_lsm_protected_subtree`) = 28 pass
- `~/.nono/sessions/audit.jsonl` receives `allow_unmediated` and `deny` / `open_deny` entries with the expected schema
- `qa-profiles/04-allow-parent-of-protected.json`, verify `protected_open_deny` and `protected_mutate_deny` audit events fire when the agent reads/writes `$HOME/.config/qa-secret`
🤖 Generated with Claude Code