feat(cli): $REPO_ROOT variable expansion in profile paths#1041
feat(cli): $REPO_ROOT variable expansion in profile paths#1041amitds1997 wants to merge 4 commits into
Conversation
PR Review SummarySize
Affected crates
Blast radius — ModerateThis PR touches: source code,documentation Updated automatically on each push to this PR. |
There was a problem hiding this comment.
Code Review
This pull request introduces filesystem.discover_read to support ancestor-only read discovery by walking upward from a configured root directory to an optional inclusive ceiling. It updates the JSON schema, authoring guides, and documentation, integrates the discovered paths into the pre-exec trust scan, and adds comprehensive unit tests. The review feedback suggests optimizing the path discovery logic by canonicalizing the root and ceiling paths upfront, which avoids redundant filesystem I/O inside the ancestor traversal loop.
| fn discover_read_ancestor_entries( | ||
| rule: &FilesystemAncestorDiscovery, | ||
| workdir: &Path, | ||
| ) -> Result<Vec<(PathBuf, FilesystemAncestorEntryKind)>> { | ||
| let root = expand_vars(&rule.root, workdir)?; | ||
| if !root.exists() { | ||
| return Ok(Vec::new()); | ||
| } | ||
| if !root.is_dir() { | ||
| return Err(NonoError::ConfigParse(format!( | ||
| "filesystem.discover_read root '{}' resolved to '{}' which is not a directory", | ||
| rule.root, | ||
| root.display() | ||
| ))); | ||
| } | ||
|
|
||
| let ceiling = rule | ||
| .ceiling | ||
| .as_ref() | ||
| .map(|value| expand_vars(value, workdir)) | ||
| .transpose()?; | ||
| if let Some(ref ceiling) = ceiling | ||
| && !root.starts_with(ceiling) | ||
| { | ||
| return Err(NonoError::ConfigParse(format!( | ||
| "filesystem.discover_read ceiling '{}' resolved to '{}' which is not an ancestor of root '{}'", | ||
| rule.ceiling.as_deref().unwrap_or_default(), | ||
| ceiling.display(), | ||
| root.display() | ||
| ))); | ||
| } | ||
|
|
||
| for entry in &rule.entries { | ||
| validate_discover_read_entry_path(&entry.path)?; | ||
| } | ||
|
|
||
| let mut discovered = BTreeSet::new(); | ||
| let mut current = Some(root.as_path()); | ||
| while let Some(ancestor) = current { | ||
| for entry in &rule.entries { | ||
| let candidate = ancestor.join(&entry.path); | ||
| let matched = match entry.kind { | ||
| FilesystemAncestorEntryKind::File => candidate.is_file(), | ||
| FilesystemAncestorEntryKind::Dir => candidate.is_dir(), | ||
| }; | ||
| if matched { | ||
| discovered.insert((candidate, entry.kind)); | ||
| } | ||
| } | ||
|
|
||
| if ceiling.as_ref().is_some_and(|limit| { | ||
| match (ancestor.canonicalize(), limit.canonicalize()) { | ||
| (Ok(a), Ok(c)) => a == c, | ||
| _ => ancestor == limit, | ||
| } | ||
| }) { | ||
| break; | ||
| } | ||
| current = ancestor.parent(); | ||
| } | ||
|
|
||
| Ok(discovered.into_iter().collect()) | ||
| } |
There was a problem hiding this comment.
Avoid performing filesystem I/O (such as path canonicalization) inside loops to prevent performance issues or hangs if the filesystem is slow or unresponsive. Pre-compute or pre-canonicalize paths before entering the loop.
In discover_read_ancestor_entries, ancestor.canonicalize() and limit.canonicalize() are called inside the while loop on every iteration. Since limit (the ceiling) is constant throughout the loop, and any parent of a canonicalized path is also canonicalized, we can canonicalize both root and ceiling upfront before entering the loop. This allows us to perform a simple, fast ancestor == limit comparison inside the loop without any filesystem I/O.
fn discover_read_ancestor_entries(
rule: &FilesystemAncestorDiscovery,
workdir: &Path,
) -> Result<Vec<(PathBuf, FilesystemAncestorEntryKind)>> {
let root = expand_vars(&rule.root, workdir)?;
if !root.exists() {
return Ok(Vec::new());
}
if !root.is_dir() {
return Err(NonoError::ConfigParse(format!(
"filesystem.discover_read root '{}' resolved to '{}' which is not a directory",
rule.root,
root.display()
)));
}
let root = root.canonicalize().map_err(|source| {
NonoError::PathCanonicalization {
path: root.clone(),
source,
}
})?;
let ceiling = rule
.ceiling
.as_ref()
.map(|value| expand_vars(value, workdir))
.transpose()?;
let ceiling = if let Some(c) = ceiling {
let c_canon = c.canonicalize().map_err(|source| {
NonoError::PathCanonicalization {
path: c.clone(),
source,
}
})?;
if !root.starts_with(&c_canon) {
return Err(NonoError::ConfigParse(format!(
"filesystem.discover_read ceiling '{}' resolved to '{}' which is not an ancestor of root '{}'",
rule.ceiling.as_deref().unwrap_or_default(),
c_canon.display(),
root.display()
)));
}
Some(c_canon)
} else {
None
};
for entry in &rule.entries {
validate_discover_read_entry_path(&entry.path)?;
}
let mut discovered = BTreeSet::new();
let mut current = Some(root.as_path());
while let Some(ancestor) = current {
for entry in &rule.entries {
let candidate = ancestor.join(&entry.path);
let matched = match entry.kind {
FilesystemAncestorEntryKind::File => candidate.is_file(),
FilesystemAncestorEntryKind::Dir => candidate.is_dir(),
};
if matched {
discovered.insert((candidate, entry.kind));
}
}
if ceiling.as_ref().is_some_and(|limit| ancestor == limit) {
break;
}
current = ancestor.parent();
}
Ok(discovered.into_iter().collect())
}References
- Avoid performing filesystem I/O (such as path canonicalization) inside display or diagnostic rendering loops to prevent performance issues or hangs if the filesystem is slow or unresponsive. Pre-compute or pre-canonicalize paths before entering the loop.
Canonicalize root and ceiling once before the ancestor loop so ceiling checks do not call canonicalize() on every iteration. Adjust unit tests to expect canonical discovered paths. Addresses PR always-further#1041 review feedback. Signed-off-by: Amit Singh <amitds1997@gmail.com>
|
I think this solution adds too much complexity. What about a simple feature for e.g: |
That's already supported? Could you explain this a bit more in case I'm misunderstanding something. For example, I use this right now: "read_file": [
"$HOME/.CFUserTextEncoding",
"$WORKDIR/../AGENTS.md",
"$WORKDIR/../../AGENTS.md"
],and that resolves correctly. I just want to avoid having to hard-code each nesting level. Open to restructure this PR with better approaches |
|
Hi @amitds1997, I would remove the trust-scan integration from this PR and keep it focused on filesystem grants. Folding ancestor discovery into trust scanning has quite a few extra edge cases, for example which Also, do you need directory grants for this use case?
{ "path": "AGENTS.md", "kind": "file" }I’m comfortable with this shape because the user is explicitly saying that ancestor files with that relative path are ok for the sandbox to read. With discovered dirs: the resulting recursive directory grant is conditional on launch location and filesystem layout. Depending on That may still be a valid feature, but it is a broader dynamic recursive grant than the original ancestor-file need. I’d prefer this PR be file-only unless there is a specific directory use case we want to review separately. Possible alternativeI thought a bit more about this, and a simpler alternative may be to expose a repo-root variable instead of adding ancestor discovery. For example, from a nested workdir , git rev-parse --show-toplevel
# /Users/lukehinds/dev/nononono could expose that as something like
or via a flag:
Then profiles can stay explicit and use existing This avoids hard-coding $WORKDIR/../... nesting levels without introducing dynamic ancestor discovery or recursive discovered directory grants. It also keeps the final granted paths explicit in the profile. |
|
@lukehinds Agree on both aspects, do you want me to limit this PR to just the |
|
I think $REPO_ROOT is a good approach, it means someone never need worry about setting ceilings etc |
|
I want to note that top-level resolution does not work for Git worktrees. In this case, we should also allow listing |
|
Thanks both, I'll adjust the issue and the PR accordingly. Might take me a couple of days to get to it, though. I'll convert this back to draft until then. |
Adds support for $REPO_ROOT variable expansion in all profile path fields (filesystem.*, unix_socket fields, etc.), enabling portable profiles for monorepos and workspaces. The repository root is resolved via this priority: 1. --repo-root flag or NONO_REPO_ROOT environment variable (explicit override) 2. Auto-detection via `git rev-parse --show-toplevel` from the current workdir 3. Left unexpanded when not in a git repo (path silently skipped) Resolution correctly handles: - Normal git repositories - Linked worktrees (non-bare and bare-with-worktrees layouts) - Nested repositories (innermost takes precedence) - Non-git directories (graceful no-op) Changes: - Add --repo-root flag and NONO_REPO_ROOT env var to SandboxArgs, WrapSandboxArgs, WhyArgs - Implement resolved_repo_root() with auto-detection and explicit override logic - Update expand_vars() to accept optional repo_root parameter - Update all expand_vars() call sites in capability_ext, command_runtime, profile_cmd, etc. - Update profile authoring guide with $REPO_ROOT docs and monorepo example - Add 11 test cases covering explicit flags, auto-detection, worktrees, nested repos - Add 2 integration tests in capability_ext for profile-level $REPO_ROOT expansion Backward compatible: profiles without $REPO_ROOT are unaffected. Signed-off-by: Amit Singh <amitds1997@gmail.com>
29fb5ef to
212f5ff
Compare
|
Took me longer than expected to get to this, but it's now ready for review. I've updated both the issue and the PR that the goal is now to handle I have been using it for the past day and I have not run into any "filesystem": {
"read": [
"$HOME/.config/gh", // GitHub CLI config directory
"$REPO_ROOT",
"$REPO_ROOT/../.git",
"$REPO_ROOT/../.bare",
]
},
"env_credentials": {
"env://GITHUB_TOKEN": "GITHUB_TOKEN" // For GitHub CLI
},I use a |
…ag parity - Reduce doc comment continuation indent in profile/mod.rs to satisfy clippy::doc_overindented_list_items lint - Run cargo fmt to fix long #[arg(...)] attribute lines in cli.rs, and expand_vars call lines in profile_runtime.rs and capability_ext.rs - Add --repo-root flag entry to docs/cli/usage/flags.mdx to satisfy the RunArgs docs parity check Signed-off-by: Amit Singh <amitds1997@gmail.com>
…ELOG conflict Keep [Unreleased] $REPO_ROOT entry from this branch on top of the [0.63.0] release notes brought in from upstream. Signed-off-by: Amit Singh <amitds1997@gmail.com>
6a02254 to
1b884e6
Compare
Upstream added expand_profile_set_vars in the merge, which called expand_vars with 2 args. Our branch extended the signature with a third repo_root argument. Pass None since env var value expansion does not require $REPO_ROOT resolution. Signed-off-by: Amit Singh <amitds1997@gmail.com>
1b884e6 to
1f449f1
Compare
Linked Issue
Closes #1036
Summary
Adds support for
$REPO_ROOTvariable expansion in all profile path fields (filesystem.*, unix_socket fields, etc.), enabling portable profiles for monorepos and workspaces.The repository root is resolved via this priority:
--repo-rootflag orNONO_REPO_ROOTenvironment variable (explicit override)git rev-parse --show-toplevelfrom the current workdirThis allows profiles to reference files at the repository root without hardcoding
../levels, especially useful when running from nested package directories in monorepos.Resolution Features
Changes
--repo-rootflag andNONO_REPO_ROOTenv var toSandboxArgs,WrapSandboxArgs,WhyArgsresolved_repo_root()with auto-detection and explicit override logicexpand_vars()to accept optionalrepo_rootparameterexpand_vars()call sites across the codebase$REPO_ROOTdocumentation and monorepo exampleBackward Compatibility
Profiles without
$REPO_ROOTare completely unaffected.Test Plan
Checklist
CHANGELOG.mdif needed