Skip to content

fix(sandbox): resolve paths in read_file/write_file content for LocalSandbox#1935

Open
JasonOA888 wants to merge 4 commits intobytedance:mainfrom
JasonOA888:fix/issue-1778-local-sandbox-path-consistency
Open

fix(sandbox): resolve paths in read_file/write_file content for LocalSandbox#1935
JasonOA888 wants to merge 4 commits intobytedance:mainfrom
JasonOA888:fix/issue-1778-local-sandbox-path-consistency

Conversation

@JasonOA888
Copy link
Copy Markdown
Contributor

Fixes #1778

Problem

In LocalSandbox mode, read_file and write_file do not transform container paths in file content. This causes issues when:

  1. Agent writes a Python script with /mnt/user-data/... paths
  2. Agent executes the script via bash tool
  3. Script fails because the paths are virtual, not real system paths

The bash tool already resolves paths in commands and reverse-resolves in output, but file tools do not.

Solution

  • write_file: resolve virtual paths in content to system paths before writing
  • read_file: reverse-resolve system paths back to virtual paths in returned content

This makes path handling consistent across all LocalSandbox tools.

Example

Before:

# Agent writes file with virtual path
write_file("/mnt/user-data/workspace/test.py", "open("/mnt/user-data/outputs/out.txt")")
# bash executes: python /mnt/user-data/workspace/test.py
# → FAILS: /mnt/user-data/outputs does not exist

After:

# write_file converts content paths to system paths
write_file("/mnt/user-data/workspace/test.py", "open("/mnt/user-data/outputs/out.txt")")
# File now contains real path, e.g. "/home/user/deer-flow/user-data/outputs/out.txt"
# bash executes: python /mnt/user-data/workspace/test.py
# → WORKS: output path resolves correctly

Notes

  • Same transformation logic as execute_command uses
  • read_file reverses the transformation so agents see consistent virtual paths
  • No change to non-LocalSandbox modes

…Sandbox

In LocalSandbox mode, read_file and write_file now transform
container paths in file content, matching the path handling
behavior of bash tool.

- write_file: resolves virtual paths in content to system paths
  before writing, so scripts with /mnt/user-data paths work
  when executed
- read_file: reverse-resolves system paths back to virtual
  paths in returned content for consistency

This fixes scenarios where agents write Python scripts with
virtual paths, then execute them via bash tool expecting the
paths to work.

Fixes bytedance#1778
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes inconsistent path handling in LocalSandbox by making read_file/write_file transform virtual container paths inside file contents, aligning their behavior with execute_command’s path resolution so scripts written with /mnt/user-data/... paths can run correctly on the host.

Changes:

  • write_file: resolves container-path strings inside content to host-local paths before writing.
  • read_file: reverse-resolves host-local paths back to container paths in returned file content.

Comment on lines +299 to +304
# Resolve container paths in content to local paths
# so that written files use correct system paths when executed
resolved_content = self._resolve_paths_in_command(content)
mode = "a" if append else "w"
with open(resolved_path, mode, encoding="utf-8") as f:
f.write(content)
f.write(resolved_content)
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Windows hosts, self._resolve_path() will typically return paths with backslashes (because LocalSandboxProvider stores local_path as Path(...).resolve()), so resolved_content can inject sequences like C:\Users\... into source files. In common cases (e.g. Python scripts), those backslashes inside string literals can create invalid escape sequences (e.g. \U...) and break execution. Consider normalizing resolved paths inserted into file content to a safer representation (e.g. forward slashes) and ensure read_file’s reverse-resolver also recognizes that normalized form so agents don’t see leaked host paths.

Copilot uses AI. Check for mistakes.
Comment on lines +299 to 302
# Resolve container paths in content to local paths
# so that written files use correct system paths when executed
resolved_content = self._resolve_paths_in_command(content)
mode = "a" if append else "w"
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_resolve_paths_in_command() is documented and tuned for shell command parsing (boundary chars, quoting, etc.), but it’s now being reused to transform arbitrary file contents. That coupling is easy to miss and makes future tweaks to command parsing potentially break file IO semantics. Consider extracting a dedicated helper for “resolve paths in text content” (or renaming/generalizing the helper) so the intent and constraints are explicit.

Copilot uses AI. Check for mistakes.
Comment on lines +279 to +286
@@ -280,7 +280,10 @@ def read_file(self, path: str) -> str:
resolved_path = self._resolve_path(path)
try:
with open(resolved_path, encoding="utf-8") as f:
return f.read()
content = f.read()
# Reverse resolve local paths back to container paths in content
# for consistency with bash tool output path handling
return self._reverse_resolve_paths_in_output(content)
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change alters core LocalSandbox behavior by rewriting paths inside written file content and rewriting them back when reading. There are existing unit tests for path mapping / command replacement in test_local_sandbox_provider_mounts.py, but nothing asserting the new read/write content transformations. Adding focused tests for write_file (content path resolution) and read_file (reverse resolution) would help prevent regressions, especially across different mount mappings.

Copilot uses AI. Check for mistakes.
@WillemJiang
Copy link
Copy Markdown
Collaborator

@JasonOA888 thanks for the contribution. Please take some time to address the Copilot review comments.

…orward-slash safety + tests

- Extract _resolve_paths_in_content() separate from _resolve_paths_in_command()
  to decouple file-content path resolution from shell-command parsing
- Normalize resolved paths to forward slashes to avoid Windows backslash
  escape issues in source files (e.g. \U in Python string literals)
- Add 4 focused tests: write resolves content, forward-slash guarantee,
  read reverse-resolves content, and write→read roundtrip
@JasonOA888
Copy link
Copy Markdown
Contributor Author

@WillemJiang Thanks for the nudge! I have addressed all three Copilot review points in commit 19c3f0d:

  1. Extracted _resolve_paths_in_content() — separate from _resolve_paths_in_command() to decouple file-content resolution from shell-command parsing.
  2. Forward-slash normalization — resolved paths in content always use / to avoid Windows backslash escape issues (e.g. \U in Python string literals).
  3. Added 4 focused tests — write resolves content, forward-slash guarantee, read reverse-resolves, and write→read roundtrip consistency.

Please take another look when you get a chance.

@WillemJiang
Copy link
Copy Markdown
Collaborator

@JasonOA888, please run the following command to fix the lint error.

cd backend
make format

@WillemJiang WillemJiang added the question Further information is requested label Apr 8, 2026
@WillemJiang
Copy link
Copy Markdown
Collaborator

@JasonOA888
read_file reverse-resolves ALL file content, not just agent-written files (significant)

read_file now unconditionally applies _reverse_resolve_paths_in_output to everything it reads. This means:

  • User-uploaded files containing paths that happen to match a local_path mapping will be silently rewritten
  • Files generated by external tools (formatters, linters, build systems) that emit real filesystem paths will have those paths replaced with virtual ones
  • If someone reads /etc/fstab or a log file, any path matching a mapping gets rewritten

Example: A user uploads a config containing /Users/jiangning/.../user-data/outputs — if that local path maps to /mnt/user-data, read_file would corrupt it to /mnt/user-data/outputs.

This is different from bash output, which is ephemeral. File content is persistent and read for many reasons (searching, parsing, forwarding to users).

Suggestion: Consider only reverse-resolving in read_file if the file was previously written through write_file (e.g., track a set of "agent-written" paths). Or accept the risk and document it clearly.

read_file previously applied _reverse_resolve_paths_in_output to ALL
file content, which could silently rewrite paths in user uploads and
external tool output (Willem Jiang review on bytedance#1935).

Now tracks files written through write_file in _agent_written_paths.
Only those files get reverse-resolved on read. Non-agent files are
returned as-is.
@JasonOA888
Copy link
Copy Markdown
Contributor Author

@WillemJiang Great catch. You're right — unconditionally reverse-resolving in read_file is too aggressive.

Pushed a fix (beb4e20) that addresses this:

  1. _agent_written_paths setwrite_file now tracks every path it writes to
  2. Conditional reverse-resolveread_file only applies _reverse_resolve_paths_in_output if the file was previously written through write_file
  3. Non-agent files untouched — user uploads, external tool output, config files, logs — all returned as-is

Added a test that verifies: writing a file directly to the filesystem (simulating a user upload) with a local path → read_file returns content unchanged.

This preserves the roundtrip behavior for agent-authored content while protecting everything else from silent path corruption.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

question Further information is requested

Projects

None yet

Development

Successfully merging this pull request may close these issues.

LocalSandbox模式下各工具处理路径的逻辑不一致导致找不到文件和路径

3 participants