Define and align symlink semantics across local, SFTP, and SFTP2 #660

Description

@bbtfr

Context

While reviewing the SFTP/SFTP2 stat-reuse PR, we rechecked symlink semantics across local, SFTP, and SFTP2. Some small bugs are being handled in the stat-reuse PR, but broader symlink behavior is inconsistent and should be discussed/fixed separately.

Expected direction from the discussion:

  • A user-provided root path that is a symlink to a directory should generally be resolved as the operation root for listing/traversal APIs.
  • followlinks should primarily control whether traversal follows child symlink directories encountered inside a tree.
  • is_file(followlinks=False) should treat symlinks as file-like.
  • open(link_to_file) should read the target content; open(link_to_dir) should raise IsADirectoryError.
  • unlink is file-only: directories should raise IsADirectoryError, while symlinks themselves can be unlinked.
  • Copy/sync/upload/download/rename symlink behavior needs a documented policy before changing defaults.
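For reference, the open and unlink expectations above match what the Python standard library already does on a local POSIX filesystem. The following is a minimal sketch using only `os`; it does not call any megfile API, and the helper name `probe_symlink_semantics` is made up for illustration.

```python
import os


def probe_symlink_semantics(base: str) -> dict:
    """Build a small tree under `base` and record what the stdlib does (POSIX)."""
    target_file = os.path.join(base, "file.txt")
    target_dir = os.path.join(base, "dir")
    link_to_file = os.path.join(base, "link_to_file")
    link_to_dir = os.path.join(base, "link_to_dir")

    with open(target_file, "w") as f:
        f.write("hello")
    os.mkdir(target_dir)
    os.symlink(target_file, link_to_file)
    os.symlink(target_dir, link_to_dir)

    result = {}

    # open() on a link to a file reads the target content.
    with open(link_to_file) as f:
        result["open_link_to_file"] = f.read()

    # open() on a link to a directory fails the same way as on a directory.
    try:
        open(link_to_dir).close()
        result["open_link_to_dir"] = "opened"
    except OSError as exc:
        result["open_link_to_dir"] = type(exc).__name__

    # unlink() removes the link itself and leaves the target directory alone.
    os.unlink(link_to_dir)
    result["link_removed"] = not os.path.lexists(link_to_dir)
    result["target_dir_kept"] = os.path.isdir(target_dir)
    return result
```

Running the same probe against each backend would be one way to check whether SFTP and SFTP2 agree with local on these points.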

Remaining Issues

Local root symlink traversal

FSPath.scan() / smart_scan() and FSPath.walk() / smart_walk() do not match the expected root symlink-to-directory behavior:

  • scan(link_to_dir) currently treats the root symlink as a file-like path and yields the link itself.
  • walk(link_to_dir) currently yields nothing when followlinks=False.
  • scandir(link_to_dir) and glob(link_to_dir/*) already expand the root symlink.

This makes the local backend inconsistent with SFTP/SFTP2 and with the intended root-path semantics.
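The intended root-path behavior can be sketched with the standard library alone. This is not the FSPath.scan() implementation; `scan_files` is a hypothetical helper that resolves a root symlink to a directory, and uses followlinks only for child symlinks found during traversal.

```python
import os
from typing import Iterator


def scan_files(root: str, followlinks: bool = False) -> Iterator[str]:
    """Yield file-like paths under `root`.

    The root is always resolved: os.path.isdir() follows symlinks, so a
    root that links to a directory is traversed, not yielded as a file.
    """
    if not os.path.isdir(root):
        # A plain file, a link to a file, or a broken link: yield it as-is.
        yield root
        return

    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in sorted(entries, key=lambda e: e.name):
                # Child symlinks to directories are only descended into
                # when followlinks=True; otherwise they are file-like.
                if entry.is_dir(follow_symlinks=followlinks):
                    stack.append(entry.path)
                else:
                    yield entry.path
```

Note that this sketch has no cycle detection, so followlinks=True on a tree with a link loop would not terminate. Under these assumptions, scan_files(link_to_dir) yields the entries below the link rather than the link itself.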

SFTP/SFTP2 copy(followlinks=False) on symlink

SFTP/SFTP2 currently materialize the symlink target content instead of preserving the symlink object. This differs from local FSPath.copy(..., followlinks=False), which delegates to shutil.copy2(..., follow_symlinks=False).

This is a behavior change if fixed, so it needs an explicit compatibility decision.
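To make the local reference behavior concrete, here is a small sketch of what shutil.copy2 does with follow_symlinks=False. The wrapper name `copy_preserving_link` is illustrative only and is not part of megfile.

```python
import os
import shutil


def copy_preserving_link(src: str, dst: str, followlinks: bool = False) -> str:
    """Copy `src` to `dst`; with followlinks=False a symlink stays a symlink."""
    # When src is a symlink and follow_symlinks=False, shutil.copy2 creates
    # a new symlink at dst pointing to the same target string. Otherwise it
    # copies the content of the file that src resolves to.
    return shutil.copy2(src, dst, follow_symlinks=followlinks)
```

An SFTP/SFTP2 equivalent would presumably need to read the link target on the source side and create a symlink on the destination side, which is why the change is compatibility-sensitive.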

SFTP/SFTP2 download(followlinks=False) on symlink

SFTP download currently reads through backend file operations and materializes target content instead of creating a local symlink. SFTP2 likely has the same class of behavior, but coverage should be checked.

This differs from a strict “preserve link object when followlinks=False” interpretation.

SFTP/SFTP2 upload(followlinks=False) from local symlink

Upload from a local symlink generally materializes the target content. If followlinks=False is meant to preserve symlink objects, upload needs a different behavior or explicit documentation.

SFTP/SFTP2 sync(followlinks=True) traversal

sync(followlinks=True) passes followlinks to leaf copy(), but _sftp_scan_pairs() / _sftp2_scan_pairs() do not appear to use the same traversal flag. That means followlinks=True does not fully control the recursive copy set.

Fixing this may change which files are copied, so it should be handled carefully.
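One way to picture the fix is a pair scanner that takes the same followlinks flag as the leaf copy. The sketch below uses os.walk on a local tree as a stand-in; `scan_pairs` is a hypothetical name, not the real _sftp_scan_pairs() signature, and emitting unfollowed directory links as leaf entries is an assumption rather than a decided policy.

```python
import os
from typing import Iterator, Tuple


def scan_pairs(
    src_root: str, dst_root: str, followlinks: bool = False
) -> Iterator[Tuple[str, str]]:
    """Yield (src_path, dst_path) pairs for a recursive sync."""
    for dirpath, dirnames, filenames in os.walk(src_root, followlinks=followlinks):
        rel_dir = os.path.relpath(dirpath, src_root)
        names = list(filenames)
        if not followlinks:
            # os.walk lists symlinks to directories in dirnames but does not
            # descend into them; emit them as leaf entries so the leaf copy
            # can decide how to handle the link (assumed policy).
            names += [
                d for d in dirnames if os.path.islink(os.path.join(dirpath, d))
            ]
        for name in sorted(names):
            rel = os.path.normpath(os.path.join(rel_dir, name))
            yield os.path.join(dirpath, name), os.path.join(dst_root, rel)
```

With a single flag driving both traversal and leaf handling, the copy set for followlinks=True and followlinks=False differs only in how child directory links are expanded.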

Cross-backend rename() on symlink

Cross-backend SFTP/SFTP2 rename on a symlink currently goes through generic copy/open behavior, materializes the target content at the destination, and removes the source link. This is surprising compared with same-backend/native rename, which moves the symlink object.

Changing this would alter default behavior. Possible policies include:

  • preserve current materialization behavior and document it,
  • reject cross-backend symlink rename,
  • preserve symlink objects when the destination backend can represent them.
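The three policies above can be written down as a small decision function, which may help when filling in the behavior matrix. Everything here is hypothetical: `RenamePolicy`, `plan_cross_backend_rename`, and the returned action names are illustrative and do not exist in megfile.

```python
from enum import Enum


class RenamePolicy(Enum):
    MATERIALIZE = "materialize"  # keep current behavior and document it
    REJECT = "reject"            # refuse cross-backend symlink rename
    PRESERVE = "preserve"        # keep the link when the destination can


def plan_cross_backend_rename(
    src_is_symlink: bool, dst_supports_symlinks: bool, policy: RenamePolicy
) -> str:
    """Return the action a cross-backend rename would take for the source."""
    if not src_is_symlink:
        return "copy_content"
    if policy is RenamePolicy.REJECT:
        raise OSError("cross-backend rename of a symlink is not supported")
    if policy is RenamePolicy.PRESERVE and dst_supports_symlinks:
        return "create_symlink"
    # MATERIALIZE, or PRESERVE when the destination cannot represent links.
    return "copy_content"
```

In this sketch PRESERVE falls back to materializing when the destination backend has no symlink support; raising instead would be an equally defensible choice.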

md5(followlinks=...)

md5(followlinks=...) does not consistently honor the followlinks flag for symlink-vs-directory decisions across local/SFTP/SFTP2. The expected behavior should probably be:

  • symlink to file: hash target content when following links,
  • symlink to directory: either follow and apply directory hash semantics, or reject/document explicitly,
  • followlinks=False: define whether the link object itself is hashable or should raise.
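As a starting point for discussion, one possible local interpretation of these rules is sketched below. It is only one option among those listed: `md5_path` is a hypothetical helper, hashing the link target string for followlinks=False is an assumption (raising would be the alternative), and directories are rejected rather than given directory hash semantics.

```python
import hashlib
import os


def md5_path(path: str, followlinks: bool = False) -> str:
    """Return an MD5 hex digest for `path` under one candidate policy."""
    if os.path.islink(path) and not followlinks:
        # Assumed policy: the link object is hashable, and its hash is the
        # MD5 of the target string stored in the link.
        return hashlib.md5(os.fsencode(os.readlink(path))).hexdigest()
    if os.path.isdir(path):
        # Covers real directories and followed links to directories.
        raise IsADirectoryError(path)
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large files are not loaded at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Whatever policy is chosen, writing it as a single function per backend makes it easier to test that local, SFTP, and SFTP2 return the same result for the same tree.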

Suggested Next Step

Handle this as a dedicated symlink semantics topic instead of mixing it into stat optimization PRs:

  1. Write the target behavior matrix for local, SFTP, and SFTP2.
  2. Add tests around root symlink traversal, child symlink traversal, copy/upload/download/sync, rename, unlink/remove, and md5.
  3. Split fixes into default-preserving bug fixes vs compatibility-sensitive behavior changes.
