Skip to content

feat: vendor russh-sftp with serde_bytes perf fix#188

Open
Yaminyam wants to merge 1 commit intolablup:mainfrom
Yaminyam:feat/vendor-russh-sftp
Open

feat: vendor russh-sftp with serde_bytes perf fix#188
Yaminyam wants to merge 1 commit intolablup:mainfrom
Yaminyam:feat/vendor-russh-sftp

Conversation

@Yaminyam
Copy link
Copy Markdown
Member

Summary

  • Vendor russh-sftp into crates/bssh-russh-sftp following the same pattern as the existing crates/bssh-russh (sync-upstream.sh + create-patch.sh + patches/).
  • Apply a serde_bytes annotation to protocol::Write::data and protocol::Data::data, plus a wire-compatible serialize_bytes impl in ser.rs. This is the only functional change versus upstream v2.1.1.
  • Switch the top-level russh-sftp dependency to use the vendored package. All use russh_sftp::... imports in bssh continue to work unchanged via package = "bssh-russh-sftp".

Motivation

perf record on the server side of a 1 GiB SFTP upload against unmodified bssh-server shows:

42.11%  serde_core::de::impls::<impl Deserialize for Vec<T>>::deserialize::VecVisitor::visit_seq
 3.59%  aws_lc_0_39_1_CRYPTO_poly1305_update
 3.14%  aws_lc_0_39_1_ChaCha20_ctr32_avx2

Crypto is not the bottleneck. The Write and Data SFTP packets each carry data: Vec<u8>, which with #[derive(Deserialize)] dispatches through serde's generic deserialize_seq — reading the SFTP payload one byte at a time via VecVisitor. For a 1 GiB upload that is ~10^9 visitor dispatches.

Annotating the fields with #[serde(with = "serde_bytes")] routes Vec<u8> through deserialize_byte_buf, which the crate's own Deserializer already implements as a single try_get_bytes (u32 length + bulk copy_to_bytes). Symmetrically, Serializer::serialize_bytes — previously a BadMessage error — is implemented with the same u32 len + bytes framing the existing deserialize_seq path was producing, so the wire format is unchanged.

Measured impact

1 GiB SFTP upload, OpenSSH client → bssh-server on a CPU-bound host (Xeon Silver 4214, 5 runs each):

Binary Avg throughput
upstream russh-sftp 2.1.1 74.8 MiB/s
bssh-russh-sftp (this PR) 96.4 MiB/s

OpenSSH sftp-server on the same host measures ~101 MiB/s, so the gap shrinks from ~26% to ~5%.

On a host that is already near network-bound (AMD EPYC 7742, 1 Gbps internal) both unpatched and patched runs sit at ~95 MiB/s — the fix is invisible when CPU is not the limiter, as expected.

Why vendor, not a git dependency

Upstream russh-sftp has had no commits since the v2.1.1 bump on 2025-04-18. Two related performance issues are unresolved: #55 (closed, no root cause), #70 (open). Mirroring the bssh-russh vendoring pattern keeps the two forks consistent and lets us ship independently of upstream.

If/when upstream merges an equivalent fix this crate can be deleted in one commit.

Test plan

  • cargo fmt --check
  • cargo clippy --workspace -- -D warnings
  • cargo check -p bssh-russh-sftp
  • cargo check -p bssh
  • End-to-end: sftp put 1 GiB file against bssh-server running the patched binary on two different hosts (network-bound and CPU-bound), wire format verified compatible with stock OpenSSH client
  • CI smoke tests (cargo test --lib, cargo test --tests --skip integration_test)

🤖 Generated with Claude Code

Add `crates/bssh-russh-sftp`, a temporary fork of upstream `russh-sftp`
following the same pattern as the existing `crates/bssh-russh`.

The only functional change versus upstream v2.1.1 is a
`#[serde(with = "serde_bytes")]` annotation on `protocol::Write::data`
and `protocol::Data::data`, plus a wire-compatible `serialize_bytes`
implementation in `ser.rs`. Without it, `#[derive(Deserialize)]` for
`Vec<u8>` dispatches to `deserialize_seq` and parses the SFTP payload
one byte at a time — `perf` shows ~42% of server CPU in
`VecVisitor::visit_seq` during 1 GiB uploads. The annotation routes
through the existing bulk `try_get_bytes` path in the crate's own
Deserializer, which is already implemented.

Measured impact on a CPU-bound host (Xeon Silver 4214) with an OpenSSH
client performing a 1 GiB SFTP upload:

  - upstream russh-sftp 2.1.1: 74.8 MiB/s
  - this fork:                 96.4 MiB/s  (+29%)

OpenSSH `sftp-server` on the same host measures ~101 MiB/s, so the gap
narrows from ~26% to ~5%.

Upstream russh-sftp has had no commits since its v2.1.1 bump
(2025-04-18) and two download-perf issues (lablup#55 closed without root
cause, lablup#70 open) sit unanswered, so the fix lives here until upstream
activity resumes. `sync-upstream.sh` and `create-patch.sh` mirror the
`bssh-russh` tooling so future syncs are mechanical.

The top-level dependency is switched from
`russh-sftp = "2.1.1"` to
`russh-sftp = { package = "bssh-russh-sftp", version = "2.1.1", path = "crates/bssh-russh-sftp" }`
so every `use russh_sftp::...` import in bssh continues to work
unchanged.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant