Roadmap: coordinode-lsm-tree performance & structured-zstd integration #215

@polaz

Status: V5 released (v5.0.0 to v5.2.1)

The full V5 integrity stack shipped on crates.io. v5.0.0 (PR #272) landed the format-changing batch (manifest hardening, per-KV protection, per-block seqno bounds); v5.1.0 (PR #432) and the v5.2.x line followed with post-V5 perf, integrity, and no-std work. The three-axis block model (compression x encryption x ECC) is live and round-trips across all 8 combinations.

Cross-cutting design discipline (#353, tabulated further down) still binds every format-changing PR.

Open: Encryption / ECC track

Three independent, orthogonal axes (compression, encryption, ECC), each on/off; BlockTransform models all 8 combinations and every combination must round-trip.
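A minimal sketch of the "all 8 combinations must round-trip" property, assuming a hypothetical `Transform` struct standing in for the crate's `BlockTransform`. The run-length "compression", fixed-key XOR "encryption", and one-byte parity "ECC" are toy placeholders chosen only so the example is self-contained; they are not the crate's codecs and offer no real security or correction.

```rust
/// Hypothetical stand-in for `BlockTransform`: three independent on/off axes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Transform {
    pub compress: bool,
    pub encrypt: bool,
    pub ecc: bool,
}

impl Transform {
    /// Enumerate all 2^3 = 8 axis combinations.
    pub fn all() -> Vec<Transform> {
        (0u8..8)
            .map(|bits| Transform {
                compress: bits & 1 != 0,
                encrypt: bits & 2 != 0,
                ecc: bits & 4 != 0,
            })
            .collect()
    }

    /// Write path: compress, then encrypt, then append parity.
    pub fn encode(&self, plain: &[u8]) -> Vec<u8> {
        let mut buf = plain.to_vec();
        if self.compress {
            buf = rle_encode(&buf);
        }
        if self.encrypt {
            xor_in_place(&mut buf);
        }
        if self.ecc {
            let p = parity(&buf);
            buf.push(p);
        }
        buf
    }

    /// Read path: the exact inverse order. Returns None on a parity or framing error.
    pub fn decode(&self, stored: &[u8]) -> Option<Vec<u8>> {
        let mut buf = stored.to_vec();
        if self.ecc {
            let p = buf.pop()?;
            if parity(&buf) != p {
                return None;
            }
        }
        if self.encrypt {
            xor_in_place(&mut buf);
        }
        if self.compress {
            buf = rle_decode(&buf)?;
        }
        Some(buf)
    }
}

/// Toy run-length encoding as (run, byte) pairs. Placeholder for real compression.
fn rle_encode(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut iter = data.iter().copied().peekable();
    while let Some(b) = iter.next() {
        let mut run = 1u8;
        while run < u8::MAX && iter.peek() == Some(&b) {
            iter.next();
            run += 1;
        }
        out.push(run);
        out.push(b);
    }
    out
}

fn rle_decode(data: &[u8]) -> Option<Vec<u8>> {
    if data.len() % 2 != 0 {
        return None;
    }
    let mut out = Vec::new();
    for pair in data.chunks_exact(2) {
        out.extend(std::iter::repeat(pair[1]).take(pair[0] as usize));
    }
    Some(out)
}

/// Fixed-key XOR. Placeholder only, NOT encryption.
fn xor_in_place(buf: &mut [u8]) {
    for b in buf.iter_mut() {
        *b ^= 0x5A;
    }
}

/// XOR of all bytes. Placeholder for a real parity trailer.
fn parity(buf: &[u8]) -> u8 {
    buf.iter().fold(0, |acc, b| acc ^ b)
}
```

A regression test in this style would loop over `Transform::all()` and assert that `decode(encode(x))` returns `x` for each combination.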

Encryption axis = AAD-bound, LIVE (done #413 via PR #423). The live block path seals through AAD-bound encrypt_block/decrypt_block; block-swap / cross-table / codec-relabel / epoch+suite downgrade are caught (AEAD-verify failure). AAD binds table_id + codec context (compression_type, dict_id, window_log) + block_flags; block byte offset and owning tree id are deliberately NOT bound. Wire format is exactly MetadataFrame || BodyFrame.
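To illustrate what "AAD binds table_id + codec context + block_flags" can look like, here is a sketch of an AAD builder. The field names come from the paragraph above; the struct `CodecContext`, the function `build_aad`, the field widths, and the little-endian layout are all assumptions for illustration and are not the crate's actual wire format.

```rust
/// Hypothetical codec context. Field names follow the roadmap text; widths are assumed.
pub struct CodecContext {
    pub compression_type: u8,
    pub dict_id: u32,
    pub window_log: u8,
}

/// Serialize the bound fields into a fixed-width AAD buffer.
/// Fixed widths mean two different field tuples can never serialize to the same bytes.
/// Block byte offset and owning tree id are intentionally absent from the signature,
/// mirroring the "deliberately NOT bound" note.
pub fn build_aad(table_id: u64, codec: &CodecContext, block_flags: u8) -> Vec<u8> {
    let mut aad = Vec::with_capacity(15);
    aad.extend_from_slice(&table_id.to_le_bytes()); // 8 bytes
    aad.push(codec.compression_type); // 1 byte
    aad.extend_from_slice(&codec.dict_id.to_le_bytes()); // 4 bytes
    aad.push(codec.window_log); // 1 byte
    aad.push(block_flags); // 1 byte
    aad
}
```

Because an AEAD tag covers the AAD, changing any bound field (for example moving a block to another table, or relabeling its codec) changes the AAD and makes tag verification fail on read, while relocating a block within the same table leaves the AAD unchanged.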

ECC axis = Page ECC (#267 / crate::ecc / BlockTransform), configurable scheme (XOR / Reed-Solomon, done #254 via PR #414), opt-in / off by default. Correction is over the on-disk compressed+encrypted bytes, verified by the block's XXH3-128 checksum before decompress. The SEC-DED single-bit fast path (done #255 via PR #437) heals 1-bit flips ahead of RS recovery. Auto-heal (done #411 via PR #446) reschedules a compaction rewrite after a parity-corrected read; patrol scrub (done #412 via PR #456) sweeps for latent errors proactively.
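The correct-then-verify flow described above can be sketched with the simpler XOR scheme: rebuild a suspect shard from parity, then accept the repair only if the block checksum matches. Everything here is hypothetical illustration: the function names, the equal-length shard layout, and the use of FNV-1a as a small stand-in for XXH3-128 are assumptions, not the `crate::ecc` API.

```rust
/// FNV-1a 64-bit, used only as a self-contained stand-in for XXH3-128.
pub fn fnv1a(data: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in data {
        h ^= b as u64;
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// XOR parity over equal-length shards of the on-disk (compressed + encrypted) bytes.
pub fn xor_parity(shards: &[Vec<u8>]) -> Vec<u8> {
    let len = shards.first().map_or(0, |s| s.len());
    let mut parity = vec![0u8; len];
    for shard in shards {
        for (p, b) in parity.iter_mut().zip(shard) {
            *p ^= *b;
        }
    }
    parity
}

/// Try to heal a block with at most one damaged shard.
/// Returns true when the (possibly repaired) shards match the expected checksum.
pub fn repair_block(shards: &mut [Vec<u8>], parity: &[u8], expected: u64) -> bool {
    if fnv1a(&shards.concat()) == expected {
        return true; // clean read, nothing to correct
    }
    for i in 0..shards.len() {
        // Rebuild shard i as parity XOR all other shards.
        let mut candidate = parity.to_vec();
        for (j, shard) in shards.iter().enumerate() {
            if j != i {
                for (c, b) in candidate.iter_mut().zip(shard) {
                    *c ^= *b;
                }
            }
        }
        let saved = std::mem::replace(&mut shards[i], candidate);
        // The checksum decides whether the correction is accepted.
        if fnv1a(&shards.concat()) == expected {
            return true;
        }
        shards[i] = saved; // wrong guess, restore and try the next shard
    }
    false
}
```

In this sketch a `true` result after a rewrite is the point where an auto-heal policy could schedule a compaction rewrite; decryption and decompression would only run on bytes that passed the checksum.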

| Order | Issue | What | Blocked by | Est |
| --- | --- | --- | --- | --- |
| E2 🟡 | #253 | test: AAD threat-model regression suite. Runs against the live AAD path (done #413); inner-frame dict-id gate landed (PR #441). Remaining: dict-content substitution, decompression-bomb window swap, codec-version drift (need structured-zstd integration). | done #413, done #251 | 2d |
| E6 🟡 | #256 | forensics CLI: dump encrypted block structure without key. dump-block + reconstruct-AAD shipped (PR #444 / #445); remaining: inner-block listing for non-encrypted, Page ECC trailer report, mixed-suite integration corpus. | done #251, done #413 | 1d |
| E7 🟡 | #257 | partial-decode for range queries (query-path; independent of ECC). Adaptive partial-tier + promotion + true resume shipped via PR #415, reaches parity. Opt-in behind LSM_PARTIAL_DECODE; open pending default-enable decision / remaining hardening. | done structured-zstd resume API | 2d |

Open: Critical Path (format / API contract)

| Order | Issue | What | Type | Blocked by | Est |
| --- | --- | --- | --- | --- | --- |
| CP-C 🟡 | #144 | perf: compaction parallelism. Phase 1 (parallel block compression, PR #401) + Phase 2 (sub-compaction split, PR #402) implemented. The headline subcompaction gap (#410) was a bench artifact (PR #427); honest baseline shows ours competitive. Residual ~1.27x range-split is the remaining tuning target. | Performance, no on-disk break | tuning #410 | tuning open |
| CP-D | #274 | no-std + alloc migration: full crate. Core engine now compiles + runs on the no-std-only thumbv7em-none-eabihf target (--features alloc, 0 errors) over caller-injected Fs/Clock traits (done #449). Remaining: vendor byteview strategy (+ SIMD opportunity) and the broader surface. | API surface (cfg-gated) | core landed via done #449 | epic |
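The "caller-injected Fs/Clock traits" approach from the CP-D row can be pictured as follows. This is a sketch under stated assumptions: the trait names match the roadmap text, but every method, the `Engine` type, `read_magic`, the test doubles, and the `LSM5` bytes are hypothetical and do not describe the crate's real trait surface. The code uses only `core`-level facilities, so nothing in it depends on `std`.

```rust
/// Hypothetical clock trait: the embedder supplies time, the engine never calls the OS.
pub trait Clock {
    fn now_micros(&self) -> u64;
}

/// Hypothetical filesystem trait with a single positional-read method.
pub trait Fs {
    type Error: core::fmt::Debug;
    fn read_at(&self, path: &str, offset: u64, buf: &mut [u8]) -> Result<usize, Self::Error>;
}

/// The engine is generic over both traits instead of being cfg-gated per module.
pub struct Engine<F: Fs, C: Clock> {
    fs: F,
    clock: C,
}

impl<F: Fs, C: Clock> Engine<F, C> {
    pub fn new(fs: F, clock: C) -> Self {
        Self { fs, clock }
    }

    /// Read the first four bytes of a file and stamp the read with the injected clock.
    pub fn read_magic(&self, path: &str) -> Result<([u8; 4], u64), F::Error> {
        let mut magic = [0u8; 4];
        self.fs.read_at(path, 0, &mut magic)?;
        Ok((magic, self.clock.now_micros()))
    }
}

/// Test double: a clock that always returns the same instant.
pub struct FixedClock(pub u64);

impl Clock for FixedClock {
    fn now_micros(&self) -> u64 {
        self.0
    }
}

/// Test double: a read-only "filesystem" holding one static file.
pub struct SingleFileFs {
    pub name: &'static str,
    pub bytes: &'static [u8],
}

#[derive(Debug, PartialEq, Eq)]
pub enum MemFsError {
    NotFound,
}

impl Fs for SingleFileFs {
    type Error = MemFsError;

    fn read_at(&self, path: &str, offset: u64, buf: &mut [u8]) -> Result<usize, MemFsError> {
        if path != self.name {
            return Err(MemFsError::NotFound);
        }
        let start = (offset as usize).min(self.bytes.len());
        let n = buf.len().min(self.bytes.len() - start);
        buf[..n].copy_from_slice(&self.bytes[start..start + n]);
        Ok(n)
    }
}
```

On a bare-metal target the embedder would implement the two traits over its flash driver and timer; on a hosted target they could wrap `std::fs` and `std::time`.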

Open: no-std track

| Issue | What | Note |
| --- | --- | --- |
| #358 | gate the manifest+version+tree+checkpoint std-bound layer | Superseded by #449: the chosen strategy un-gates the engine onto traits rather than gating leaf modules (gating cascaded into 1200+ unresolved-module errors). Core no-std landed; close-candidate. |
| #346 | io_uring backend on raw Linux syscalls (no_std + alloc) | Consumer of #274, needs the no-std trait surface. Pure-Rust io_uring without tokio-uring / std-bound io-uring. |

Open: Performance backlog

| Issue | What | Note |
| --- | --- | --- |
| #410 | subcompaction far slower than RocksDB (3x-13x in compare-rocksdb) | Headline gap was a bench artifact: RocksDB's manual compaction skips the bottommost rewrite (kIfHaveCompactionFilter) while ours forces it. Honest L3 + 4-thread baseline (PR #427) shows ours winning major_compact; only the ~1.27x range-split subcompaction is a real target. Gates the #144 epic. |
| #426 | bench: blob_tree (KV-separation) vs surrealkv + attribute read-gap source | Diagnostic bench, no format/API change. |

Cross-cutting design discipline

| Issue | What | Status |
| --- | --- | --- |
| #353 | Benchmark Symmetry Invariant: every new format feature OFF-by-default OR match a RocksDB baseline OR explicit OFF mode; compare-rocksdb ships a RocksDbParity preset | RocksDbParity / LsmTreeDefault / LsmTreeParanoid presets (via LSM_BENCH_PRESET) + docs/BENCHMARKING.md + CONTRIBUTING cross-link implemented (PR pending). Binds all format-changing PRs. |
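As an illustration of preset selection through `LSM_BENCH_PRESET`, the sketch below maps the variable onto the three preset names listed in the table. The enum, both functions, the case-insensitive matching, and the fallback to `LsmTreeDefault` when the variable is unset are assumptions; the accepted spellings and default behavior in compare-rocksdb may differ.

```rust
/// The three preset names from the roadmap table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BenchPreset {
    RocksDbParity,
    LsmTreeDefault,
    LsmTreeParanoid,
}

impl BenchPreset {
    /// Parse a raw value. `None` or an empty string falls back to `LsmTreeDefault`
    /// (an assumed default); unknown names are rejected rather than silently ignored.
    pub fn parse(value: Option<&str>) -> Result<Self, String> {
        match value.map(str::trim) {
            None | Some("") => Ok(Self::LsmTreeDefault),
            Some(v) if v.eq_ignore_ascii_case("RocksDbParity") => Ok(Self::RocksDbParity),
            Some(v) if v.eq_ignore_ascii_case("LsmTreeDefault") => Ok(Self::LsmTreeDefault),
            Some(v) if v.eq_ignore_ascii_case("LsmTreeParanoid") => Ok(Self::LsmTreeParanoid),
            Some(v) => Err(format!("unknown LSM_BENCH_PRESET value: {v}")),
        }
    }

    /// Read the preset from the process environment.
    pub fn from_env() -> Result<Self, String> {
        Self::parse(std::env::var("LSM_BENCH_PRESET").ok().as_deref())
    }
}
```

Rejecting unknown values keeps a typo from quietly running an asymmetric comparison, which is the failure mode the invariant is meant to prevent.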

Recently merged (since last actualization)

Release

no-std

Encryption / ECC

V5 format batch (all landed)

Integrity / verification

Manifest / recovery

Query / compaction / perf

FS / SIMD / build

Tooling
