Releases: structured-world/structured-zstd

v0.0.31

07 Jun 19:18
8ef222d

Added

  • (wasm) streaming compress ZstdCompressStream + align decoder window cap to zstd default (#367)
  • (wasm) simd128 kernel tier + npm package; LDM bench matrix variants (#364)
  • (encode) configurable compression parameters API (#360)

Harden

  • (decode) uniform FrameContentSizeMismatch on Compressed overshoot (#363)

v0.0.30

07 Jun 01:07
50058fe

Added

  • (decoding) block-subset partial decode + per-block decompressed byte ranges (#357)

Performance

  • (encode) inline SIMD literal copy for medium runs (#355)

v0.0.29

06 Jun 15:26
9673a4b

Added

  • (bench) report active CPU kernel tier + fix decode_loop corpus masking (#351)

Documentation

  • (readme) fix btultra/btultra2 level boundary in strategy table (#334)

Fixed

  • (decode) bound per-block output (decompression-bomb OOM) + L19 decode gap (#350)
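
To illustrate the kind of bound the decompression-bomb fix (#350) refers to: the Zstandard format (RFC 8878) limits a block's decompressed size to the smaller of the frame's window size and 128 KiB, so a decoder can reject a block as soon as its output would pass that limit instead of allocating for it. The sketch below is a hypothetical guard under that assumption, not this crate's code; `BlockTooLarge`, `block_output_limit`, and `check_block_output` are names invented for the example.

```rust
/// Upper bound on a single block's decompressed size in the Zstandard format.
const MAX_BLOCK_SIZE: usize = 128 * 1024;

/// Hypothetical error type for this sketch (not a type from the crate).
#[derive(Debug, PartialEq)]
struct BlockTooLarge {
    produced: usize,
    limit: usize,
}

/// Block_Maximum_Size = min(Window_Size, 128 KiB).
fn block_output_limit(window_size: usize) -> usize {
    window_size.min(MAX_BLOCK_SIZE)
}

/// Rejects a block whose decoded output would exceed the format limit.
/// `produced` is the number of bytes the block would decode to.
fn check_block_output(produced: usize, window_size: usize) -> Result<(), BlockTooLarge> {
    let limit = block_output_limit(window_size);
    if produced > limit {
        Err(BlockTooLarge { produced, limit })
    } else {
        Ok(())
    }
}
```

Running the check before growing the output buffer is what keeps a hostile block header from driving a large allocation.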

Performance

  • (codec) per-strategy block-split levels + per-tier exec-macro seq monolith (#354)
  • (codec) warm dictionary reuse (decode + encode) + relocate decompression-bomb guard (#352)
  • (encode) store row match positions as u32 to cut peak memory (#346)
  • (encode) right-size dfast/row/hash-chain tables for small inputs (#345)
  • (encode) price the custom FSE table without building it (#344)
  • (encode) cut small-input fixed cost on the fast levels (#342)
  • (encode) hoist hash/chain fill rebase guard out of insert loop (#340)
  • (encode) rebase matcher floor on reset instead of zeroing tables (#338)
  • (encode) donor-correct block-split + block-precise decode errors (#336)
  • (encode) decouple and monomorphize the row matcher (#335)
  • (encode) close level-5 greedy speed gap (row match-span cap) (#331)
  • (encode) donor row-0 params for Fast negative levels (#330)
  • (encode) size-gated dictionary match-finding for the Fast strategy (#328)
  • (dict) greedy set-cover segment selection in fastcover trainer (#324)

v0.0.28

02 Jun 16:48
662181a

Added

  • (decode) kernel_* cargo features for per-tier CPU kernel selection (#307)

Performance

  • (encode) borrow one-shot input as Fast match window (#318)
  • (encode) raw-pointer reads in Fast kernel backward extension (#321)
  • (decode) route RLE-mode sequence tables through the fused path (#320)
  • (encode) drop redundant per-match offset coding in Fast matcher (#319)
  • (encode) dedup window<->history live-region storage (#317)
  • (encode) map level 19 to btultra2 + align bench labels to clevels.h (#315)
  • (encode) skip decoder-table build when loading a dict for encoding (#314)
  • (decode) route bulk non-overlapping copies through memcpy (ERMS) (#309)
  • (encode) vectorize the Row match-finder tag scan (#305)
  • (encode) lower row-matcher min match to 5 (donor parity) (#310)
  • (encode) align levels 13-15 to reference search budget (#302)
  • (decode) unify SIMD-copy capability detection under the CpuKernel tag (#304)

Bench

  • (dict) report Rust dict-compressed size + emit compress-dict ratio (#301)

v0.0.27

01 Jun 02:15
d412d55

Performance

  • (decode) expand overlapping matches by doubling, not offset chunks (#300)
  • (decode) inline donor match-copy on all targets via portable wildcopy (#299)
  • (decode) unroll HUF 4-stream burst inner loop via const-generic (#296)
  • (decode) skip zero-init of literals target — HUF overwrites everything (#295)
  • (fse) kill iterator overhead in build_decoding_table, write decode via set_len (#293)
  • (decode) straight-loop short path + donor-gated lookahead ring + SeqSymbol repack (#289)
  • (decode) skip post-decode XXH64 when content_checksum_flag is 0 (#287)
  • (decode) per-tier x86 kernel split + AVX2 32-byte match-copy (#279 Phase 3+4) (#285)
  • (decode) lazy ring-buffer allocation for direct-eligible frames (#282)
  • (decode) drop dead seq.of from pipelined ring slots (#283)
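
The "expand overlapping matches by doubling" entry (#300) names a general technique that can be sketched as follows. When a match's offset is smaller than its length, the source and destination overlap, and the output is a repetition of the last `offset` bytes. Instead of copying `offset` bytes at a time, each pass can copy everything written since the match start, so the copied span doubles on every iteration. This is a minimal safe-Rust sketch, not the crate's kernel; `copy_match_doubling` is a hypothetical helper.

```rust
/// Appends `len` bytes to `out`, each equal to the byte `offset` positions
/// before it, as an LZ77 match copy requires. Hypothetical helper for
/// illustration only.
fn copy_match_doubling(out: &mut Vec<u8>, offset: usize, len: usize) {
    assert!(offset >= 1 && offset <= out.len(), "offset must point into existing output");
    let start = out.len() - offset;
    let target = out.len() + len;
    while out.len() < target {
        // Everything from `start` to the current end is valid, periodic data.
        // Its length is always a multiple of `offset` here, so copying a
        // prefix of it continues the pattern correctly.
        let available = out.len() - start;
        let n = available.min(target - out.len());
        out.extend_from_within(start..start + n);
    }
}
```

For a match with offset 1 and length 1024, this takes about ten copies of growing size rather than 1024 single-byte steps.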

Testing

  • (bench) add encode_loop_z000033 example for clean encoder profiles (#297)
  • (decode) decode_loop binary — add --mode ffi and --corpus path (#288)

v0.0.26

27 May 10:54
9e6b3ad

Added

  • (encoding+decoding) FrameEmitInfo + opt-in per-block XXH64 sidecar (#272)
  • (decoding) skippable-payload visitor callback on FrameDecoder (#271)

Documentation

  • (#176) Skippable Frame Magic Allocations registry (#270)

Performance

  • (decode) drop inline_never on repcode resolver, keep cold attr (#281)
  • (encoding) HUF_flags_preferRepeat for fast strategies + small literals (#23 G6) (#278)
  • (decoding) mirror donor ddictIsCold signal for pipelined dispatch (#274)
  • (fse) rewrite build_decoding_table per donor ZSTD_buildFSETable_body shape (#276)
  • (decode) route short-block fallback through inline executor (z000033 −25%) (#269)
  • (decode) cache predefined FSE tables (small-4k-log-lines −69%) (#268)
  • (decode) inline sequence executor for direct path + auto-route decode_all (z000033 −24%, high-entropy-1m parity) (#263)
  • (bench) pre-touch decompress output Vec to kill page-fault artifact (#260)
  • (decode) bump WILDCOPY_OVERLENGTH 16 → 32 for AVX2 chunked kernel reach (#261)
  • (decode) RingBuffer % cap → branchless wrap helper (kills divl on i686, divq on x86_64) (#255)
  • (decode) #247 Part 2 — kill divb in repeat_short_offset + force-inline UserSliceBackend::extend (#254)
  • (decode) #247 Part 1 — expand FSE Entry to ZSTD_seqSymbol shape (#252)
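
The ring-buffer entry above (#255) replaces `% cap` with a branchless wrap. One common way to do that, shown here only as a sketch and not as the crate's helper, relies on the index never exceeding the capacity by a full lap: if `idx < 2 * cap`, wrapping is a single conditional subtraction, which can be written with a mask so that neither a division nor a branch is needed. The name `wrap_index` and the `idx < 2 * cap` precondition are assumptions made for this example.

```rust
/// Wraps `idx` into `0..cap` without `%`, assuming `idx < 2 * cap`.
/// Hypothetical helper for illustration only.
#[inline(always)]
fn wrap_index(idx: usize, cap: usize) -> usize {
    debug_assert!(cap > 0 && idx < 2 * cap);
    // All ones when idx >= cap, zero otherwise.
    let mask = ((idx >= cap) as usize).wrapping_neg();
    // Subtracts `cap` only when the index ran past the end.
    idx - (cap & mask)
}
```

Unlike the usual `idx & (cap - 1)` trick, this form does not require the capacity to be a power of two.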

Testing

  • (bench) unblock dict-driven bench matrix + add pure_rust_with_dict compress arm (#277)

Harden

  • (decode) fallible BufferBackend writes for RLE/Raw direct path (#251)

v0.0.25

24 May 20:25
08424d2

Added

  • (skippable) typed SkippableFrame API behind lsm feature (#248)
  • (decode) expect_dict_id + expect_window_descriptor setters on FrameDecoder (#249)

Performance

  • (decode) direct-write decode_to_slice path (#244) (#245)

v0.0.24

24 May 03:13
a667cdc

Performance

  • (decode) pack HUF decode table as u16 (donor HUF_DEltX1 layout) (#243)
  • (decode) collapse HUF 4-stream burst-gate to single-cursor olimit (#238)
  • (encoder) align lazy-band target_len with donor clevels.h table[0] (#239)
  • (fast) donor-parity hot-path cleanups in Fast kernel + inline short-literal append (#220 follow-up) (#231)

Testing

  • (encoding) donor-parity comparator for block splitter port (#240)
  • (bench) add decompress-dict group to compare_ffi (#236)

v0.0.23

23 May 18:23
00b6de1

Added

  • (frame) magicless frame format support (#26) (#222)

Performance

  • (row) drop eager block-boundary inserts (#180) (#232)
  • (decode) 8-slot software prefetch pipeline for sequence execution (#208) (#227)
  • (encoder) port donor ZSTD_compressBlock_fast — 4-cursor + per-level cParams + cmov + window-correctness (#198 phase 3) (#229)
  • (decoding) integrate AVX2 unroll-2 wildcopy candidate (#108) (#223)
  • (encoder) wire donor-shape Fast kernel into MatchGeneratorDriver (#198 phase 1b) (#217)
  • (encoder) donor-shape Fast kernel modules (#198 phase 1a) (#215)
  • (fse) elide bounds check on init_state + update_state decode reads (#214)
  • (decode) SIMD-16 fast path for short offsets {1, 2, 4} (#213)
  • (decode) const-generic HUF kernel monomorphisation for SIMD-fallback (#212)
  • (decode) port donor HUF 4-stream burst with sentinel-bit ctz (#201)

Testing

  • (hc) cross-slice boundary position seeding regression test (#235)

v0.0.22

19 May 22:50
297fcad

Added

  • (encoder) strategy-aware literal gates (G4 + G5) (#182)
  • (bench) #99 Rust↔FFI sequence-stream comparator (#149)

Documentation

  • bump quick-start dep version to 0.0.21 + drop legacy ruzstd Changelog.md (#155)

Performance

  • (decode) pack LL/ML metadata + hot-path micro-opts (#197)
  • (decode) fused sequence executor + SIMD/FSE hot-path cleanup + DoS-safe rollback (#194)
  • (encoder) align Fast strategy window_log with donor (17 → 19) (#187)
  • (encoder) inline hash-chain walk into hash_chain_candidate (lazy L1) (#185)
  • (encoder) G3 — whole-block bail-out before partition split (#181)
  • (encoder) donor-parity greedy parse at L4 — ratio + speed win (#179)
  • (huff0) cache encoded weight-description bytes on HuffmanTable and reuse in emit path (#170)
  • (huff0) #167 cheap entropy proxy for table_log selection — no FSE-encode per candidate (#168)
  • (fse) donor FSE_buildCTable_wksp parity — drop per-symbol Vec (#166)
  • (fse) replace next_state linear search with donor-parity flat tables + tune CI bench budgets (#165)