Releases · structured-world/structured-zstd · GitHub

07 Jun 19:18

v0.0.31

Added

(wasm) streaming compress ZstdCompressStream + align decoder window cap to zstd default (#367)
(wasm) simd128 kernel tier + npm package; LDM bench matrix variants (#364)
(encode) configurable compression parameters API (#360)

Harden

(decode) uniform FrameContentSizeMismatch on Compressed overshoot (#363)


07 Jun 01:07

v0.0.30

Added

(decoding) block-subset partial decode + per-block decompressed byte ranges (#357)

Performance

(encode) inline SIMD literal copy for medium runs (#355)


06 Jun 15:26

v0.0.29

Added

(bench) report active CPU kernel tier + fix decode_loop corpus masking (#351)

Documentation

(readme) fix btultra/btultra2 level boundary in strategy table (#334)

Fixed

(decode) bound per-block output (decompression-bomb OOM) + L19 decode gap (#350)
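For context on the per-block bound: the Zstandard format (RFC 8878) caps the decompressed size of any block at Block_Maximum_Size, the smaller of the frame's Window_Size and 128 KB. A guard of that kind could look roughly like the sketch below. The names `block_output_limit`, `check_block_output`, and `BlockOutputOverflow` are hypothetical and not taken from this crate's API; the actual change in #350 may be structured differently.

```rust
/// Format-level ceiling on a single block's size (128 KB, per RFC 8878).
const BLOCK_SIZE_MAX: usize = 128 * 1024;

/// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq, Eq)]
struct BlockOutputOverflow {
    produced: usize,
    limit: usize,
}

/// Block_Maximum_Size = min(Window_Size, 128 KB).
fn block_output_limit(window_size: usize) -> usize {
    window_size.min(BLOCK_SIZE_MAX)
}

/// Reject a block whose decoded output would exceed the per-block limit,
/// so a hostile frame cannot force an unbounded allocation.
fn check_block_output(produced: usize, window_size: usize) -> Result<(), BlockOutputOverflow> {
    let limit = block_output_limit(window_size);
    if produced > limit {
        Err(BlockOutputOverflow { produced, limit })
    } else {
        Ok(())
    }
}
```

Checking the bound before growing the output buffer, rather than after, is what keeps a malformed block from triggering the allocation in the first place.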

Performance

(codec) per-strategy block-split levels + per-tier exec-macro seq monolith (#354)
(codec) warm dictionary reuse (decode + encode) + relocate decompression-bomb guard (#352)
(encode) store row match positions as u32 to cut peak memory (#346)
(encode) right-size dfast/row/hash-chain tables for small inputs (#345)
(encode) price the custom FSE table without building it (#344)
(encode) cut small-input fixed cost on the fast levels (#342)
(encode) hoist hash/chain fill rebase guard out of insert loop (#340)
(encode) rebase matcher floor on reset instead of zeroing tables (#338)
(encode) donor-correct block-split + block-precise decode errors (#336)
(encode) decouple and monomorphize the row matcher (#335)
(encode) close level-5 greedy speed gap (row match-span cap) (#331)
(encode) donor row-0 params for Fast negative levels (#330)
(encode) size-gated dictionary match-finding for the Fast strategy (#328)
(dict) greedy set-cover segment selection in fastcover trainer (#324)


02 Jun 16:48

v0.0.28

Added

(decode) kernel_* cargo features for per-tier CPU kernel selection (#307)

Performance

(encode) borrow one-shot input as Fast match window (#318)
(encode) raw-pointer reads in Fast kernel backward extension (#321)
(decode) route RLE-mode sequence tables through the fused path (#320)
(encode) drop redundant per-match offset coding in Fast matcher (#319)
(encode) dedup window<->history live-region storage (#317)
(encode) map level 19 to btultra2 + align bench labels to clevels.h (#315)
(encode) skip decoder-table build when loading a dict for encoding (#314)
(decode) route bulk non-overlapping copies through memcpy (ERMS) (#309)
(encode) vectorize the Row match-finder tag scan (#305)
(encode) lower row-matcher min match to 5 (donor parity) (#310)
(encode) align levels 13-15 to reference search budget (#302)
(decode) unify SIMD-copy capability detection under the CpuKernel tag (#304)

Bench

(dict) report Rust dict-compressed size + emit compress-dict ratio (#301)


01 Jun 02:15

v0.0.27

Performance

(decode) expand overlapping matches by doubling, not offset chunks (#300)
(decode) inline donor match-copy on all targets via portable wildcopy (#299)
(decode) unroll HUF 4-stream burst inner loop via const-generic (#296)
(decode) skip zero-init of literals target — HUF overwrites everything (#295)
(fse) kill iterator overhead in build_decoding_table, write decode via set_len (#293)
(decode) straight-loop short path + donor-gated lookahead ring + SeqSymbol repack (#289)
(decode) skip post-decode XXH64 when content_checksum_flag is 0 (#287)
(decode) per-tier x86 kernel split + AVX2 32-byte match-copy (#279 Phase 3+4) (#285)
(decode) lazy ring-buffer allocation for direct-eligible frames (#282)
(decode) drop dead seq.of from pipelined ring slots (#283)
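To illustrate the idea behind "expand overlapping matches by doubling" (#300): when a match's offset is smaller than its length, the source and destination overlap and the output is a repetition of the last `offset` bytes. Instead of copying one `offset`-sized chunk at a time, each copy can reuse everything written so far, so the copied span doubles per step. The sketch below is a safe, `Vec`-based illustration under that assumption; `expand_overlapping_match` is a hypothetical name, and the crate's real kernel presumably works on raw buffers with SIMD copies.

```rust
/// Append `len` bytes to `out` that repeat the last `offset` bytes of `out`.
/// Hypothetical helper for illustration; not the crate's actual API.
fn expand_overlapping_match(out: &mut Vec<u8>, offset: usize, len: usize) {
    assert!(offset > 0 && offset <= out.len(), "offset must point into existing output");
    let start = out.len() - offset;
    // Number of bytes from `start` that already hold the repeating pattern.
    // It is always offset * 2^k, so copying from `start` preserves the period.
    let mut avail = offset;
    let mut remaining = len;
    while remaining > 0 {
        let n = avail.min(remaining);
        // Source range lies entirely in already-written bytes, so this is a
        // plain non-overlapping copy.
        out.extend_from_within(start..start + n);
        avail += n;
        remaining -= n;
    }
}
```

For a match of length `len` this takes about log2(len / offset) copies rather than len / offset, which matters most for tiny offsets such as run-length-style matches with offset 1.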

Testing

(bench) add encode_loop_z000033 example for clean encoder profiles (#297)
(decode) decode_loop binary — add --mode ffi and --corpus path (#288)


27 May 10:54

v0.0.26

Added

(encoding+decoding) FrameEmitInfo + opt-in per-block XXH64 sidecar (#272)
(decoding) skippable-payload visitor callback on FrameDecoder (#271)

Documentation

(#176) Skippable Frame Magic Allocations registry (#270)

Performance

(decode) drop inline_never on repcode resolver, keep cold attr (#281)
(encoding) HUF_flags_preferRepeat for fast strategies + small literals (#23 G6) (#278)
(decoding) mirror donor ddictIsCold signal for pipelined dispatch (#274)
(fse) rewrite build_decoding_table per donor ZSTD_buildFSETable_body shape (#276)
(decode) route short-block fallback through inline executor (z000033 −25%) (#269)
(decode) cache predefined FSE tables (small-4k-log-lines −69%) (#268)
(decode) inline sequence executor for direct path + auto-route decode_all (z000033 −24%, high-entropy-1m parity) (#263)
(bench) pre-touch decompress output Vec to kill page-fault artifact (#260)
(decode) bump WILDCOPY_OVERLENGTH 16 → 32 for AVX2 chunked kernel reach (#261)
(decode) RingBuffer % cap → branchless wrap helper (kills divl on i686, divq on x86_64) (#255)
(decode) #247 Part 2 — kill divb in repeat_short_offset + force-inline UserSliceBackend::extend (#254)
(decode) #247 Part 1 — expand FSE Entry to ZSTD_seqSymbol shape (#252)
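As an illustration of the "`% cap` → branchless wrap helper" item (#255): when a ring-buffer index is known to be less than twice the capacity, the modulo can be replaced by a conditional subtraction expressed with a mask, which avoids a hardware divide. The sketch below assumes that precondition; `wrap_index` is a hypothetical name and the crate's helper may differ in shape.

```rust
/// Wrap `idx` into `0..cap` without a division.
/// Precondition (assumed for this sketch): `cap > 0` and `idx < 2 * cap`.
#[inline(always)]
fn wrap_index(idx: usize, cap: usize) -> usize {
    debug_assert!(cap > 0 && idx < cap.saturating_mul(2));
    // All-ones when idx >= cap, all-zeros otherwise.
    let mask = ((idx >= cap) as usize).wrapping_neg();
    // Subtract `cap` only when the mask is set.
    idx - (cap & mask)
}
```

The precondition holds naturally when the index is formed as an in-range position plus an advance of at most `cap`; for arbitrary indices a true modulo (or a power-of-two mask) is still needed.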

Testing

(bench) unblock dict-driven bench matrix + add pure_rust_with_dict compress arm (#277)

Harden

(decode) fallible BufferBackend writes for RLE/Raw direct path (#251)


24 May 20:25

v0.0.25

Added

(skippable) typed SkippableFrame API behind lsm feature (#248)
(decode) expect_dict_id + expect_window_descriptor setters on FrameDecoder (#249)

Performance

(decode) direct-write decode_to_slice path (#244) (#245)


24 May 03:13

v0.0.24

Performance

(decode) pack HUF decode table as u16 (donor HUF_DEltX1 layout) (#243)
(decode) collapse HUF 4-stream burst-gate to single-cursor olimit (#238)
(encoder) align lazy-band target_len with donor clevels.h table[0] (#239)
(fast) donor-parity hot-path cleanups in Fast kernel + inline short-literal append (#220 follow-up) (#231)

Testing

(encoding) donor-parity comparator for block splitter port (#240)
(bench) add decompress-dict group to compare_ffi (#236)


23 May 18:23

v0.0.23

Added

(frame) magicless frame format support (#26) (#222)
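For readers unfamiliar with the term: a standard Zstandard frame begins with the 4-byte magic number 0xFD2FB528 stored little-endian, and a magicless frame is the same frame with those four bytes omitted, so the receiver has to know the format out of band. The sketch below only shows the relationship between the two layouts; `ZSTD_MAGIC` and `strip_magic` are hypothetical names for illustration and are not this crate's API.

```rust
/// Zstandard frame magic 0xFD2FB528 as it appears on the wire (little-endian).
const ZSTD_MAGIC: [u8; 4] = [0x28, 0xB5, 0x2F, 0xFD];

/// Return the frame body after the magic number, i.e. the bytes a
/// magicless frame would consist of, or `None` if the magic is absent.
fn strip_magic(frame: &[u8]) -> Option<&[u8]> {
    frame.strip_prefix(&ZSTD_MAGIC[..])
}
```

Because a magicless frame carries no self-identifying prefix, a decoder cannot sniff the format from the data and must be told explicitly which layout to expect.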

Performance

(row) drop eager block-boundary inserts (#180) (#232)
(decode) 8-slot software prefetch pipeline for sequence execution (#208) (#227)
(encoder) port donor ZSTD_compressBlock_fast — 4-cursor + per-level cParams + cmov + window-correctness (#198 phase 3) (#229)
(decoding) integrate AVX2 unroll-2 wildcopy candidate (#108) (#223)
(encoder) wire donor-shape Fast kernel into MatchGeneratorDriver (#198 phase 1b) (#217)
(encoder) donor-shape Fast kernel modules (#198 phase 1a) (#215)
(fse) elide bounds check on init_state + update_state decode reads (#214)
(decode) SIMD-16 fast path for short offsets {1, 2, 4} (#213)
(decode) const-generic HUF kernel monomorphisation for SIMD-fallback (#212)
(decode) port donor HUF 4-stream burst with sentinel-bit ctz (#201)

Testing

(hc) cross-slice boundary position seeding regression test (#235)


19 May 22:50

v0.0.22

Added

(encoder) strategy-aware literal gates (G4 + G5) (#182)
(bench) #99 Rust↔FFI sequence-stream comparator (#149)

Documentation

bump quick-start dep version to 0.0.21 + drop legacy ruzstd Changelog.md (#155)

Performance

(decode) pack LL/ML metadata + hot-path micro-opts (#197)
(decode) fused sequence executor + SIMD/FSE hot-path cleanup + DoS-safe rollback (#194)
(encoder) align Fast strategy window_log with donor (17 → 19) (#187)
(encoder) inline hash-chain walk into hash_chain_candidate (lazy L1) (#185)
(encoder) G3 — whole-block bail-out before partition split (#181)
(encoder) donor-parity greedy parse at L4 — ratio + speed win (#179)
(huff0) cache encoded weight-description bytes on HuffmanTable and reuse in emit path (#170)
(huff0) #167 cheap entropy proxy for table_log selection — no FSE-encode per candidate (#168)
(fse) donor FSE_buildCTable_wksp parity — drop per-symbol Vec (#166)
(fse) replace next_state linear search with donor-parity flat tables + tune CI bench budgets (#165)
