Releases: structured-world/structured-zstd
v0.0.31
Added
- (wasm) streaming compress ZstdCompressStream + align decoder window cap to zstd default (#367)
- (wasm) simd128 kernel tier + npm package; LDM bench matrix variants (#364)
- (encode) configurable compression parameters API (#360)
Harden
- (decode) uniform FrameContentSizeMismatch on Compressed overshoot (#363)
v0.0.30
v0.0.29
Added
- (bench) report active CPU kernel tier + fix decode_loop corpus masking (#351)
Documentation
- (readme) fix btultra/btultra2 level boundary in strategy table (#334)
Fixed
- (decode) bound per-block output (decompression-bomb OOM) + L19 decode gap (#350)
Performance
- (codec) per-strategy block-split levels + per-tier exec-macro seq monolith (#354)
- (codec) warm dictionary reuse (decode + encode) + relocate decompression-bomb guard (#352)
- (encode) store row match positions as u32 to cut peak memory (#346)
- (encode) right-size dfast/row/hash-chain tables for small inputs (#345)
- (encode) price the custom FSE table without building it (#344)
- (encode) cut small-input fixed cost on the fast levels (#342)
- (encode) hoist hash/chain fill rebase guard out of insert loop (#340)
- (encode) rebase matcher floor on reset instead of zeroing tables (#338)
- (encode) donor-correct block-split + block-precise decode errors (#336)
- (encode) decouple and monomorphize the row matcher (#335)
- (encode) close level-5 greedy speed gap (row match-span cap) (#331)
- (encode) donor row-0 params for Fast negative levels (#330)
- (encode) size-gated dictionary match-finding for the Fast strategy (#328)
- (dict) greedy set-cover segment selection in fastcover trainer (#324)
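A rough illustration of two ideas named in the entries above: keeping match positions as u32 (#346) and rebasing a floor on reset instead of zeroing tables (#338). This is a minimal sketch under stated assumptions, not the crate's code: `PositionTable`, its fields, and its methods are hypothetical names, and the real matcher tables are organised differently.

```rust
/// Hypothetical table of match positions (illustration only).
/// Positions are stored as u32, which takes half the space of a
/// usize slot on 64-bit targets.
struct PositionTable {
    slots: Vec<u32>,
    /// Stored positions below `floor` belong to an earlier input and are stale.
    floor: u32,
    /// Next absolute position to hand out.
    next: u32,
}

impl PositionTable {
    fn new(size: usize) -> Self {
        // Positions start at 1 so the zero-initialised slots read as stale.
        Self { slots: vec![0; size], floor: 1, next: 1 }
    }

    /// Record the current position under `hash`. Returns the previous
    /// candidate (relative to the floor) if it is still valid.
    fn insert(&mut self, hash: usize) -> Option<u32> {
        let slot = hash % self.slots.len();
        let prev = self.slots[slot];
        self.slots[slot] = self.next;
        self.next += 1;
        (prev >= self.floor).then(|| prev - self.floor)
    }

    /// Reset without touching the slots: raise the floor past every
    /// position handed out so far, which invalidates all old entries.
    fn reset(&mut self) {
        if self.next > u32::MAX / 2 {
            // Assumed safety valve for this sketch: clear for real
            // before absolute positions can overflow.
            self.slots.fill(0);
            self.next = 1;
        }
        self.floor = self.next;
    }
}
```

In this sketch, reset costs O(1) in the common case, so many small inputs compressed back to back do not each pay for clearing the whole table.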
v0.0.28
Added
- (decode) kernel_* cargo features for per-tier CPU kernel selection (#307)
Performance
- (encode) borrow one-shot input as Fast match window (#318)
- (encode) raw-pointer reads in Fast kernel backward extension (#321)
- (decode) route RLE-mode sequence tables through the fused path (#320)
- (encode) drop redundant per-match offset coding in Fast matcher (#319)
- (encode) dedup window<->history live-region storage (#317)
- (encode) map level 19 to btultra2 + align bench labels to clevels.h (#315)
- (encode) skip decoder-table build when loading a dict for encoding (#314)
- (decode) route bulk non-overlapping copies through memcpy (ERMS) (#309)
- (encode) vectorize the Row match-finder tag scan (#305)
- (encode) lower row-matcher min match to 5 (donor parity) (#310)
- (encode) align levels 13-15 to reference search budget (#302)
- (decode) unify SIMD-copy capability detection under the CpuKernel tag (#304)
Bench
- (dict) report Rust dict-compressed size + emit compress-dict ratio (#301)
v0.0.27
Performance
- (decode) expand overlapping matches by doubling, not offset chunks (#300)
- (decode) inline donor match-copy on all targets via portable wildcopy (#299)
- (decode) unroll HUF 4-stream burst inner loop via const-generic (#296)
- (decode) skip zero-init of literals target — HUF overwrites everything (#295)
- (fse) kill iterator overhead in build_decoding_table, write decode via set_len (#293)
- (decode) straight-loop short path + donor-gated lookahead ring + SeqSymbol repack (#289)
- (decode) skip post-decode XXH64 when content_checksum_flag is 0 (#287)
- (decode) per-tier x86 kernel split + AVX2 32-byte match-copy (#279 Phase 3+4) (#285)
- (decode) lazy ring-buffer allocation for direct-eligible frames (#282)
- (decode) drop dead seq.of from pipelined ring slots (#283)
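The "expand overlapping matches by doubling" entry (#300) refers to a general LZ77-style technique that can be sketched as follows. This is a safe-Rust illustration under assumptions, not the decoder's implementation: `copy_match_doubling` is a hypothetical name, and the output buffer is modelled as a plain `Vec<u8>`.

```rust
/// Append `len` bytes to `out` by repeating the pattern formed by its
/// last `offset` bytes. When `offset < len`, source and destination
/// overlap, so a single bulk copy is not enough.
fn copy_match_doubling(out: &mut Vec<u8>, offset: usize, len: usize) {
    assert!(offset >= 1 && offset <= out.len(), "offset must point into existing output");
    let start = out.len() - offset;
    let end = out.len() + len;
    out.reserve(len);
    // Every pass copies all bytes produced since `start`, so the copied
    // span grows as offset, 2*offset, 4*offset, ... instead of advancing
    // by a fixed `offset`-sized chunk each time.
    while out.len() < end {
        let available = out.len() - start;
        let n = available.min(end - out.len());
        out.extend_from_within(start..start + n);
    }
}
```

For example, starting from `abc` with `offset = 2` and `len = 7`, the passes copy 2, 4, and 1 bytes and yield `abcbcbcbcb`. The number of copy calls grows logarithmically with `len / offset` rather than linearly.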
Testing
v0.0.26
Added
- (encoding+decoding) FrameEmitInfo + opt-in per-block XXH64 sidecar (#272)
- (decoding) skippable-payload visitor callback on FrameDecoder (#271)
Documentation
Performance
- (decode) drop inline_never on repcode resolver, keep cold attr (#281)
- (encoding) HUF_flags_preferRepeat for fast strategies + small literals (#23 G6) (#278)
- (decoding) mirror donor ddictIsCold signal for pipelined dispatch (#274)
- (fse) rewrite build_decoding_table per donor ZSTD_buildFSETable_body shape (#276)
- (decode) route short-block fallback through inline executor (z000033 −25%) (#269)
- (decode) cache predefined FSE tables (small-4k-log-lines −69%) (#268)
- (decode) inline sequence executor for direct path + auto-route decode_all (z000033 −24%, high-entropy-1m parity) (#263)
- (bench) pre-touch decompress output Vec to kill page-fault artifact (#260)
- (decode) bump WILDCOPY_OVERLENGTH 16 → 32 for AVX2 chunked kernel reach (#261)
- (decode) RingBuffer % cap → branchless wrap helper (kills divl on i686, divq on x86_64) (#255)
- (decode) #247 Part 2 — kill divb in repeat_short_offset + force-inline UserSliceBackend::extend (#254)
- (decode) #247 Part 1 — expand FSE Entry to ZSTD_seqSymbol shape (#252)
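The ring-buffer entry (#255) replaces a `% cap` wrap with a branchless helper. One common way to write such a helper is sketched below; `wrap_index` is a hypothetical name, the precondition is an assumption of this sketch, and whether the generated code is actually branch-free depends on the compiler and target.

```rust
/// Wrap `idx` into `0..cap` without a division.
/// Assumed precondition: `idx < 2 * cap`, which holds when an in-range
/// index has been advanced by at most `cap`.
#[inline]
fn wrap_index(idx: usize, cap: usize) -> usize {
    debug_assert!(cap > 0 && idx < 2 * cap);
    // `mask` is all ones when idx >= cap, and zero otherwise.
    let mask = ((idx >= cap) as usize).wrapping_neg();
    // Subtract `cap` only when the index ran past the end.
    idx - (cap & mask)
}
```

Unlike a power-of-two bit mask, this form works for any capacity, and it avoids the hardware divide that `idx % cap` needs when `cap` is not a compile-time constant.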
Testing
- (bench) unblock dict-driven bench matrix + add pure_rust_with_dict compress arm (#277)
Harden
- (decode) fallible BufferBackend writes for RLE/Raw direct path (#251)
v0.0.25
v0.0.24
Performance
- (decode) pack HUF decode table as u16 (donor HUF_DEltX1 layout) (#243)
- (decode) collapse HUF 4-stream burst-gate to single-cursor olimit (#238)
- (encoder) align lazy-band target_len with donor clevels.h table[0] (#239)
- (fast) donor-parity hot-path cleanups in Fast kernel + inline short-literal append (#220 follow-up) (#231)
Testing
v0.0.23
Added
Performance
- (row) drop eager block-boundary inserts (#180) (#232)
- (decode) 8-slot software prefetch pipeline for sequence execution (#208) (#227)
- (encoder) port donor ZSTD_compressBlock_fast — 4-cursor + per-level cParams + cmov + window-correctness (#198 phase 3) (#229)
- (decoding) integrate AVX2 unroll-2 wildcopy candidate (#108) (#223)
- (encoder) wire donor-shape Fast kernel into MatchGeneratorDriver (#198 phase 1b) (#217)
- (encoder) donor-shape Fast kernel modules (#198 phase 1a) (#215)
- (fse) elide bounds check on init_state + update_state decode reads (#214)
- (decode) SIMD-16 fast path for short offsets {1, 2, 4} (#213)
- (decode) const-generic HUF kernel monomorphisation for SIMD-fallback (#212)
- (decode) port donor HUF 4-stream burst with sentinel-bit ctz (#201)
Testing
- (hc) cross-slice boundary position seeding regression test (#235)
v0.0.22
Added
- (encoder) strategy-aware literal gates (G4 + G5) (#182)
- (bench) #99 Rust↔FFI sequence-stream comparator (#149)
Documentation
- bump quick-start dep version to 0.0.21 + drop legacy ruzstd Changelog.md (#155)
Performance
- (decode) pack LL/ML metadata + hot-path micro-opts (#197)
- (decode) fused sequence executor + SIMD/FSE hot-path cleanup + DoS-safe rollback (#194)
- (encoder) align Fast strategy window_log with donor (17 → 19) (#187)
- (encoder) inline hash-chain walk into hash_chain_candidate (lazy L1) (#185)
- (encoder) G3 — whole-block bail-out before partition split (#181)
- (encoder) donor-parity greedy parse at L4 — ratio + speed win (#179)
- (huff0) cache encoded weight-description bytes on HuffmanTable and reuse in emit path (#170)
- (huff0) #167 cheap entropy proxy for table_log selection, no FSE-encode per candidate (#168)
- (fse) donor FSE_buildCTable_wksp parity — drop per-symbol Vec (#166)
- (fse) replace next_state linear search with donor-parity flat tables + tune CI bench budgets (#165)