docs: add comprehensive binary encoding format documentation #899
Add detailed specifications for components critical to pure JavaScript implementations:
- encoding-xxhash32.md: Complete xxHash32 algorithm with JS code and test vectors
- encoding-lz4.md: LZ4 Frame format specification with decompression algorithm
- encoding-container-states.md: Container state snapshot formats for Map, List, Text (Richtext), Tree, and MovableList containers
Update encoding.md with:
- Links to supplementary documentation in table of contents
- Cross-references in relevant sections (checksum, state encoding, compression)
- New "Supplementary Documentation" section
- Extended implementation checklist with container state items
Address two critical documentation gaps:
1. ContainerType postcard serde mapping:
- Document that postcard serialization uses a DIFFERENT historical mapping
than ContainerID.to_bytes() (e.g., Text=0 vs Text=2)
- This affects Option<ContainerID> decoding in ContainerWrapper.parent
- Added comparison table in both encoding.md and encoding-container-states.md
2. Op prop field semantics:
- Document how the `prop` column value is computed per container/op type
- List/Text: position for Insert/Delete, start for StyleStart
- MovableList: position for Insert/Delete, target for Move
- Map: key index into keys arena
- Tree: always 0
These were blocking issues for implementing a complete decoder.
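The `prop` rules listed above can be sketched as a single lookup. This is only an illustration: the op object shape (`container`, `kind`, `pos`, `start`, `target`, `key`) and the `keyIndexOf` callback are hypothetical names chosen for this example, not identifiers taken from Loro or from the documentation in this PR.

```javascript
// Sketch of the per-container `prop` rules from the commit message.
// All field names on `op` and the `keyIndexOf` helper are assumptions.
function computeProp(op, keyIndexOf) {
  switch (op.container) {
    case "List":
    case "Text":
      // Insert/Delete use the position; StyleStart uses the start offset.
      if (op.kind === "Insert" || op.kind === "Delete") return op.pos;
      if (op.kind === "StyleStart") return op.start;
      break;
    case "MovableList":
      // Insert/Delete use the position; Move uses the target.
      if (op.kind === "Insert" || op.kind === "Delete") return op.pos;
      if (op.kind === "Move") return op.target;
      break;
    case "Map":
      // Map ops store the index of the key in the keys arena.
      return keyIndexOf(op.key);
    case "Tree":
      // Tree ops always store 0.
      return 0;
  }
  throw new Error(`prop rule not covered here: ${op.container}/${op.kind}`);
}

// Usage with a toy keys arena (hypothetical data).
const keys = ["title", "author"];
const propForMapOp = computeProp(
  { container: "Map", kind: "Set", key: "author" },
  (k) => keys.indexOf(k)
);
```

Op kinds the commit message does not mention fall through to the error, so gaps stay visible rather than being guessed at.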
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a795a48fbf
// Empty with LORO seed
xxHash32(new Uint8Array([]), LORO_SEED) === 0x30CFEAB0 // 819588784
Correct LORO-seed xxHash32 test vectors
The LORO-seed vectors listed here don’t match the algorithm defined just above (and the xxHash32 implementation used by Loro). If you run the provided JS algorithm with LORO_SEED = 0x4F524F4C, the empty-input hash comes out as 0xDC3BF95A, not 0x30CFEAB0 (and the other LORO-seed vectors in this block are similarly off). This inconsistency will cause implementers to think their encoder/decoder is wrong even when it matches Loro’s checksum behavior.
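One way to re-check the disputed vectors is to run them through an independent implementation. The sketch below is a plain JavaScript version of the standard XXH32 algorithm, written for this comment and not copied from encoding-xxhash32.md. `LORO_SEED = 0x4F524F4C` is the value quoted above. The only expected value asserted here is the well-known seed-0 hash of empty input (`0x02CC5D05`); the LORO-seed result is printed rather than asserted.

```javascript
// Standard XXH32 primes.
const PRIME32_1 = 0x9E3779B1;
const PRIME32_2 = 0x85EBCA77;
const PRIME32_3 = 0xC2B2AE3D;
const PRIME32_4 = 0x27D4EB2F;
const PRIME32_5 = 0x165667B1;

const LORO_SEED = 0x4F524F4C; // "LORO" as little-endian ASCII

// 32-bit rotate left.
const rotl = (x, r) => (x << r) | (x >>> (32 - r));

// Read a little-endian 32-bit word at offset i.
const readU32LE = (b, i) =>
  b[i] | (b[i + 1] << 8) | (b[i + 2] << 16) | (b[i + 3] << 24);

// One accumulator round over a 4-byte lane.
const round = (acc, lane) =>
  Math.imul(rotl((acc + Math.imul(lane, PRIME32_2)) | 0, 13), PRIME32_1);

function xxHash32(data, seed = 0) {
  const len = data.length;
  let i = 0;
  let h;
  if (len >= 16) {
    // Four parallel accumulators over 16-byte stripes.
    let v1 = (seed + PRIME32_1 + PRIME32_2) | 0;
    let v2 = (seed + PRIME32_2) | 0;
    let v3 = seed | 0;
    let v4 = (seed - PRIME32_1) | 0;
    for (; i + 16 <= len; i += 16) {
      v1 = round(v1, readU32LE(data, i));
      v2 = round(v2, readU32LE(data, i + 4));
      v3 = round(v3, readU32LE(data, i + 8));
      v4 = round(v4, readU32LE(data, i + 12));
    }
    h = (rotl(v1, 1) + rotl(v2, 7) + rotl(v3, 12) + rotl(v4, 18)) | 0;
  } else {
    h = (seed + PRIME32_5) | 0;
  }
  h = (h + len) | 0;
  // Remaining 4-byte words, then remaining single bytes.
  for (; i + 4 <= len; i += 4) {
    h = Math.imul(rotl((h + Math.imul(readU32LE(data, i), PRIME32_3)) | 0, 17), PRIME32_4);
  }
  for (; i < len; i++) {
    h = Math.imul(rotl((h + Math.imul(data[i], PRIME32_5)) | 0, 11), PRIME32_1);
  }
  // Final avalanche.
  h ^= h >>> 15;
  h = Math.imul(h, PRIME32_2);
  h ^= h >>> 13;
  h = Math.imul(h, PRIME32_3);
  h ^= h >>> 16;
  return h >>> 0; // unsigned 32-bit result
}

// Sanity check against the standard seed-0 vector for empty input.
console.log(xxHash32(new Uint8Array([]), 0) === 0x02CC5D05);
// Print the LORO-seed value for comparison with the documented vector.
console.log("0x" + xxHash32(new Uint8Array([]), LORO_SEED).toString(16).toUpperCase());
```

If the first line prints `true`, the second line shows what the standard algorithm produces for the empty input with the LORO seed, which can then be compared with both the documented vector and the value reported in this review.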
Add detailed documentation for Loro's binary encoding format including:
Mark incomplete sections with checkboxes for future expansion.
This enables developers to implement Loro-compatible encoders/decoders
in other programming languages.