Context
PR #6650 added Operation::Update.updated_fragment_offsets so partial column-rewrite commits can refresh _row_last_updated_at_version only for matched rows. PR #6748 then exposed this on the Java API.
In Rust the offsets are stored as HashMap<u64, RoaringBitmap>, which is space-efficient. The wire (proto) encoding, however, is currently:
// transaction.proto
message Update {
  // ...
  map<uint64, UInt32List> updated_fragment_offsets = 9;
}
message UInt32List {
  // sorted distinct local physical row offsets within the fragment (0-based)
  repeated uint32 values = 1;
}
i.e. the in-memory bitmap is materialized as a flat Vec<u32> (and on the Java side a long[]) for transport.
Problem
For dense full-fragment column rewrites (the common case for column-level migrations or any whole-table update_columns_with_offsets invocation), the wire size grows linearly with the number of matched rows, because contiguous offsets are written out one by one instead of being compressed.
Concrete example from a dataset (1.6B rows, ~18,880 fragments, avg ~86k rows/fragment, all rows in every fragment match):
| encoding | bytes / fragment | bytes total |
| --- | --- | --- |
| repeated uint32 packed varint (status quo) | ~242 KB | ~4.5 GB |
| serialized RoaringBitmap, portable format | ~30-300 B (single dense run container) | ~5 MB |
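The status-quo row can be checked with a short calculation. The sketch below assumes a fragment of exactly 86,000 rows with every offset from 0 upward present, and counts only the varint payload (the field tag and length prefix add a few more bytes). The function names are illustrative and are not part of Lance.

```rust
/// Number of bytes protobuf needs to varint-encode `v` (7 payload bits per byte).
fn varint_len(v: u32) -> usize {
    let mut len = 1;
    let mut rest = v >> 7;
    while rest != 0 {
        len += 1;
        rest >>= 7;
    }
    len
}

/// Payload size of a packed `repeated uint32` holding the offsets 0..rows.
fn packed_dense_size(rows: u32) -> usize {
    (0..rows).map(varint_len).sum()
}

/// Total payload across `fragments` equally sized dense fragments.
fn total_size(rows_per_fragment: u32, fragments: u64) -> u64 {
    packed_dense_size(rows_per_fragment) as u64 * fragments
}
```

Under these assumptions, `packed_dense_size(86_000)` comes to 241,488 bytes and `total_size(86_000, 18_880)` to roughly 4.56 GB, which matches the table's first row.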
A ~4.5 GB transaction manifest is impractical in several places:
- Lance commit lock has a finite hold time; writing/reading multi-GB manifests pushes us against it.
- pylance / lance-jni readers materialize the full Vec<u32> / long[] in memory before constructing the HashMap<u64, RoaringBitmap>; peak memory is O(total_matched_rows × 8 bytes), not O(serialized_bitmap_bytes).
- Sparse cases (e.g. an operation that touched 10M rows out of 1.6B) work fine, so the bug is invisible until someone hits a dense-rewrite use case.
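The dense case shrinks so dramatically under a run-based encoding because a contiguous range of offsets collapses to its endpoints. The sketch below illustrates only that idea, using a hypothetical `to_runs` helper. It is not the Roaring format itself, which additionally splits values into 16-bit chunks and chooses a container type per chunk.

```rust
/// Collapse sorted, distinct offsets into inclusive (start, end) runs.
fn to_runs(sorted: &[u32]) -> Vec<(u32, u32)> {
    let mut runs: Vec<(u32, u32)> = Vec::new();
    for &v in sorted {
        if let Some(last) = runs.last_mut() {
            // Extend the current run when `v` is the next consecutive offset.
            if last.1.checked_add(1) == Some(v) {
                last.1 = v;
                continue;
            }
        }
        // Otherwise start a new run containing only `v`.
        runs.push((v, v));
    }
    runs
}
```

For a fragment where every row matched, the whole offset list reduces to one run, so the encoded size no longer depends on the row count. For scattered offsets the run count approaches the offset count, which is why sparse updates gain little from runs.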
Proposed change
Drop the flat UInt32List encoding and transmit only the serialized RoaringBitmap on the wire. The existing Rust in-memory representation (HashMap<u64, RoaringBitmap>) already maps 1:1 to this; current code paths spend the round-trip CPU/memory expanding it to/from Vec<u32> for no functional benefit.
message Update {
  // ...
  // Deprecated: writers stop populating; readers fall through to field 10.
  map<uint64, UInt32List> updated_fragment_offsets = 9 [deprecated = true];
  // Preferred. Value is the portable RoaringBitmap serialisation
  // (https://github.com/RoaringBitmap/CRoaring#format-specification),
  // i.e. `RoaringBitmap::serialize_into` / `deserialize_from`.
  map<uint64, bytes> updated_fragment_offsets_roaring = 10;
}
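One way a reader could handle both fields during the transition is sketched below. It assumes that field 10 takes precedence whenever it is present for a fragment, and that field 9 is consulted only for manifests written before the change. `UpdateProto`, `OffsetsSource`, and `offsets_source` are hypothetical names for illustration, and the struct is a simplified stand-in for the generated message (the real field 9 wraps its values in UInt32List).

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the generated `Update` message;
/// only the two offset fields are modelled.
#[derive(Default)]
struct UpdateProto {
    /// Field 9 (deprecated): flat sorted offsets per fragment.
    updated_fragment_offsets: HashMap<u64, Vec<u32>>,
    /// Field 10: portable RoaringBitmap bytes per fragment.
    updated_fragment_offsets_roaring: HashMap<u64, Vec<u8>>,
}

/// Which encoding a reader should decode for a given fragment.
#[derive(Debug, PartialEq)]
enum OffsetsSource<'a> {
    Roaring(&'a [u8]),
    LegacyList(&'a [u32]),
    Absent,
}

/// Prefer the Roaring bytes; fall back to the legacy list for old manifests.
fn offsets_source(update: &UpdateProto, frag_id: u64) -> OffsetsSource<'_> {
    if let Some(bytes) = update.updated_fragment_offsets_roaring.get(&frag_id) {
        return OffsetsSource::Roaring(bytes.as_slice());
    }
    match update.updated_fragment_offsets.get(&frag_id) {
        Some(values) => OffsetsSource::LegacyList(values.as_slice()),
        None => OffsetsSource::Absent,
    }
}
```

Keeping the choice in one function means the legacy branch can be deleted in a single place once no supported writer populates field 9.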
Rust write path:
let bytes: Vec<u8> = {
    // `serialized_size()` is the roaring crate's size query for the portable format.
    let mut buf = Vec::with_capacity(bitmap.serialized_size());
    bitmap.serialize_into(&mut buf)?;
    buf
};
proto.updated_fragment_offsets_roaring.insert(frag_id, bytes);
Read path:
RoaringBitmap::deserialize_from(&bytes[..])?
RoaringBitmap is already a dependency of the crate after PR #6650 — no new third-party additions.