Cursor-based sequential iteration for Index trait #105
Open
antiguru wants to merge 22 commits into frankmcsherry:master
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This reverts commit 3682ffb.
Add type Cursor<'a>: Iterator and fn cursor(Range) to the Index trait. Provide DefaultCursor fallback and impl_default_cursor! macro. Primitive slice/vec impls use core::slice::Iter for bounds-check-free iteration. Slice delegates to inner container's cursor. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
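As a rough illustration of the shape this commit describes, the sketch below defines a toy `Index` trait with a GAT `Cursor`, a `get()`-based `DefaultCursor`, and a slice impl that hands out `core::slice::Iter`. The trait layout, the `Squares` container, and the exact signatures are assumptions made for the example; they are not the crate's actual definitions.

```rust
use std::ops::Range;

/// Simplified stand-in for the `Index` trait (hypothetical, not the crate's definition).
pub trait Index {
    /// Item produced by `get` and by the cursor.
    type Ref;
    /// Sequential iterator whose state may borrow from `&self`.
    type Cursor<'a>: Iterator<Item = Self::Ref>
    where
        Self: 'a;

    fn len(&self) -> usize;
    fn get(&self, index: usize) -> Self::Ref;
    fn cursor(&self, range: Range<usize>) -> Self::Cursor<'_>;

    /// Iterate over every element through the cursor.
    fn index_iter(&self) -> Self::Cursor<'_> {
        self.cursor(0..self.len())
    }
}

/// Fallback cursor: one `get()` call per element.
pub struct DefaultCursor<'a, C: Index + ?Sized> {
    container: &'a C,
    range: Range<usize>,
}

impl<'a, C: Index + ?Sized> Iterator for DefaultCursor<'a, C> {
    type Item = C::Ref;
    fn next(&mut self) -> Option<C::Ref> {
        self.range.next().map(|i| self.container.get(i))
    }
}

/// Hypothetical container with no specialized cursor; it uses the fallback.
pub struct Squares {
    pub len: usize,
}

impl Index for Squares {
    type Ref = usize;
    type Cursor<'a> = DefaultCursor<'a, Self> where Self: 'a;

    fn len(&self) -> usize {
        self.len
    }
    fn get(&self, index: usize) -> usize {
        index * index
    }
    fn cursor(&self, range: Range<usize>) -> Self::Cursor<'_> {
        DefaultCursor { container: self, range }
    }
}

/// Borrowed slices iterate through the standard slice iterator instead of `get()`.
impl<'s, T: Copy> Index for &'s [T] {
    type Ref = T;
    type Cursor<'a> = std::iter::Copied<std::slice::Iter<'s, T>> where Self: 'a;

    fn len(&self) -> usize {
        <[T]>::len(*self)
    }
    fn get(&self, index: usize) -> T {
        self[index]
    }
    fn cursor(&self, range: Range<usize>) -> Self::Cursor<'_> {
        let slice: &'s [T] = *self;
        slice[range].iter().copied()
    }
}
```

Call sites look the same for both containers (`c.index_iter()` or `c.cursor(a..b)`); only the associated `Cursor` type decides whether iteration goes through `get()` or a pointer-based iterator.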
Generate TupleCursorN structs that zip field cursors. Each tuple's cursor() composes field cursors, enabling specialized iteration (e.g. rank-free Repeats) through composite types. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
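The composition idea in this commit can be sketched for the two-field case. The example assumes a pared-down `Index` trait with only a `cursor` method, and `TupleCursor2` is a hand-written, hypothetical stand-in for what the macro would generate.

```rust
use std::ops::Range;

/// Pared-down trait for the example (hypothetical shape).
pub trait Index {
    type Ref;
    type Cursor<'a>: Iterator<Item = Self::Ref>
    where
        Self: 'a;
    fn cursor(&self, range: Range<usize>) -> Self::Cursor<'_>;
}

impl<'s, T: Copy> Index for &'s [T] {
    type Ref = T;
    type Cursor<'a> = std::iter::Copied<std::slice::Iter<'s, T>> where Self: 'a;
    fn cursor(&self, range: Range<usize>) -> Self::Cursor<'_> {
        let slice: &'s [T] = *self;
        slice[range].iter().copied()
    }
}

/// Cursor for a pair: advances both field cursors in lockstep.
pub struct TupleCursor2<A, B> {
    a: A,
    b: B,
}

impl<A: Iterator, B: Iterator> Iterator for TupleCursor2<A, B> {
    type Item = (A::Item, B::Item);
    fn next(&mut self) -> Option<Self::Item> {
        Some((self.a.next()?, self.b.next()?))
    }
}

/// The pair's cursor is built from whatever cursor each field provides,
/// so a specialized field cursor is used without the pair knowing about it.
impl<A: Index, B: Index> Index for (A, B) {
    type Ref = (A::Ref, B::Ref);
    type Cursor<'a> = TupleCursor2<A::Cursor<'a>, B::Cursor<'a>> where Self: 'a;
    fn cursor(&self, range: Range<usize>) -> Self::Cursor<'_> {
        TupleCursor2 {
            a: self.0.cursor(range.clone()),
            b: self.1.cursor(range),
        }
    }
}
```

Because the pair only names `A::Cursor` and `B::Cursor`, swapping a field's container for one with a faster cursor changes the iteration path with no change to the tuple impl.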
RepeatsCursor maintains somes_cursor incrementally — one bit test per element, no rank() calls after the initial seed. LookbacksCursor does the same for oks_cursor, resolving Err back-references without rank(). Reference-to-owned impls (&'a Repeats, &'a Lookbacks, &'a tuple, &'a Slice) use DefaultCursor since their cursor can't outlive intermediate borrows. The hot path (borrowed forms) uses the owned-generic impls with specialized cursors. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
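To make the "seed once, then one bit test per element" idea concrete, here is a minimal sketch under simplifying assumptions: a toy `Repeats` over `u32` values with a plain `Vec<u64>` bitmap and a naive `rank`. The real container, its `RankSelect` structure, and its field names differ; everything below is illustrative only.

```rust
use std::ops::Range;

/// Toy run-compressed column: bit `i` is set when element `i` starts a new value.
pub struct Repeats {
    bits: Vec<u64>,
    values: Vec<u32>, // one entry per set bit
}

impl Repeats {
    pub fn from_slice(items: &[u32]) -> Self {
        let mut bits = vec![0u64; (items.len() + 63) / 64];
        let mut values = Vec::new();
        for (i, &item) in items.iter().enumerate() {
            if values.last() != Some(&item) {
                bits[i / 64] |= 1u64 << (i % 64);
                values.push(item);
            }
        }
        Repeats { bits, values }
    }

    fn bit(&self, index: usize) -> bool {
        (self.bits[index / 64] >> (index % 64)) & 1 == 1
    }

    /// Number of set bits strictly before `index` (naive popcount over words).
    fn rank(&self, index: usize) -> usize {
        let full: usize = self.bits[..index / 64]
            .iter()
            .map(|w| w.count_ones() as usize)
            .sum();
        let rem = index % 64;
        let partial = if rem == 0 {
            0
        } else {
            (self.bits[index / 64] & ((1u64 << rem) - 1)).count_ones() as usize
        };
        full + partial
    }

    /// Random access: pays for a rank computation on every call.
    pub fn get(&self, index: usize) -> u32 {
        self.values[self.rank(index + 1) - 1]
    }

    /// Sequential access: one rank at the start of the range, none afterwards.
    pub fn cursor(&self, range: Range<usize>) -> RepeatsCursor<'_> {
        RepeatsCursor {
            repeats: self,
            somes: self.rank(range.start),
            range,
        }
    }
}

pub struct RepeatsCursor<'a> {
    repeats: &'a Repeats,
    somes: usize, // set bits seen so far, counted from position 0
    range: Range<usize>,
}

impl Iterator for RepeatsCursor<'_> {
    type Item = u32;
    fn next(&mut self) -> Option<u32> {
        let index = self.range.next()?;
        if self.repeats.bit(index) {
            self.somes += 1;
        }
        // Relies on the invariant that element 0 always starts a value.
        debug_assert!(self.somes > 0);
        Some(self.repeats.values[self.somes - 1])
    }
}
```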
Cache the current u64 word from the bitvector so only one fetch per 64 elements instead of per element. Eliminates 1 bounds check + 1 branch (values vs tail) per next() call, replacing them with a branch predicted taken 63/64 times. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
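A hedged sketch of the word-caching technique, separated from the container logic: `BitCursor` is a hypothetical bit iterator over a `&[u64]` that keeps the current word in a field, shifts it once per element, and reloads only when it crosses a 64-bit boundary. It is meant to show the idea, not to mirror the PR's code.

```rust
use std::ops::Range;

/// Iterates the bits of `words` over `range`, reading each word from memory once.
pub struct BitCursor<'a> {
    words: &'a [u64],
    index: usize, // position of the next bit to yield
    end: usize,
    word: u64, // cached word, shifted so the next bit sits at bit 0
}

impl<'a> BitCursor<'a> {
    pub fn new(words: &'a [u64], range: Range<usize>) -> Self {
        assert!(range.end <= words.len() * 64, "range exceeds bitmap");
        let word = if range.start < range.end {
            words[range.start / 64] >> (range.start % 64)
        } else {
            0
        };
        BitCursor { words, index: range.start, end: range.end, word }
    }
}

impl Iterator for BitCursor<'_> {
    type Item = bool;
    fn next(&mut self) -> Option<bool> {
        if self.index >= self.end {
            return None;
        }
        let bit = self.word & 1 == 1;
        self.index += 1;
        if self.index % 64 != 0 {
            // Common case (63 of every 64 calls): stay within the cached word.
            self.word >>= 1;
        } else if self.index < self.end {
            // Word boundary: fetch the next word once.
            self.word = self.words[self.index / 64];
        }
        Some(bit)
    }
}
```

Ranges that start mid-word or straddle indices such as 64 and 128 are the cases most likely to expose an off-by-one in the reload logic, which is why they are worth probing in tests.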
antiguru added commits to antiguru/materialize that referenced this pull request on Apr 16, Apr 17, Apr 20, and Apr 22, 2026, each with the message:
Picks up Index::cursor support (frankmcsherry/columnar#105) needed for efficient Repeats iteration in factorized columns. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Generate a per-type cursor struct that zips field cursors, matching the tuple macro pattern. Struct fields with specialized cursors (e.g. Repeats) are now reached through derived containers.

Bench (100k rows, Repeats key + Vec val):
- derived get(): 1,210 µs (unchanged, baseline)
- derived cursor (before): 1,146 µs (DefaultCursor, same cost)
- derived cursor (after): 99 µs (11.6x faster)
- tuple cursor: 110 µs (reference point)

Ref form (&'columnar #c_ident) keeps DefaultCursor, for the same borrow issue as &'a tuple. Enums still use DefaultCursor.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
index_iter() now returns Self::Cursor<'_> instead of IterOwn<&Self>. Callers get the specialized cursor path automatically — Repeats / Lookbacks / derived structs / tuples compose rank-free or pointer-based iteration without changing call sites. cursor_iter() removed as redundant; migrate to index_iter(). Options round-trip tests adjusted: method resolution now picks Options::index_iter (owned form) via auto-deref, yielding owned Option<T> items instead of Option<&T> via the &Options impl. Test comparisons updated accordingly. into_index_iter(self) / IterOwn<Self> kept unchanged — owned iteration still uses the slow get() path; uncommon hot path. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Derived cursor gains size_hint and ExactSizeIterator (matching the tuple cursor for consistent use with .len() / .zip() callers).
- Word-boundary test coverage: cursor_matches_get now probes indices near 64, 128, 192, 256 to catch regressions in word caching.
- Debug asserts guard against underflow in the None/Err branches (relies on container invariants: Repeats starts with Some, Lookbacks starts with Ok).

Known non-fixes documented in PR review:
- &'a Repeats / &'a Lookbacks / &'a tuple / &'a derived container still use DefaultCursor: lifetime gymnastics in the ref form are deferred. Hot path goes through container.borrow().index_iter() which hits the owned-form specialized cursor.
- into_index_iter(self) keeps IterOwn slow path (owned consumption is uncommon).
- index_iter() return type change is a breaking API change; will need a version bump when released.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
columnar 0.12's index_iter signature (IterOwn<&Self>) required &&[T]: Index for the borrowed-tuple case, which wasn't implemented. Callers had to use into_index_iter() consuming the Copy Borrowed. With cursor-based index_iter, .borrow().index_iter() compiles and uses the composed cursor path — no &&[T]: Index bound needed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Owned iteration had no specialization path: it returned IterOwn<Self>, which always used the slow get()-based path. Callers consumed self unnecessarily; auto-ref to ref-form gave the same slow result as index_iter(). Removing the method forces callers to either borrow (container.borrow().index_iter(), the fast composed cursor) or keep using IterOwn directly via Slice::into_iter for the one in-tree use case (Vec<T>::Columnar impls). IterOwn is preserved as an implementation detail of Slice::into_iter, which is widely used by derived Columnar impls.

Migration:
- columns.into_index_iter() -> columns.borrow().index_iter()
- Where Container: Index bounds don't hold (e.g. non-Copy field containers), borrow() produces a Container with all-Copy inner refs that satisfies Index.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
100k elements, iter-then-sum:
- repeats_get 1094µs
- repeats_cursor 92µs (11.8x)
- lookbacks_get 1123µs
- lookbacks_cursor 121µs (9.2x)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Explain why Ref is deliberately not a GAT (borrowed containers return references at their own lifetime, outliving the &self borrow) while Cursor<'a> is a GAT (iteration state is intrinsically &self-bound). Note the composition principle behind tuple / derived-struct cursors, and the ref-form DefaultCursor fallback. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- bytes.rs: unresolved intra-doc link to `indexed` (needs full path)
- adts/art.rs: bare URL needs angle brackets

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Parallel to ContainerOf<T>, BorrowedOf<'a, T>, Ref<'a, T>. Resolves to the cursor type of the borrowed container: <BorrowedOf<'a, T> as Index>::Cursor<'a>. Useful for naming iterator types in trait associated-type slots, e.g. timely's DrainContainer::DrainIter<'a>. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
`///` doc comments inside `quote!` are literal — `#names` did not substitute, so the generated rustdoc read literally "Container for #names." Switch to `#[doc = ...]` with pre-formatted strings. Fixes container_struct, reference_struct, and cursor_struct. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Callers with only the container in scope (no T: Columnar) can't use the T-based alias. Index is the fundamental abstraction; parameterize the alias on any C: Index. Columnar users compose with BorrowedOf: CursorOf<'a, BorrowedOf<'a, Foo>>. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
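A small sketch of how such an alias might be used to fill an associated-type slot. The pared-down `Index` trait, the `Drain` trait, and `Batch` are all hypothetical names for the example; only the alias body `<C as Index>::Cursor<'a>` comes from the PR text.

```rust
use std::ops::Range;

/// Pared-down trait for the example (hypothetical shape).
pub trait Index {
    type Ref;
    type Cursor<'a>: Iterator<Item = Self::Ref>
    where
        Self: 'a;
    fn cursor(&self, range: Range<usize>) -> Self::Cursor<'_>;
}

impl<'s, T: Copy> Index for &'s [T] {
    type Ref = T;
    type Cursor<'a> = std::iter::Copied<std::slice::Iter<'s, T>> where Self: 'a;
    fn cursor(&self, range: Range<usize>) -> Self::Cursor<'_> {
        let slice: &'s [T] = *self;
        slice[range].iter().copied()
    }
}

/// Names the cursor type of any container `C: Index`.
pub type CursorOf<'a, C> = <C as Index>::Cursor<'a>;

/// Hypothetical downstream trait that must name its iterator type.
pub trait Drain {
    type DrainIter<'a>: Iterator
    where
        Self: 'a;
    fn drain_iter(&mut self) -> Self::DrainIter<'_>;
}

/// Hypothetical batch holding one borrowed column.
pub struct Batch<'s> {
    pub column: &'s [u64],
}

impl<'s> Drain for Batch<'s> {
    // The alias spares the impl from spelling out the concrete iterator type.
    type DrainIter<'a> = CursorOf<'a, &'s [u64]> where Self: 'a;

    fn drain_iter(&mut self) -> Self::DrainIter<'_> {
        self.column.cursor(0..self.column.len())
    }
}
```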
Cursor's lifetime is bound to the container's shell, so the common pattern `self.borrow().index_iter()` creates a cursor borrowing from the temporary Borrowed shell. For callers that need to return an iterator whose lifetime is tied to the borrowed value's inner refs (e.g. timely's DrainContainer), into_index_iter + IterOwn gives them an owned iterator that consumes the Copy Borrowed shell — slow path but semantically correct. Revert callers in src/main.rs and benches/ops.rs to original form. Update CHANGELOG accordingly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
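The lifetime issue described here can be reproduced in miniature. In the sketch below, `Column`, `Borrowed`, `ShellCursor`, and `IterOwn` are hypothetical simplifications: `index_iter` returns a cursor that borrows the `Borrowed` shell, while `into_index_iter` consumes the Copy shell and so depends only on the inner reference.

```rust
use std::ops::Range;

pub struct Column {
    pub data: Vec<u64>,
}

/// Copy "shell" holding references into a `Column`.
#[derive(Clone, Copy)]
pub struct Borrowed<'a> {
    data: &'a [u64],
}

/// Cursor tied to the shell borrow `'c`, not just the inner data `'a`.
pub struct ShellCursor<'c, 'a> {
    shell: &'c Borrowed<'a>,
    range: Range<usize>,
}

impl Iterator for ShellCursor<'_, '_> {
    type Item = u64;
    fn next(&mut self) -> Option<u64> {
        self.range.next().map(|i| self.shell.data[i])
    }
}

/// Owned iterator: holds the shell by value, so only `'a` matters.
pub struct IterOwn<'a> {
    shell: Borrowed<'a>,
    range: Range<usize>,
}

impl Iterator for IterOwn<'_> {
    type Item = u64;
    fn next(&mut self) -> Option<u64> {
        self.range.next().map(|i| self.shell.data[i])
    }
}

impl Column {
    pub fn borrow(&self) -> Borrowed<'_> {
        Borrowed { data: &self.data }
    }
}

impl<'a> Borrowed<'a> {
    pub fn index_iter(&self) -> ShellCursor<'_, 'a> {
        ShellCursor { shell: self, range: 0..self.data.len() }
    }
    pub fn into_index_iter(self) -> IterOwn<'a> {
        IterOwn { range: 0..self.data.len(), shell: self }
    }
}

/// Fine: the temporary shell lives until the end of the statement.
pub fn sum_all(col: &Column) -> u64 {
    col.borrow().index_iter().sum()
}

/// Returning an iterator: `col.borrow().index_iter()` would be rejected here,
/// because the cursor would reference a temporary shell dropped on return.
pub fn iter_all(col: &Column) -> IterOwn<'_> {
    col.borrow().into_index_iter()
}
```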
Problem

`Repeats<TC>::get(index)` calls `RankSelect::rank(index)` on every access, a popcount over multiple `u64` words. For sequential iteration this recomputes bit counts from scratch per element.

The same overhead hits `Lookbacks`. For composite types (tuples, `#[derive(Columnar)]` structs) containing these, the cost multiplies.

Design
Extend `Index` with a GAT cursor:

- Default: `DefaultCursor` (wraps `get()`).
- Primitive slices and vecs: `core::slice::Iter` for pointer-based iteration.
- Tuples and derived structs: generated cursor structs, `TupleCursorN<A::Cursor, B::Cursor, ...>` / `FooContainerCursor<'a, ...>`, that zip field cursors. Specialized cursors propagate through composites automatically.
- `RepeatsCursor` / `LookbacksCursor` seed `somes_cursor` / `oks_cursor` with one `rank()` at range start, then maintain it incrementally: bit test + conditional increment per element.
- Word caching: the current `u64` word of the bitvector is cached, so the backing `&[u64]` is read once per 64 elements.
- `CursorOf<'a, C>` type alias (`<C as Index>::Cursor<'a>`) for naming iterator types in trait associated-type slots.

Trade-offs
- Reference forms (`&'a Repeats`, `&'a Lookbacks`, `&'a tuple`, `&'a Slice`, derive ref form) use `DefaultCursor`. Composed cursors need lifetime gymnastics around intermediate references that don't work through a generic impl. The hot path, `container.borrow().index_iter()`, hits the owned-form specialized cursor.
- The `index_iter` return type changes from `IterOwn<&Self>` to `Self::Cursor<'_>`, which is semver-major. Callers iterating with `for x in c.index_iter()` keep working; callers naming `IterOwn<&Self>` break. Item types can shift from `&T` to `T` depending on method dispatch (reflected in updated sum tests).
- The `container.borrow().index_iter()` pattern has a lifetime limitation. The cursor borrows from the Borrowed shell, which is a temporary in that expression. Callers who need to return an iterator from a method where the Borrowed can't be stashed (e.g. `fn drain(&mut self) -> Self::DrainIter<'_>`) should use `into_index_iter`: it consumes the Copy Borrowed shell, giving an `IterOwn<Borrowed<'a>>` whose lifetime ties to the inner refs. Slow path, but semantically correct. A cleaner fix (cursors holding Copy inner refs directly) would require either specialization or dropping owned-form cursors; out of scope for this PR.
- Enums keep `DefaultCursor`. Composed dispatch across variants with per-element discriminants is non-trivial; not worth the surface-area bump for this PR.

Performance
All numbers from benches in this PR. Machine-dependent; relative improvements are the signal.
`Repeats` / `Lookbacks` direct iteration (100k elements, `benches/repeats_cursor.rs`). Benchmarks: `repeats_get`, `repeats_cursor`, `lookbacks_get`, `lookbacks_cursor`.

Both types eliminate per-element `rank()`; remaining cost is bit shift + one bounds-checked array access into `somes` / `oks`.

Derived struct with a `Repeats` field (100k rows, `benches/derived_cursor.rs`). Benchmarks: `derived_get` (per-elem), `derived_cursor` (composed), `tuple_cursor` (reference).

`derived_cursor` matches `tuple_cursor`: the composed cursor lets derived structs benefit identically to hand-written tuples.

Options/Results (100k elements, `benches/options_cursor.rs`). Benchmarks: `options_get`, `options_cursor`, `results_get`, `results_cursor`.

Options/Results see no change when iterated directly: `get()` cost is dominated by the `somes` / `oks` array access, not `rank()`.

They still benefit indirectly when they appear inside composites: the `Repeats` / `Lookbacks` fields get specialized iteration via the composed cursor.

Profile after word caching (Materialize workload):
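- One `rank()` per cursor construction, zero per element.
- Remaining cost: `somes.get()` bounds-checked array access.

What changed

- `Index` trait: `Cursor` GAT, `cursor()`, `index_iter` unified to use it.
- `DefaultCursor` + `impl_default_cursor!` macro; ~35 container impls adopt default cursor.
- `RepeatsCursor` + `LookbacksCursor` with word caching.
- Tuple macro generates `TupleCursorN` structs.
- `columnar_derive` generates `FooContainerCursor<'a, ...>` for structs (enums keep default).
- `CursorOf<'a, C>` type alias.
- `into_index_iter` kept as slow-path escape hatch for consuming iteration.

🤖 Generated with Claude Code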
Indextrait:CursorGAT,cursor(),index_iterunified to use it.DefaultCursor+impl_default_cursor!macro; ~35 container impls adopt default cursor.RepeatsCursor+LookbacksCursorwith word caching.TupleCursorNstructs.columnar_derivegeneratesFooContainerCursor<'a, ...>for structs (enums keep default).CursorOf<'a, C>type alias.into_index_iterkept as slow-path escape hatch for consuming iteration.🤖 Generated with Claude Code