
feat(vector)!: add approx mode for RaBitQ search #7179

Merged: BubbleCal merged 9 commits into main from yang/approx-mode-rabitq on Jun 10, 2026
@BubbleCal BubbleCal commented Jun 9, 2026

Contributor

Feature

This PR adds a public approx_mode setting for vector search with fast, normal, and accurate values. The public API avoids exposing RaBitQ/HACC terminology while still allowing callers to choose the speed/accuracy tradeoff when the backing index supports it.

Implementation

  • Adds ApproxMode to vector queries and threads it through Rust scanner, ANN proto serialization, Python query parsing, and FlatIndex distance calculator options.
  • Implements RaBitQ behavior for each mode:
    • fast: force 1-bit query-time scoring for RaBitQ.
    • normal: preserve the existing search path.
    • accurate: use a 16-bit LUT accumulation path for the binary RaBitQ estimator.
  • Extends query scratch with a wider accumulation buffer while keeping the existing QueryScratchCapacity::new(...) API compatible.
  • Adds Python API support via approx_mode="fast" | "normal" | "accurate" and rejects invalid values.
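The last bullet's string-to-mode validation can be sketched as follows. This is a minimal illustration, not the code in this PR: the `ApproxMode` enum class and the `parse_approx_mode` helper are hypothetical names, and falling back to `normal` when no mode is given is an assumption based on `normal` preserving the existing search path.

```python
from enum import Enum


class ApproxMode(Enum):
    """Hypothetical Python-side mirror of the three modes named in this PR."""

    FAST = "fast"
    NORMAL = "normal"
    ACCURATE = "accurate"


def parse_approx_mode(value=None):
    """Convert a user-supplied string into an ApproxMode, rejecting anything else.

    Hypothetical helper; the real parsing lives in Lance's Python bindings.
    """
    # Assumption: an omitted mode behaves like "normal", i.e. the existing path.
    if value is None:
        return ApproxMode.NORMAL
    if not isinstance(value, str):
        raise TypeError(f"approx_mode must be a string, got {type(value).__name__}")
    try:
        # Enum lookup by value; matching is exact and case-sensitive.
        return ApproxMode(value)
    except ValueError:
        allowed = ", ".join(repr(m.value) for m in ApproxMode)
        raise ValueError(
            f"invalid approx_mode {value!r}; expected one of {allowed}"
        ) from None
```

Validating at the API boundary means a typo such as "accurat" fails immediately with a clear message rather than silently selecting a different scoring path.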

Benchmark

IVF_RQ on dbpedia, num_bits=5, no refine_factor. The plot shows approx_mode=fast, normal, and accurate as separate curves.

[Figure: IVF_RQ dbpedia approx_mode benchmark, num_bits=5]

Breaking Change

BREAKING CHANGE: The ANN protobuf schema now includes a VectorApproxMode approx_mode field in the serialized vector query. Consumers that regenerate or explicitly match Lance's serialized ANN query proto should update to the new schema.

Validation

  • cargo fmt --all
  • cargo test -p lance-linalg simd::dist_table
  • cargo test -p lance-index vector::bq::storage
  • cargo test -p lance-index vector::storage::tests
  • cargo test -p lance --features substrait test_query_roundtrip
  • cargo test -p lance test_knn_approx_mode_defaults_and_setter
  • cargo check --workspace --tests --benches
  • cd python && make install
  • cd python && uv run pytest python/tests/test_vector_index.py::test_vector_index_with_approx_mode python/tests/test_vector_index.py::test_vector_index_invalid_approx_mode
  • cd python && uv run pytest python/tests/test_vector_index.py::test_create_ivf_rq_multi_bit_searches_l2_and_cosine
  • git diff --check

github-actions Bot commented Jun 9, 2026

Contributor

Important

This PR touches the Lance format specification.

Substantive changes to the format specification — the .proto definitions
and the spec docs under docs/src/format/ — require a PMC vote before merge.
Minor edits such as typo fixes, wording, or formatting are excluded; use your
judgment.

If this is a meaningful format change:

  • Start a vote following the Lance community voting process.
    Format specification modifications need 3 binding +1 votes (excluding the
    proposer), held on GitHub Discussions, with a minimum voting period of 1 week.
  • Once the vote passes, link the completed vote in this PR. It should not be
    merged until the vote is linked.

@github-actions github-actions Bot added the enhancement, A-python, A-index, and A-format labels and removed the enhancement label Jun 9, 2026
@github-actions github-actions Bot added the enhancement label Jun 9, 2026
@github-actions github-actions Bot added the A-java label Jun 9, 2026
codecov Bot commented Jun 9, 2026


@BubbleCal BubbleCal marked this pull request as ready for review June 9, 2026 13:38
Comment thread protos/ann.proto
Comment thread protos/ann.proto
Comment thread protos/ann.proto
@BubbleCal BubbleCal changed the title from "feat(vector): add approx mode for RaBitQ search" to "feat(vector)!: add approx mode for RaBitQ search" Jun 9, 2026
@BubbleCal BubbleCal requested a review from Xuanwo June 9, 2026 15:35
@BubbleCal BubbleCal merged commit e256207 into main Jun 10, 2026
30 checks passed
@BubbleCal BubbleCal deleted the yang/approx-mode-rabitq branch June 10, 2026 06:07

Labels

  • A-format: On-disk format: protos and format spec docs
  • A-index: Vector index, linalg, tokenizer
  • A-java: Java bindings + JNI
  • A-python: Python bindings
  • breaking-change
  • enhancement: New feature or request

2 participants