
Commit 7281998

hailangx, Haiyang Xu, Copilot, and CI Fix authored
Add filter support for VSIM vector search (#1570)
* Add post-filter support for VSIM vector search results

  Implement JSON-path-based filter expressions that are evaluated against vector element attributes after similarity search. The filter engine includes a tokenizer, expression parser, and evaluator supporting comparison operators, logical operators (and/or/not), arithmetic, string equality, containment (in), and parenthesized grouping. Integrate post-filtering into VectorManager for both VSIM code paths, rejecting requests that specify a filter without WITHATTRIBS.

* fix format
* Update libs/server/Resp/Vector/VectorManager.cs

  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Avoid per-result byte array allocation in EvaluateFilter (#1571)
  * Initial plan
  * Avoid per-result allocation in EvaluateFilter by using Utf8JsonReader with ParseValue

  Co-authored-by: hailangx <3389245+hailangx@users.noreply.github.com>

  ---------

  Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
  Co-authored-by: hailangx <3389245+hailangx@users.noreply.github.com>

* VSIM FILTER works without WITHATTRIBS by fetching attributes internally (#1572)
  * Initial plan
  * Fetch attributes internally for filtering when not returning them

  Co-authored-by: hailangx <3389245+hailangx@users.noreply.github.com>

  ---------

  Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
  Co-authored-by: hailangx <3389245+hailangx@users.noreply.github.com>

* optimize code
* add Supported vector filter syntax
* update doc with syntax
* fix build
* update test with ELE style syntax
* split the filter engine tests
* remove object value type
* remove object-returning property
* fix format error
* resolve comments
* refactor to stack-based postfix
* remove hot path allocate
* Fix formatting: remove trailing newlines per editorconfig

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* optimize allocate
* fix test
* reusable evaluation stack with default capacity (16)
* use stack with default 16 capacity to minimize the allocation as most of the filters should just fit in
* add more tests for attribute extractor
* add benchmark
* optimize the allocate
* organize the benchmarks, ordered by real-world frequency
* refactor
* avoid memory copy for final results output
* refactor to all use slicing for JSON extraction
* Single-pass extraction for all fields
* refactor to all use ReadOnlySpan<byte>
* removing the filter folder
* refactor to no heap allocation
* All buffers are stackalloc'd on the thread stack
* clean up unused methods
* fix format
* use maxFilteringEffort for filtering
* use ArrayPool instead of stackalloc
* update the design doc
* update design doc
* use scratchBufferBuilder
* move ScratchBufferBuilder up

---------

Co-authored-by: Haiyang Xu <haixu@microsoft.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: CI Fix <fix@ci.local>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
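The commit message mentions a refactor to stack-based postfix evaluation with a reusable stack of default capacity 16. The following is a minimal sketch of that general technique, not the code from this commit: the class name `PostfixFilterSketch`, the `.field` token syntax, and the string/dictionary inputs are all assumptions made for illustration. According to the commit message, the actual engine works on `ReadOnlySpan<byte>` JSON attributes and avoids heap allocation, which this sketch does not attempt.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Illustrative only: hypothetical names, not the Garnet filter engine.
public static class PostfixFilterSketch
{
    // Evaluates a filter whose tokens are already in postfix (reverse Polish) order.
    // Operands are numeric literals or ".field" attribute references (assumed syntax).
    // Malformed input (too few operands) surfaces as InvalidOperationException from Pop().
    public static bool Evaluate(string[] postfix, IReadOnlyDictionary<string, double> attributes)
    {
        // Initial capacity 16 echoes the default mentioned in the commit message.
        var stack = new Stack<double>(16);
        foreach (var token in postfix)
        {
            switch (token)
            {
                case "not":
                    stack.Push(stack.Pop() == 0 ? 1 : 0);
                    break;
                case "and": case "or": case ">": case ">=": case "<": case "<=": case "==":
                    {
                        // The right operand was pushed last, so it is popped first.
                        double right = stack.Pop();
                        double left = stack.Pop();
                        stack.Push(Apply(token, left, right) ? 1 : 0);
                        break;
                    }
                default:
                    stack.Push(Operand(token, attributes));
                    break;
            }
        }
        // A well-formed expression leaves exactly one value on the stack.
        return stack.Count == 1 && stack.Pop() != 0;
    }

    private static double Operand(string token, IReadOnlyDictionary<string, double> attributes)
    {
        if (token.StartsWith(".", StringComparison.Ordinal))
        {
            // A missing attribute becomes NaN, so <, <=, >, >= and == all evaluate to false.
            return attributes.TryGetValue(token.Substring(1), out var value) ? value : double.NaN;
        }
        return double.Parse(token, CultureInfo.InvariantCulture);
    }

    private static bool Apply(string op, double left, double right) => op switch
    {
        "and" => left != 0 && right != 0,
        "or" => left != 0 || right != 0,
        ">" => left > right,
        ">=" => left >= right,
        "<" => left < right,
        "<=" => left <= right,
        "==" => left == right,
        _ => throw new ArgumentException($"Unknown operator: {op}"),
    };
}
```

Under these assumptions, an infix filter such as `.year > 1950 and .rating >= 4` would be passed as the postfix token sequence `.year 1950 > .rating 4 >= and`. Evaluating postfix needs no recursion and only one stack, which is one plausible reason for preferring it on a hot path.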
1 parent 1b695ef commit 7281998

19 files changed: +4703 -43 lines changed

benchmark/BDN.benchmark/Filter/FilterExpressionBenchmarks.cs

Lines changed: 551 additions & 0 deletions
Large diffs are not rendered by default.

libs/server/API/GarnetApi.cs

Lines changed: 4 additions & 4 deletions
@@ -520,12 +520,12 @@ public unsafe GarnetStatus VectorSetRemove(ArgSlice key, ArgSlice element)
     => storageSession.VectorSetRemove(SpanByte.FromPinnedPointer(key.ptr, key.length), SpanByte.FromPinnedPointer(element.ptr, element.length));
 
 /// <inheritdoc />
-public unsafe GarnetStatus VectorSetValueSimilarity(ArgSlice key, VectorValueType valueType, ArgSlice values, int count, float delta, int searchExplorationFactor, ArgSlice filter, int maxFilteringEffort, bool includeAttributes, ref SpanByteAndMemory outputIds, out VectorIdFormat outputIdFormat, ref SpanByteAndMemory outputDistances, ref SpanByteAndMemory outputAttributes, out VectorManagerResult result)
-    => storageSession.VectorSetValueSimilarity(SpanByte.FromPinnedPointer(key.ptr, key.length), valueType, values, count, delta, searchExplorationFactor, filter.ReadOnlySpan, maxFilteringEffort, includeAttributes, ref outputIds, out outputIdFormat, ref outputDistances, ref outputAttributes, out result);
+public unsafe GarnetStatus VectorSetValueSimilarity(ArgSlice key, VectorValueType valueType, ArgSlice values, int count, float delta, int searchExplorationFactor, ArgSlice filter, int maxFilteringEffort, bool includeAttributes, ref SpanByteAndMemory outputIds, out VectorIdFormat outputIdFormat, ref SpanByteAndMemory outputDistances, ref SpanByteAndMemory outputAttributes, out VectorManagerResult result, ref SpanByteAndMemory filterBitmap)
+    => storageSession.VectorSetValueSimilarity(SpanByte.FromPinnedPointer(key.ptr, key.length), valueType, values, count, delta, searchExplorationFactor, filter.ReadOnlySpan, maxFilteringEffort, includeAttributes, ref outputIds, out outputIdFormat, ref outputDistances, ref outputAttributes, out result, ref filterBitmap);
 
 /// <inheritdoc />
-public unsafe GarnetStatus VectorSetElementSimilarity(ArgSlice key, ArgSlice element, int count, float delta, int searchExplorationFactor, ArgSlice filter, int maxFilteringEffort, bool includeAttributes, ref SpanByteAndMemory outputIds, out VectorIdFormat outputIdFormat, ref SpanByteAndMemory outputDistances, ref SpanByteAndMemory outputAttributes, out VectorManagerResult result)
-    => storageSession.VectorSetElementSimilarity(SpanByte.FromPinnedPointer(key.ptr, key.length), element.ReadOnlySpan, count, delta, searchExplorationFactor, filter.ReadOnlySpan, maxFilteringEffort, includeAttributes, ref outputIds, out outputIdFormat, ref outputDistances, ref outputAttributes, out result);
+public unsafe GarnetStatus VectorSetElementSimilarity(ArgSlice key, ArgSlice element, int count, float delta, int searchExplorationFactor, ArgSlice filter, int maxFilteringEffort, bool includeAttributes, ref SpanByteAndMemory outputIds, out VectorIdFormat outputIdFormat, ref SpanByteAndMemory outputDistances, ref SpanByteAndMemory outputAttributes, out VectorManagerResult result, ref SpanByteAndMemory filterBitmap)
+    => storageSession.VectorSetElementSimilarity(SpanByte.FromPinnedPointer(key.ptr, key.length), element.ReadOnlySpan, count, delta, searchExplorationFactor, filter.ReadOnlySpan, maxFilteringEffort, includeAttributes, ref outputIds, out outputIdFormat, ref outputDistances, ref outputAttributes, out result, ref filterBitmap);
 
 /// <inheritdoc/>
 public unsafe GarnetStatus VectorSetEmbedding(ArgSlice key, ArgSlice element, ref SpanByteAndMemory outputDistances)

libs/server/API/GarnetWatchApi.cs

Lines changed: 4 additions & 4 deletions
@@ -650,17 +650,17 @@ public bool ResetScratchBuffer(int offset)
 
 #region Vector Sets
 /// <inheritdoc/>
-public GarnetStatus VectorSetValueSimilarity(ArgSlice key, VectorValueType valueType, ArgSlice value, int count, float delta, int searchExplorationFactor, ArgSlice filter, int maxFilteringEffort, bool includeAttributes, ref SpanByteAndMemory outputIds, out VectorIdFormat outputIdFormat, ref SpanByteAndMemory outputDistances, ref SpanByteAndMemory outputAttributes, out VectorManagerResult result)
+public GarnetStatus VectorSetValueSimilarity(ArgSlice key, VectorValueType valueType, ArgSlice value, int count, float delta, int searchExplorationFactor, ArgSlice filter, int maxFilteringEffort, bool includeAttributes, ref SpanByteAndMemory outputIds, out VectorIdFormat outputIdFormat, ref SpanByteAndMemory outputDistances, ref SpanByteAndMemory outputAttributes, out VectorManagerResult result, ref SpanByteAndMemory filterBitmap)
 {
     garnetApi.WATCH(key, StoreType.Main);
-    return garnetApi.VectorSetValueSimilarity(key, valueType, value, count, delta, searchExplorationFactor, filter, maxFilteringEffort, includeAttributes, ref outputIds, out outputIdFormat, ref outputDistances, ref outputAttributes, out result);
+    return garnetApi.VectorSetValueSimilarity(key, valueType, value, count, delta, searchExplorationFactor, filter, maxFilteringEffort, includeAttributes, ref outputIds, out outputIdFormat, ref outputDistances, ref outputAttributes, out result, ref filterBitmap);
 }
 
 /// <inheritdoc/>
-public GarnetStatus VectorSetElementSimilarity(ArgSlice key, ArgSlice element, int count, float delta, int searchExplorationFactor, ArgSlice filter, int maxFilteringEffort, bool includeAttributes, ref SpanByteAndMemory outputIds, out VectorIdFormat outputIdFormat, ref SpanByteAndMemory outputDistances, ref SpanByteAndMemory outputAttributes, out VectorManagerResult result)
+public GarnetStatus VectorSetElementSimilarity(ArgSlice key, ArgSlice element, int count, float delta, int searchExplorationFactor, ArgSlice filter, int maxFilteringEffort, bool includeAttributes, ref SpanByteAndMemory outputIds, out VectorIdFormat outputIdFormat, ref SpanByteAndMemory outputDistances, ref SpanByteAndMemory outputAttributes, out VectorManagerResult result, ref SpanByteAndMemory filterBitmap)
 {
     garnetApi.WATCH(key, StoreType.Main);
-    return garnetApi.VectorSetElementSimilarity(key, element, count, delta, searchExplorationFactor, filter, maxFilteringEffort, includeAttributes, ref outputIds, out outputIdFormat, ref outputDistances, ref outputAttributes, out result);
+    return garnetApi.VectorSetElementSimilarity(key, element, count, delta, searchExplorationFactor, filter, maxFilteringEffort, includeAttributes, ref outputIds, out outputIdFormat, ref outputDistances, ref outputAttributes, out result, ref filterBitmap);
 }
 
 /// <inheritdoc/>

libs/server/API/IGarnetApi.cs

Lines changed: 2 additions & 2 deletions
@@ -2041,15 +2041,15 @@ public bool IterateObjectStore<TScanFunctions>(ref TScanFunctions scanFunctions,
 /// Ids are encoded in <paramref name="outputIds"/> as length prefixed blobs of bytes.
 /// Attributes are encoded in <paramref name="outputAttributes"/> as length prefixed blobs of bytes.
 /// </summary>
-GarnetStatus VectorSetValueSimilarity(ArgSlice key, VectorValueType valueType, ArgSlice value, int count, float delta, int searchExplorationFactor, ArgSlice filter, int maxFilteringEffort, bool includeAttributes, ref SpanByteAndMemory outputIds, out VectorIdFormat outputIdFormat, ref SpanByteAndMemory outputDistances, ref SpanByteAndMemory outputAttributes, out VectorManagerResult result);
+GarnetStatus VectorSetValueSimilarity(ArgSlice key, VectorValueType valueType, ArgSlice value, int count, float delta, int searchExplorationFactor, ArgSlice filter, int maxFilteringEffort, bool includeAttributes, ref SpanByteAndMemory outputIds, out VectorIdFormat outputIdFormat, ref SpanByteAndMemory outputDistances, ref SpanByteAndMemory outputAttributes, out VectorManagerResult result, ref SpanByteAndMemory filterBitmap);
 
 /// <summary>
 /// Perform a similarity search given an element already in the vector set and these parameters.
 ///
 /// Ids are encoded in <paramref name="outputIds"/> as length prefixed blobs of bytes.
 /// Attributes are encoded in <paramref name="outputAttributes"/> as length prefixed blobs of bytes.
 /// </summary>
-GarnetStatus VectorSetElementSimilarity(ArgSlice key, ArgSlice element, int count, float delta, int searchExplorationFactor, ArgSlice filter, int maxFilteringEffort, bool includeAttributes, ref SpanByteAndMemory outputIds, out VectorIdFormat outputIdFormat, ref SpanByteAndMemory outputDistances, ref SpanByteAndMemory outputAttributes, out VectorManagerResult result);
+GarnetStatus VectorSetElementSimilarity(ArgSlice key, ArgSlice element, int count, float delta, int searchExplorationFactor, ArgSlice filter, int maxFilteringEffort, bool includeAttributes, ref SpanByteAndMemory outputIds, out VectorIdFormat outputIdFormat, ref SpanByteAndMemory outputDistances, ref SpanByteAndMemory outputAttributes, out VectorManagerResult result, ref SpanByteAndMemory filterBitmap);
 
 /// <summary>
 /// Fetch the embedding of a given element in a Vector set.
