YTDB-635: Index ordered match #880

Open
sandrawar wants to merge 30 commits into lazy-recursive-stream from index-ordered-match

Conversation

@sandrawar
Collaborator

@sandrawar sandrawar commented Mar 27, 2026

design.md

PR Title:

YTDB-635: Index ordered match

Motivation:

MATCH queries with ORDER BY … LIMIT K on an edge target property currently load all edge targets into memory, sort them, and take the top K. For LDBC queries like IS2 (a Person's recent messages), this means loading all 500 messages to return the latest 20, so the cost is dominated by random I/O on records that are immediately discarded.

When a single-field index exists on the ORDER BY property, we can scan the index in sort order and use a bitmap filter (RidSet from the source's LinkBag) to skip non-matching entries at near-zero cost. For the IS2 case this reduces the work from 500 random record loads plus an in-memory sort to about 20 loads and no sort. For multi-field ORDER BY (e.g., IC2: creationDate DESC, messageId ASC), the index scan provides primary-key ordering, and a bounded heap with early termination handles the secondary sort. This applies to queries such as IC2, IC7, IC8, IC9, IS3, and IS7.
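A minimal sketch of the two ideas in the paragraph above, not the code in this PR. Everything here is an assumption made for illustration: the `Entry` type, the method names, and the use of a plain `Set<String>` in place of the RidSet bitmap. `topK` covers the single-field case (filtered scan that stops at K hits); `topKWithTieBreak` covers the multi-field case, where a heap of at most K entries resolves the secondary key and the scan stops once the primary key moves past the worst entry kept.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

public class IndexOrderedTopKSketch {

  /** Hypothetical index entry: primary sort key, secondary sort key, record id. */
  public static final class Entry {
    public final long primary;
    public final long secondary;
    public final String rid;

    public Entry(long primary, long secondary, String rid) {
      this.primary = primary;
      this.secondary = secondary;
      this.rid = rid;
    }
  }

  /** Single-field ORDER BY: the first k entries that pass the RID filter are the answer. */
  public static List<String> topK(
      Iterable<Entry> indexInSortOrder, Set<String> edgeTargets, int k) {
    List<String> rids = new ArrayList<>();
    if (k <= 0) {
      return rids;
    }
    for (Entry e : indexInSortOrder) {
      if (!edgeTargets.contains(e.rid)) {
        continue; // not an edge target of the source: skipped without loading the record
      }
      rids.add(e.rid);
      if (rids.size() == k) {
        break; // LIMIT reached, stop scanning
      }
    }
    return rids;
  }

  /**
   * Two-field ORDER BY: the scan is ordered by the primary key only, so a heap of
   * at most k entries resolves ties on the secondary key. fullOrder must sort the
   * primary key in the same direction as the scan.
   */
  public static List<String> topKWithTieBreak(
      Iterable<Entry> indexInPrimaryOrder, Set<String> edgeTargets, int k,
      Comparator<Entry> fullOrder) {
    List<String> rids = new ArrayList<>();
    if (k <= 0) {
      return rids;
    }
    // Max-heap on fullOrder: the head is the worst entry currently kept.
    PriorityQueue<Entry> heap = new PriorityQueue<>(fullOrder.reversed());
    for (Entry e : indexInPrimaryOrder) {
      if (!edgeTargets.contains(e.rid)) {
        continue;
      }
      if (heap.size() < k) {
        heap.add(e);
        continue;
      }
      Entry worst = heap.peek();
      if (e.primary != worst.primary) {
        break; // scan moved past the worst kept primary key: nothing later can qualify
      }
      if (fullOrder.compare(e, worst) < 0) {
        heap.poll(); // same primary key but better secondary key: replace the worst
        heap.add(e);
      }
    }
    List<Entry> kept = new ArrayList<>(heap);
    kept.sort(fullOrder);
    for (Entry e : kept) {
      rids.add(e.rid);
    }
    return rids;
  }
}
```

In both methods only the returned record ids would need to be loaded from storage, which is where the saving over load-all-and-sort comes from.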

The optimization also supports multi-source queries (multiple source vertices) with four execution modes depending on whether the source has a WHERE filter and whether the source alias appears in RETURN. A cost model compares index scan vs load-all-and-sort and falls back transparently when the index scan is not profitable.
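To make the cost comparison concrete, here is a hedged sketch of one way such a decision could be modelled. It is not the PR's IndexOrderedCostModel: the formulas, the parameter names, and the per-operation cost constants are all assumptions for illustration. It assumes edge targets are spread uniformly through the index, so finding K of them means scanning roughly K × (index size / target count) entries.

```java
public class IndexScanCostSketch {

  /** Estimated cost of scanning the index in order until k edge targets are found. */
  public static double indexScanCost(
      long indexEntries, long edgeTargets, int k, double entryCost, double loadCost) {
    if (edgeTargets <= 0 || k <= 0) {
      return 0.0; // nothing to return, so no scan is needed
    }
    long needed = Math.min(k, edgeTargets);
    // Assumption: targets are uniformly distributed across the index.
    double scanned = Math.min((double) indexEntries,
        (double) needed * indexEntries / edgeTargets);
    return scanned * entryCost + needed * loadCost;
  }

  /** Estimated cost of loading every edge target and sorting in memory. */
  public static double loadAllAndSortCost(
      long edgeTargets, double loadCost, double compareCost) {
    if (edgeTargets <= 0) {
      return 0.0;
    }
    double log2 = Math.log(Math.max(edgeTargets, 2)) / Math.log(2);
    return edgeTargets * loadCost + edgeTargets * log2 * compareCost;
  }

  /** True when the index-ordered plan is estimated to be cheaper. */
  public static boolean preferIndexScan(
      long indexEntries, long edgeTargets, int k,
      double entryCost, double loadCost, double compareCost) {
    return indexScanCost(indexEntries, edgeTargets, k, entryCost, loadCost)
        < loadAllAndSortCost(edgeTargets, loadCost, compareCost);
  }
}
```

With made-up costs of 0.01 per index entry, 1.0 per record load, and 0.001 per comparison, 500 edge targets in a 1,000,000-entry index with K = 20 give about 420 for the index scan against about 504 for load-all-and-sort, so the index plan wins. With a 100,000,000-entry index the scan estimate grows to about 40,020 and the fallback is chosen instead.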

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces an index-ordered MATCH traversal optimization to improve query performance when results are sorted by a property on a target vertex. Key additions include IndexOrderedEdgeStep for optimized edge traversal, a cost-based heuristic model (IndexOrderedCostModel) to choose between index scans and in-memory sorting, and RidFilteredIndexValuesStep for efficient filtered index scans. The MatchExecutionPlanner was updated to detect these optimization opportunities and suppress or optimize OrderByStep accordingly. Review feedback suggests extracting a helper method for entity loading to reduce code duplication and replacing fragile string-based AST inspection with direct structure checks in the planner.

@github-actions

github-actions bot commented Mar 27, 2026

Test Count Gate Results

✅ No baseline available yet — gate skipped (first run).

@github-actions

github-actions bot commented Mar 27, 2026

Coverage Gate Results

Thresholds: 85% line, 70% branch

Line Coverage: ✅ 86.5% (688/795 lines)

| File | Coverage | Uncovered Lines |
|---|---|---|
| core/src/main/java/com/jetbrains/youtrackdb/api/config/GlobalConfiguration.java | ✅ 100.0% (12/12) | - |
| core/src/main/java/com/jetbrains/youtrackdb/internal/core/sql/executor/OrderByStep.java | ✅ 87.0% (20/23) | 139, 295, 297 |
| core/src/main/java/com/jetbrains/youtrackdb/internal/core/sql/executor/RidFilteredIndexValuesStep.java | ❌ 77.1% (27/35) | 55, 88, 97-101, 103 |
| core/src/main/java/com/jetbrains/youtrackdb/internal/core/sql/executor/match/IndexOrderedCostModel.java | ✅ 97.0% (64/66) | 61, 167 |
| core/src/main/java/com/jetbrains/youtrackdb/internal/core/sql/executor/match/IndexOrderedEdgeStep.java | ❌ 83.0% (331/399) | 186, 192, 233, 262, 280-281, 322-323, 342, 367, 394, 424, 436, 452, 478, 491, 523, 527-533, 535-541, 544, 546-547, 549, 552-553, 557-558, 560, 564-565, 603, 677, 688, 743, 747, 757, 761, 779-780, 782, 797, 830, 834-836, 844, 864, 867-869, 894, 898, 901-902, 926, 928 |
| core/src/main/java/com/jetbrains/youtrackdb/internal/core/sql/executor/match/MatchExecutionPlanner.java | ✅ 90.4% (227/251) | 614, 616, 2254, 2258, 2271, 2299, 2320, 2332, 2354, 2364, 2370, 2421, 2439, 2445, 2498, 2512, 2542, 2552, 2556, 2622, 2641, 2692, 2699-2700 |
| core/src/main/java/com/jetbrains/youtrackdb/internal/core/sql/executor/resultset/ExecutionStream.java | ❌ 77.8% (7/9) | 102, 107 |

Branch Coverage: ✅ 70.6% (377/534 branches)

| File | Coverage | Lines with Uncovered Branches |
|---|---|---|
| core/src/main/java/com/jetbrains/youtrackdb/internal/core/sql/executor/OrderByStep.java | ✅ 75.0% (15/20) | 137, 212, 214, 218, 294 |
| core/src/main/java/com/jetbrains/youtrackdb/internal/core/sql/executor/RidFilteredIndexValuesStep.java | ❌ 40.0% (4/10) | 54, 87, 97-98 |
| core/src/main/java/com/jetbrains/youtrackdb/internal/core/sql/executor/match/IndexOrderedCostModel.java | ✅ 88.2% (30/34) | 60, 65, 166, 184 |
| core/src/main/java/com/jetbrains/youtrackdb/internal/core/sql/executor/match/IndexOrderedEdgeStep.java | ❌ 68.4% (141/206) | 152, 185, 191, 208, 210, 222, 254, 261, 278, 319, 321, 328, 341, 349, 366, 393, 399, 423, 435, 451, 477, 490, 531-532, 536-537, 540, 546, 552, 602, 609, 613, 676, 687, 691, 742, 744, 756, 760, 768, 773, 779, 796, 805, 829, 843, 863, 879, 891, 893, 897, 900-901 |
| core/src/main/java/com/jetbrains/youtrackdb/internal/core/sql/executor/match/MatchExecutionPlanner.java | ✅ 70.6% (185/262) | 613-614, 620, 661, 2253, 2257, 2270, 2277, 2290-2291, 2298, 2304, 2312, 2319, 2329, 2332, 2335, 2353, 2363, 2369, 2374, 2420, 2438, 2444, 2453, 2455, 2463, 2468, 2497, 2508, 2517, 2541, 2545-2547, 2551, 2555, 2591, 2593, 2614, 2618-2619, 2640, 2649, 2653-2654, 2663, 2669, 2689-2691, 2696, 2698-2699 |
| core/src/main/java/com/jetbrains/youtrackdb/internal/core/sql/executor/resultset/ExecutionStream.java | ✅ 100.0% (2/2) | - |

@andrii0lomakin
Collaborator

@sandrawar why do we have -301 tests on this branch? That is a huge decrease.

@sandrawar force-pushed the lazy-recursive-stream branch from d4507f8 to 0d6c1f8 on March 30, 2026 06:37
@sandrawar force-pushed the index-ordered-match branch from d048143 to 2fc7b6c on March 30, 2026 06:46
@sandrawar
Collaborator Author

> @sandrawar why do we have -301 tests on this branch? That is a huge decrease.

That was an error in the test count; it was fixed by rebasing onto develop.

@sandrawar force-pushed the lazy-recursive-stream branch from 0d6c1f8 to ca6ccc6 on April 2, 2026 06:54
@sandrawar force-pushed the index-ordered-match branch from 6e4849a to 5f165ff on April 2, 2026 06:55
@github-actions

github-actions bot commented Apr 2, 2026

JMH LDBC Benchmark Comparison

Base: 244d8c7275 (fork-point with develop) | Head: 393e055914
Summary: 🔴 6 regression(s), 🟢 4 improvement(s) (>±5% threshold)

Single-Thread Results

| Benchmark | Base ops/s | Base err | Head ops/s | Head err | Δ% |
|---|---|---|---|---|---|
| ic10_friendRecommendation | 0.146 | ±4.8% | 0.143 | ±5.1% | -1.6% |
| ic11_jobReferral | 39.5 | ±1.1% | 34.1 | ±1.9% | -13.5% 🔴 |
| ic12_expertSearch | 23.9 | ±2.2% | 24.7 | ±2.2% | +3.2% |
| ic13_shortestPath | 4,452 | ±2.6% | 4,426 | ±2.7% | -0.6% |
| ic1_transitiveFriends | 42.1 | ±1.3% | 41.0 | ±0.9% | -2.5% |
| ic2_recentFriendMessages | 239.9 | ±1.9% | 233.2 | ±1.7% | -2.8% |
| ic3_friendsInCountries | 0.173 | ±1.8% | 0.175 | ±2.9% | +1.3% |
| ic4_newTopics | 5.8 | ±6.8% | 3.2 | ±9.2% | -44.8% 🔴 |
| ic5_newGroups | 0.094 | ±25.7% | 0.095 | ±24.2% | +0.2% |
| ic6_tagCoOccurrence | 3.9 | ±3.0% | 3.9 | ±2.9% | +1.1% |
| ic7_recentLikers | 65.2 | ±1.8% | 60.8 | ±1.6% | -6.7% 🔴 |
| ic8_recentReplies | 978.4 | ±0.9% | 962.7 | ±0.8% | -1.6% |
| ic9_recentFofMessages | 1.3 | ±1.7% | 1.3 | ±1.6% | -0.9% |
| is1_personProfile | 56,532 | ±1.8% | 56,625 | ±1.4% | +0.2% |
| is2_personPosts | 580.9 | ±1.2% | 565.0 | ±0.8% | -2.7% |
| is3_personFriends | 15,925 | ±1.6% | 15,774 | ±2.1% | -1.0% |
| is4_messageContent | 78,337 | ±1.1% | 78,140 | ±0.9% | -0.3% |
| is5_messageCreator | 70,001 | ±1.3% | 70,918 | ±2.0% | +1.3% |
| is6_messageForum | 48,487 | ±1.5% | 49,166 | ±1.7% | +1.4% |
| is7_messageReplies | 2,803 | ±1.1% | 3,270 | ±1.3% | +16.6% 🟢 |

Multi-Thread Results

| Benchmark | Base ops/s | Base err | Head ops/s | Head err | Δ% |
|---|---|---|---|---|---|
| ic10_friendRecommendation | 0.652 | ±2.0% | 0.631 | ±1.9% | -3.2% |
| ic11_jobReferral | 183.7 | ±7.5% | 171.6 | ±2.5% | -6.6% 🔴 |
| ic12_expertSearch | 131.3 | ±0.8% | 132.6 | ±1.0% | +1.0% |
| ic13_shortestPath | 22,361 | ±2.7% | 22,713 | ±3.4% | +1.6% |
| ic1_transitiveFriends | 214.6 | ±0.7% | 210.7 | ±0.8% | -1.8% |
| ic2_recentFriendMessages | 1,170 | ±1.7% | 1,136 | ±1.8% | -2.9% |
| ic3_friendsInCountries | 0.727 | ±2.0% | 0.772 | ±2.3% | +6.2% 🟢 |
| ic4_newTopics | 23.9 | ±2.2% | 12.4 | ±3.1% | -48.1% 🔴 |
| ic5_newGroups | 0.411 | ±4.7% | 0.439 | ±1.9% | +6.9% 🟢 |
| ic6_tagCoOccurrence | 19.2 | ±2.0% | 19.5 | ±1.3% | +1.2% |
| ic7_recentLikers | 307.6 | ±1.9% | 298.9 | ±2.3% | -2.8% |
| ic8_recentReplies | 5,154 | ±1.4% | 5,064 | ±1.0% | -1.8% |
| ic9_recentFofMessages | 6.9 | ±2.6% | 7.0 | ±2.2% | +1.4% |
| is1_personProfile | 264,802 | ±2.2% | 263,953 | ±2.6% | -0.3% |
| is2_personPosts | 3,013 | ±1.3% | 2,856 | ±0.7% | -5.2% 🔴 |
| is3_personFriends | 80,782 | ±3.0% | 79,287 | ±3.1% | -1.9% |
| is4_messageContent | 364,028 | ±1.3% | 366,461 | ±1.2% | +0.7% |
| is5_messageCreator | 329,919 | ±1.5% | 329,361 | ±2.7% | -0.2% |
| is6_messageForum | 228,920 | ±1.3% | 228,420 | ±1.5% | -0.2% |
| is7_messageReplies | 14,192 | ±1.0% | 17,042 | ±0.9% | +20.1% 🟢 |

Scalability (MT/ST ratio)

| Benchmark | Base ratio | Head ratio | Δ% |
|---|---|---|---|
| ic10_friendRecommendation | 4.48x | 4.41x | -1.7% |
| ic11_jobReferral | 4.66x | 5.03x | +8.0% |
| ic12_expertSearch | 5.49x | 5.37x | -2.2% |
| ic13_shortestPath | 5.02x | 5.13x | +2.2% |
| ic1_transitiveFriends | 5.10x | 5.13x | +0.7% |
| ic2_recentFriendMessages | 4.87x | 4.87x | -0.1% |
| ic3_friendsInCountries | 4.19x | 4.40x | +4.9% |
| ic4_newTopics | 4.15x | 3.90x | -6.0% |
| ic5_newGroups | 4.35x | 4.64x | +6.7% |
| ic6_tagCoOccurrence | 4.96x | 4.97x | +0.1% |
| ic7_recentLikers | 4.72x | 4.91x | +4.2% |
| ic8_recentReplies | 5.27x | 5.26x | -0.1% |
| ic9_recentFofMessages | 5.27x | 5.40x | +2.3% |
| is1_personProfile | 4.68x | 4.66x | -0.5% |
| is2_personPosts | 5.19x | 5.06x | -2.5% |
| is3_personFriends | 5.07x | 5.03x | -0.9% |
| is4_messageContent | 4.65x | 4.69x | +0.9% |
| is5_messageCreator | 4.71x | 4.64x | -1.5% |
| is6_messageForum | 4.72x | 4.65x | -1.6% |
| is7_messageReplies | 5.06x | 5.21x | +2.9% |
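For readers checking the figures in this report, the sketch below shows how the Δ% column, the 🔴/🟢 flags, and the MT/ST ratio appear to be derived. This is an assumption inferred from the published numbers, not the benchmark bot's actual code, and the class and method names are hypothetical. Recomputing from the rounded values shown in the report can differ from the printed Δ% by a few tenths of a percent.

```java
public class BenchmarkDeltaSketch {

  /** Relative change of head against base, in percent. */
  public static double deltaPercent(double base, double head) {
    return (head - base) / base * 100.0;
  }

  /** Flags a change only when it exceeds the threshold in either direction. */
  public static String classify(double base, double head, double thresholdPercent) {
    double delta = deltaPercent(base, head);
    if (delta < -thresholdPercent) {
      return "regression";
    }
    if (delta > thresholdPercent) {
      return "improvement";
    }
    return "within threshold";
  }

  /** Scalability ratio: multi-thread throughput divided by single-thread throughput. */
  public static double scalability(double multiThreadOps, double singleThreadOps) {
    return multiThreadOps / singleThreadOps;
  }
}
```

For example, the single-thread ic4_newTopics row (5.8 to 3.2 ops/s) falls below the -5% threshold and is flagged as a regression, while ic2_recentFriendMessages (239.9 to 233.2 ops/s) stays within it.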

@andrii0lomakin
Collaborator

Hi @sandrawar, please profile the regressions using async-profiler on a Hetzner CCX33 node and find out what caused them.

@sandrawar force-pushed the lazy-recursive-stream branch from ca6ccc6 to 083ce48 on April 3, 2026 11:57
@sandrawar force-pushed the index-ordered-match branch from 5f165ff to 0aadc7f on April 3, 2026 12:12
@sandrawar force-pushed the index-ordered-match branch from 0aadc7f to 7289036 on April 3, 2026 12:30
@sandrawar force-pushed the index-ordered-match branch from 42135e6 to 393e055 on April 3, 2026 13:43