Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a significant performance improvement for recursive Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request refactors the path tracking mechanism in MATCH WHILE traversals by introducing an immutable cons-cell list, PathNode, to replace the previous ArrayList-based approach. This change aims to improve performance by making path appending an O(1) operation and enabling structural sharing. The PathNode is materialized into a List only when explicitly required. A review comment highlights a potential StackOverflowError in the recursive toList() implementation of PathNode for very deep traversal paths, suggesting an iterative approach for improved safety and performance.
core/src/main/java/com/jetbrains/youtrackdb/internal/core/sql/executor/match/PathNode.java
Show resolved
Hide resolved
Test Count Gate Results✅ No baseline available yet — gate skipped (first run). |
Coverage Gate ResultsThresholds: 85% line, 70% branch Line Coverage: ✅ 100.0% (10/10 lines)
Branch Coverage: ✅ 100.0% (4/4 branches)
|
5e89ac0 to
910deee
Compare
JMH LDBC Benchmark ComparisonBase: Single-Thread Results
Multi-Thread Results
Scalability (MT/ST ratio)
|
|
Hi @sandrawar, please profile regressions using asyncprofiler on Hetzner CCX 33 node and find out what caused the regressions. |
910deee to
60af410
Compare
PR Title:
YTDB-631: Replace ArrayList path copies with immutable cons-cell list in
MATCH WHILE traversal
Motivation:
During recursive MATCH WHILE traversals (e.g., IS2 query following REPLY_OF chains), the engine copies the entire path ArrayList at every recursion level:
For a chain of depth D, this produces O(D²) element copies (1+2+3+...+D). Profiling showed 3,527 ArrayList allocation samples attributed to this pattern. With 20 messages each traversing 5-hop Comment→Post chains, that's ~300 unnecessary ArrayList allocations per query.
This PR introduces PathNode — an immutable cons-cell record that shares structure with ancestor paths:
Appending is O(1) instead of O(depth). Paths sharing a common prefix reuse the same node chain in memory. Materialization to List (via toList()) is deferred and only happens when the user declares a pathAlias —
which IS2 does not, so for IS2 the list is never built at all.
Risk is low: the change is contained within MatchEdgeTraverser.executeTraversal() and the PathNode is a simple,
stateless record. All existing MATCH WHILE tests (including pathAlias and diamond-graph dedup scenarios) pass unchanged.