[codex] Optimize primitive arg layout fast paths #845
Draft
MilkBlock wants to merge 3 commits into egraphs-good:main from
Merging this PR will not alter performance
Codecov Report

Coverage diff, main vs. #845:

| | main | #845 | +/- |
|---|---|---|---|
| Coverage | 81.54% | 86.29% | +4.74% |
| Files | 88 | 85 | -3 |
| Lines | 24112 | 23373 | -739 |
| Hits | 19662 | 20169 | +507 |
| Misses | 4450 | 3204 | -1246 |
This PR combines two tightly-related pieces needed to make the primitive-argument layout optimization meaningful on real workloads.
- Runtime transpose lane in `core-relations` for the single-`External` shape: when an action batch executes exactly one external call whose arguments are a consecutive variable window in the current struct-of-arrays bindings layout, the runtime batch-transposes once into a row-major scratch buffer and then invokes the external function row-by-row from that transposed block.
- Query-entry normalization in `core-relations::RuleBuilder::build_with_description()`: for eligible single-external rules, external argument vars are normalized into a consecutive core-variable window while preserving their semantic order. This increases the hit rate of the runtime fast path in realistic workloads instead of relying on lucky pre-existing variable numbering.

The optimization is still intentionally scoped:
Focused validation performed:
- `cargo test -q -p egglog-core-relations run_instrs_single_external_borrowed_window_executes -- --nocapture`
- `cargo test -q -p egglog-core-relations single_external_rule_normalizes_gap_vars_into_consecutive_window -- --nocapture`
- `cargo test -q -p egglog-bridge constrain_prims -- --nocapture`
- `cargo check -q`

Downstream workload validation:
`math_microbenchmark` before/after on the corresponding downstream dependency setup.

- rewrites run_ruleset: `1.605385708 s` -> `1.503743250 s` (~6.3% faster)
- `3.005323708 s` -> `2.813726958 s` (~6.4% faster)

That is the reason these two changes are folded together here: the runtime transpose lane alone is too narrow to justify a broader performance claim, while the query-entry normalization makes the same lane actually trigger in the real single-external rule shapes we care about.
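For reviewers who want a concrete picture of how the two pieces interact, here is a minimal, self-contained sketch. It is not the code in this PR: `Bindings`, `consecutive_window`, `normalize_external_args`, and `run_external_transposed` are hypothetical names, values are plain `u64`, and placing the window at index 0 is an assumption made only for illustration. The sketch renumbers the external's argument variables into a consecutive window, then transposes that window once into a row-major scratch buffer and calls the external function on each row.

```rust
/// Hypothetical stand-in for a struct-of-arrays bindings layout:
/// `columns[v][r]` holds the value of variable `v` in row `r`.
pub struct Bindings {
    pub columns: Vec<Vec<u64>>,
}

/// Returns `Some(start)` when `args` is exactly the consecutive window
/// `start, start + 1, ..., start + args.len() - 1`, and `None` otherwise.
pub fn consecutive_window(args: &[usize]) -> Option<usize> {
    let start = *args.first()?;
    args.iter()
        .enumerate()
        .all(|(i, &v)| v == start + i)
        .then_some(start)
}

/// Renumbers variables so that `external_args` occupy the window
/// `0..external_args.len()` in their original argument order. All other
/// variables keep their relative order after the window.
/// Returns `remap[old] = new`, or `None` if an argument repeats
/// (assumed ineligible in this sketch).
pub fn normalize_external_args(num_vars: usize, external_args: &[usize]) -> Option<Vec<usize>> {
    let mut remap = vec![usize::MAX; num_vars];
    let mut next = 0;
    for &v in external_args {
        if remap[v] != usize::MAX {
            return None;
        }
        remap[v] = next;
        next += 1;
    }
    for slot in remap.iter_mut() {
        if *slot == usize::MAX {
            *slot = next;
            next += 1;
        }
    }
    Some(remap)
}

/// Transposes the column window `[start, start + arity)` once into a
/// row-major scratch buffer, then calls `external` on each row slice.
/// Assumes every column in the window has the same length.
pub fn run_external_transposed<F>(
    bindings: &Bindings,
    start: usize,
    arity: usize,
    mut external: F,
) -> Vec<u64>
where
    F: FnMut(&[u64]) -> u64,
{
    if arity == 0 {
        return Vec::new();
    }
    let window = &bindings.columns[start..start + arity];
    let rows = window[0].len();
    assert!(window.iter().all(|col| col.len() == rows));

    // One batch transpose: column-major window -> row-major scratch.
    let mut scratch = vec![0u64; rows * arity];
    for (c, col) in window.iter().enumerate() {
        for (r, &value) in col.iter().enumerate() {
            scratch[r * arity + c] = value;
        }
    }

    // Each row is now a contiguous argument slice for the external call.
    scratch.chunks_exact(arity).map(|row| external(row)).collect()
}
```

In this sketch, a rule whose external takes variables `[3, 1]` out of four would not pass `consecutive_window` as written, but after `normalize_external_args(4, &[3, 1])` the same arguments map to `[0, 1]` and the transposed lane applies. That is the interaction the description above relies on.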