Rust rule benchmarking tableaction cache by MilkBlock · Pull Request #830 · egraphs-good/egglog

MilkBlock · 2026-03-17T09:55:15Z

cache TableAction for every RustRuleContext so that we could avoid copy every rule run.

…ion-cache

MilkBlock · 2026-03-17T09:56:08Z

Please review it after rust_rule_benchmarking merge.

codecov-commenter · 2026-03-17T10:02:46Z

Codecov Report

❌ Patch coverage is 68.53933% with 28 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.99%. Comparing base (f6cff9b) to head (7de2394).
⚠️ Report is 3 commits behind head on main.

Files with missing lines	Patch %	Lines
src/prelude.rs	68.53%	28 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #830      +/-   ##
==========================================
- Coverage   82.07%   81.99%   -0.08%     
==========================================
  Files          88       88              
  Lines       24709    24780      +71     
==========================================
+ Hits        20279    20319      +40     
- Misses       4430     4461      +31

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

codspeed-hq · 2026-03-17T10:08:44Z

Merging this PR will improve performance by ×21

⚠️

Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 7 improved benchmarks
✅ 31 untouched benchmarks
⏩ 190 skipped benchmarks¹

Performance Changes

	Mode	Benchmark	`BASE`	`HEAD`	Efficiency
⚡	WallTime	`rust_rule_match_overhead[rule_run_1]`	21 ms	14.6 ms	+43.11%
⚡	WallTime	`rust_rule_tableaction_hot_path[facts50000_funcs200]`	627.9 ms	34.2 ms	×18
⚡	WallTime	`rust_rule_match_with_serialize[rule_run_1]`	303.7 ms	286.4 ms	+6.03%
⚡	WallTime	`rust_rule_insert_loop[ops1000_funcs200]`	474.1 µs	411.3 µs	+15.28%
⚡	Simulation	`rust_rule_tableaction_hot_path[facts50000_funcs200]`	872.8 ms	41.6 ms	×21
⚡	Simulation	`rust_rule_match_overhead[rule_run_1]`	32.4 ms	23.2 ms	+39.23%
⚡	Simulation	`rust_rule_insert_loop[ops1000_funcs200]`	713 µs	623.7 µs	+14.32%

_{Comparing MilkBlock:rust-rule-benchmarking-tableaction-cache (7de2394) with main (f6cff9b)}

190 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩

…arking-tableaction-cache' into rust-rule-benchmarking-tableaction-cache

…ion-cache

MilkBlock · 2026-03-18T03:28:15Z

@oflatt

Copilot

Pull request overview

This PR optimizes Rust-backed rule execution by avoiding per-match cloning of all TableActions, instead caching shared TableActions by backend FunctionId and cloning only when mutation-scratch is required.

Changes:

Replace HashMap<String, TableAction> cloning per apply with an Arc<Vec<Option<TableAction>>> indexed by FunctionId.
Add internal name→FunctionId mapping and expose new lookup_id/insert_id/remove_id/subsume_id APIs.
Introduce a per-callback (per-match) lazy cache for mutable TableAction clones used by insert/subsume.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-29T23:42:50Z

+pub struct RustRuleContext<'a, 'b, 'c> {
    exec_state: &'a mut ExecutionState<'b>,
    union_action: egglog_bridge::UnionAction,
-    table_actions: HashMap<String, egglog_bridge::TableAction>,
+    /// Name -> backend FunctionId (captured at rust-rule registration time).
+    func_ids: &'c HashMap<String, egglog_bridge::FunctionId>,
+    /// Table actions indexed by backend FunctionId (captured at rust-rule registration time).
+    table_actions_by_id: &'c [Option<egglog_bridge::TableAction>],
+    /// Per-match table actions, lazily populated from `table_actions_by_id` (for scratch reuse).
+    table_actions_cache_by_id: Vec<(egglog_bridge::FunctionId, egglog_bridge::TableAction)>,


RustRuleContext is a public type and this change adds a new lifetime parameter ('c). That’s a breaking API change for downstream crates that spell the type as RustRuleContext<'a, 'b>. Consider avoiding the extra lifetime by tying func_ids/table_actions_by_id to the existing borrow lifetime (e.g. use the same lifetime as exec_state) or by storing Arc<_> in the context instead of references.

Copilot · 2026-03-29T23:42:50Z

    /// Do a table lookup. This is potentially a mutable operation!
    /// For more information, see `egglog_bridge::TableAction::lookup`.
    pub fn lookup(&mut self, table: &str, key: &[Value]) -> Option<Value> {
-        self.get_table_action(table).lookup(self.exec_state, key)
+        let table_id = self.func_id(table);
+        self.lookup_id(table_id, key)
+    }
+
+    /// Do a table lookup by cached backend FunctionId.
+    /// For more information, see `egglog_bridge::TableAction::lookup`.
+    pub fn lookup_id(&mut self, table: egglog_bridge::FunctionId, key: &[Value]) -> Option<Value> {
+        let RustRuleContext {
+            exec_state,
+            table_actions_by_id,
+            ..
+        } = self;
+        Self::get_table_action_by_id(table_actions_by_id, table).lookup(exec_state, key)
    }


The new public *_id methods take egglog_bridge::FunctionId, but the Rust API doesn’t appear to expose any way for users to obtain/cache a FunctionId (e.g., Function.backend_id is private and func_id here is private). Either expose a stable way to get the FunctionId (from RustRuleContext and/or EGraph) or keep these *_id methods non-public to avoid adding an effectively unusable public API surface.

Copilot · 2026-03-29T23:42:50Z

+        table: egglog_bridge::FunctionId,
+    ) -> &egglog_bridge::TableAction {
+        table_actions_by_id
+            .get(table.rep() as usize)


Indexing the per-function vector via table.rep() as usize works for the current define_id! implementation, but NumericId already provides index() specifically for dense indexing. Using table.index() here (and in other indexing sites) makes the intent clearer and avoids coupling this code to Rep being a small integer type.

Suggested change

.get(table.rep() as usize)

.get(table.index())

Copilot · 2026-03-29T23:42:51Z

+    let mut table_actions_by_id: Vec<Option<egglog_bridge::TableAction>> = Vec::new();
+    for (k, v) in egraph.functions.iter() {
+        func_ids.insert(k.clone(), v.backend_id);
+        let idx = v.backend_id.rep() as usize;


Same as above: v.backend_id.rep() as usize is being used as a dense index. Prefer v.backend_id.index() (from NumericId) for clarity and to avoid relying on Rep layout.

Suggested change

let idx = v.backend_id.rep() as usize;

let idx = v.backend_id.index();

yihozhang · 2026-03-29T23:45:18Z

    let prim_name = egraph.parser.symbol_gen.fresh("rust_rule_prim");
    let panic_id = egraph.backend.new_panic(format!("{prim_name}_panic"));
+
+    let mut func_ids: HashMap<String, egglog_bridge::FunctionId> = Default::default();


HashMap::with_capacity?

yihozhang · 2026-03-29T23:47:04Z

            exec_state,
            union_action: self.union_action,
-            table_actions: self.table_actions.clone(),
+            func_ids: self.func_ids.as_ref(),


Do func_ids and table_actions_by_id need to be Arc-wrapped, if they are passed by reference?

Could you just keep the original table_actions but pass by reference here (instead of self.table_actions.clone()), and not have the indirections with func_ids, table_actions_by_id, and table_actions_cache_by_id? I'm not quite convinced that these new structs provide speedup

yihozhang · 2026-03-29T23:51:06Z

+
+    let mut func_ids: HashMap<String, egglog_bridge::FunctionId> = Default::default();
+    func_ids.reserve(egraph.functions.len());
+    let mut table_actions_by_id: Vec<Option<egglog_bridge::TableAction>> = Vec::new();


nit: reserve the vec once (by scanning egraph.functions) instead of repeatedly resizing

yihozhang · 2026-03-29T23:58:22Z

+            .unwrap_or_else(|| panic!("missing table action for FunctionId({})", table.rep()))
+    }
+
+    fn get_table_action_cache_by_id<'a>(


I don't understand this function? Is this caching the lookups to a hashmap with a (Id, TableAction) vector, because supposedly looking up the vector is faster than a hashmap?

MilkBlock · 2026-04-04T18:20:28Z

    /// Insert a row into a table.
    /// For more information, see `egglog_bridge::TableAction::insert`.
    pub fn insert(&mut self, table: &str, row: impl Iterator<Item = Value>) {
-        self.get_table_action(table).insert(self.exec_state, row)


This get_table_action call is expensive. Because it clones the TableAction every call. But this is redundant because though insert requires mutable borrow of TableAction but its so-called mutable borrow only change the capacity of scratch (because scratch will be cleared every call and other part will not change after first set).

So it should be cached to improve the performance.

pub struct TableAction { table: TableId, // not change after first set table_math: SchemaMath, // not change after first set default: Option<MergeVal>, // not change after first set timestamp: CounterId, // not change after first set scratch: Vec<Value>, // in fact scratch is a tmp variable will be `clear` in every TableOperation }

There are only 4 kinds of TableOperations need TableAction : look_up, insert, delete, subsume.

look_up & remove require & borrow
insert & delete require &mut borrow but it's fake(only touch capacity of scratch).

@yihozhang Basing on this observation, we can safely degrade the signature of insert and subsume from &mut TableAction to & TableAction. So I can make this PR more readable.

MineralSteins and others added 4 commits March 17, 2026 03:31

perf(rust_rule): cache FunctionId/TableAction at registration

f38f176

nit

c6701da

bench(rust_rule): restore match_overhead benches

1041d90

Merge branch 'egraphs-good:main' into rust-rule-benchmarking-tableact…

16695db

…ion-cache

MilkBlock requested a review from a team as a code owner March 17, 2026 09:55

MilkBlock requested review from saulshanabrook and removed request for a team March 17, 2026 09:55

add bench to build.yml

c420691

MineralSteins and others added 2 commits March 17, 2026 18:14

Merge remote-tracking branch 'refs/remotes/milkblock/rust-rule-benchm…

0903aad

…arking-tableaction-cache' into rust-rule-benchmarking-tableaction-cache

Merge branch 'egraphs-good:main' into rust-rule-benchmarking-tableact…

7de2394

…ion-cache

yihozhang requested a review from Copilot March 29, 2026 23:38

Copilot started reviewing on behalf of yihozhang March 29, 2026 23:38 View session

Copilot AI reviewed Mar 29, 2026

View reviewed changes

yihozhang requested changes Mar 30, 2026

View reviewed changes

MilkBlock commented Apr 4, 2026

View reviewed changes

MilkBlock marked this pull request as draft April 4, 2026 18:59

	let idx = v.backend_id.rep() as usize;
	let idx = v.backend_id.index();

Conversation

MilkBlock commented Mar 17, 2026

Uh oh!

MilkBlock commented Mar 17, 2026

Uh oh!

codecov-commenter commented Mar 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

codspeed-hq Bot commented Mar 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Merging this PR will improve performance by ×21

Performance Changes

Footnotes

Uh oh!

MilkBlock commented Mar 18, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

yihozhang Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

yihozhang Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

yihozhang Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

yihozhang Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

yihozhang Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

MilkBlock Apr 4, 2026

Choose a reason for hiding this comment

Uh oh!

MilkBlock Apr 4, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

codecov-commenter commented Mar 17, 2026 •

edited

Loading

codspeed-hq Bot commented Mar 17, 2026 •

edited

Loading