Use A* search for less overall work by ebblake · Pull Request #74 · maneatingape/advent-of-code-rust

ebblake · 2026-05-21T13:59:08Z

Description

The search can be steered towards the global minimum by pre-computing a heuristic of the absolute minimum possible distance that each key can contribute, then updating that heuristic with a subtraction per key visited, making it a nice lightweight O(1) heuristic. By the time we reach the goal of zero keys remaining, the heuristic also reaches zero, so it is consistent and we do not have to worry about revisiting a node with a lower distance from an alternative path. Then, using an O(n) heuristic for part 2 (but not part 1) got even more speed by pruning much more of the search space.
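A minimal sketch of the O(1) heuristic described above, assuming a symmetric table of key-to-key distances. The names `minimum_legs`, `heuristic`, and `collect` are hypothetical and are not taken from the patch; a distance of 0 stands for "same key or unreachable".

```rust
/// For each key, the smallest positive distance to any other key.
/// `dist[a][b]` is a hypothetical stand-in for the PR's distance matrix.
fn minimum_legs(dist: &[Vec<u32>]) -> Vec<u32> {
    dist.iter()
        .map(|row| row.iter().copied().filter(|&d| d > 0).min().unwrap_or(0))
        .collect()
}

/// Heuristic for a set of remaining keys (bit `i` set means key `i` is still needed).
/// This is the full O(n) sum, only needed once for the starting state.
fn heuristic(minimum: &[u32], remaining: u32) -> u32 {
    minimum
        .iter()
        .enumerate()
        .filter(|&(i, _)| remaining & (1 << i) != 0)
        .map(|(_, &m)| m)
        .sum()
}

/// O(1) update when key `to` is collected: subtract that key's contribution.
fn collect(heuristic: u32, minimum: &[u32], to: usize) -> u32 {
    heuristic - minimum[to]
}
```

Under these assumptions, stepping onto key `to` costs at least `minimum[to]`, which is exactly how much the estimate drops, so the estimate never falls faster than the real distance grows.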

On my laptop, this improves runtimes dramatically, even if the numbers are a bit noisy. A more direct analysis, instrumenting the number of cache accesses and work items processed on my input file, shows that part 2 sees the bigger benefit:

                        cache.entry()  todo.push()  todo.pop() runtime
    part 1 pre-series    101494         23195        22762      5.2ms
    part 1 post-series    65546         23019        21410      4.6ms
    part 2 pre-series    120813         33238        31437     12.1ms
    part 2 post-series    33577         13826         6689      2.3ms

Type of change

Performance improvement
Bug fix
Other

Checklist

Pull request title and commit messages are clear and informative.
Documentation has been updated if necessary.
Code style matches the existing code. This one is somewhat subjective, but try to "fit in" by
using the same naming conventions. Code should be portable, avoiding any
architecture-specific intrinsics.
Tests pass cargo test
Code is formatted cargo fmt -- `find . -name "*.rs"`
Code is linted cargo clippy --all-targets --all-features

Formatting and linting also can be executed by running just
(if installed) on the command line at the project root.

ebblake · 2026-05-21T14:02:10Z

just gave me a weird warning that seemed unrelated to my original patch, hence my s/maze/matrix/ to silence it:

cargo fmt -- `find . -name "*.rs"`
cargo clippy --all-targets --all-features
warning: field name starts with the struct's name
  --> src/year2019/day18.rs:80:5
   |
80 |     maze: [[Door; 30]; 30],
   |     ^^^^^^^^^^^^^^^^^^^^^^
   |
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#struct_field_names
   = note: requested on the command line with `-W clippy::struct-field-names`

ebblake · 2026-05-21T15:53:41Z

Here's an alternative O(n) heuristic. It made part1 almost twice as long (the number of keys reachable from one bot is high enough that a lot of time is being spent on computing the max leg remaining), but part2 noticeably faster (with four bots, the effort to find max leg is spread out more evenly, and max leg rather than sum of min legs steers better, even though it requires more effort to compute). Maybe a hybrid is worthwhile? (No heuristic for part1, O(n) heuristic for part 2)

From 84c924e844b0da501c57c7c3aafb41e46b05b25d Mon Sep 17 00:00:00 2001
From: Eric Blake <eblake@redhat.com>
Date: Thu, 21 May 2026 10:37:28 -0500
Subject: [PATCH] tmp
Content-type: text/plain

---
 src/year2019/day18.rs | 46 ++++++++++++++++++++++++++-----------------
 1 file changed, 28 insertions(+), 18 deletions(-)

diff --git a/src/year2019/day18.rs b/src/year2019/day18.rs
index 606c2f7..536b932 100644
--- a/src/year2019/day18.rs
+++ b/src/year2019/day18.rs
@@ -72,16 +72,16 @@ struct Door {
     needed: u32,
 }

+type Matrix = [[Door; 30]; 30];
+
 /// `initial` is the complete set of keys that we need to collect. Will always be binary
 /// `11111111111111111111111111` for the real input but fewer for sample data.
 ///
 /// `matrix` is the adjacency of distances and doors between each pair of keys and the robots
 /// starting locations.
-/// `minimum` is the smallest distance from a key to any of its neighbors, for the A* heuristic.
 struct Maze {
     initial: State,
-    matrix: [[Door; 30]; 30],
-    minimum: [u32; 26],
+    matrix: Matrix,
 }

 pub fn parse(input: &str) -> Grid<u8> {
@@ -181,25 +181,19 @@ fn parse_maze(width: usize, bytes: &[u8]) -> Maze {
         }
     }

-    let mut minimum = [0; 26];
-    for i in initial.remaining.biterator() {
-        minimum[i] =
-            matrix[i].iter().map(|d| d.distance).filter(|&dist| dist > 0).min().unwrap_or(0);
-    }
-
-    Maze { initial, matrix, minimum }
+    Maze { initial, matrix }
 }

 fn explore(width: usize, bytes: &[u8]) -> u32 {
     let mut todo = MinHeap::with_capacity(5_000);
     let mut cache = FastMap::with_capacity(5_000);

-    let Maze { initial, matrix, minimum } = parse_maze(width, bytes);
-    let heuristic = minimum.iter().sum();
-    todo.push(heuristic, (initial, heuristic));
+    let Maze { initial, matrix } = parse_maze(width, bytes);
+    let heur = heuristic(initial, &matrix);
+    todo.push(heur, (initial, heur));

-    while let Some((guess, (State { position, remaining }, heuristic))) = todo.pop() {
-        let total = guess - heuristic;
+    while let Some((guess, (State { position, remaining }, heur))) = todo.pop() {
+        let total = guess - heur;
         // Finish immediately if no keys left.
         // Since we're using A* with a consistent heuristic this will always be the optimal solution.
         if remaining == 0 {
@@ -220,14 +214,14 @@ fn explore(width: usize, bytes: &[u8]) -> u32 {
                         position: position ^ from_mask ^ to_mask,
                         remaining: remaining ^ to_mask,
                     };
-                    let next_heuristic = heuristic - minimum[to];
-                    let next_guess = total + distance + next_heuristic;
+                    let next_heur = heuristic(next_state, &matrix);
+                    let next_guess = total + distance + next_heur;

                     // Memoize previously seen states to eliminate suboptimal states right away.
                     let best = cache.entry(next_state).or_insert(u32::MAX);
                     if next_guess < *best {
                         *best = next_guess;
-                        todo.push(next_guess, (next_state, next_heuristic));
+                        todo.push(next_guess, (next_state, next_heur));
                     }
                 }
             }
@@ -245,3 +239,19 @@ fn is_key(b: u8) -> Option<usize> {
 fn is_door(b: u8) -> Option<usize> {
     b.is_ascii_uppercase().then(|| (b - b'A') as usize)
 }
+
+fn heuristic(state: State, matrix: &Matrix) -> u32 {
+    let mut heur = 0;
+
+    // For each robot, compute the worst-case distance it must travel to a remaining key.
+    for bot in state.position.biterator() {
+        let mut dist = 0;
+        for key in state.remaining.biterator() {
+            if matrix[bot][key].distance != u32::MAX {
+                dist = dist.max(matrix[bot][key].distance);
+            }
+        }
+        heur += dist;
+    }
+    heur
+}
-- 
2.54.0

with this alternative, tracing shows

                    cache.entry()  todo.push()  todo.pop()  runtime
part 1 pre-patch    101494         23195        22762        5.3ms
part 1 O(1)-patch    96755         23054        21528        5.0ms
part 1 O(n)-patch   110269         25376        24334       10.1ms
part 2 pre-patch    120813         33238        31437       12.3ms
part 2 O(1)-patch    99325         30438        24419       10.7ms
part 2 O(n)-patch    33625         13826         6689        8.0ms

ebblake · 2026-05-21T18:16:51Z

Gemini AI suggests this approach for a hybrid, that picks the best heuristic for both parts:


// The const generic flag tells the compiler to specialize this function
fn explore<const IS_PART_2: bool>(width: usize, bytes: &[u8]) -> u32 {
    let mut todo = MinHeap::with_capacity(5_000);
    let mut cache = FastMap::with_capacity(5_000);
    let Maze { initial, matrix, minimum } = parse_maze(width, bytes);

    // Compute initial heuristic based on the compile-time flag
    let heur = if IS_PART_2 {
        heuristic_part2(initial, &matrix)
    } else {
        minimum.iter().sum()
    };

    todo.push(heur, (initial, heur));

    while let Some((guess, (state, heur))) = todo.pop() {
        // ... (hot loop logic) ...

        let next_heur = if IS_PART_2 {
            heuristic_part2(next_state, &matrix)
        } else {
            heur - minimum[to] // Fast O(1) step subtraction
        };
    }
    // ...
}
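A self-contained sketch of the const-generic dispatch idea above. The helper names `cheap_update`, `full_recompute`, and `next_heuristic`, and the `legs` slice, are hypothetical stand-ins rather than code from the patch. Because `IS_PART_2` is known at compile time, each instantiation is compiled separately and the optimizer can typically drop the unused branch.

```rust
/// O(1) update: subtract the collected key's precomputed minimum leg.
fn cheap_update(previous: u32, collected_minimum: u32) -> u32 {
    previous - collected_minimum
}

/// O(n) recomputation: here simply the longest leg still ahead
/// (a hypothetical stand-in for a per-robot heuristic).
fn full_recompute(legs: &[u32]) -> u32 {
    legs.iter().copied().max().unwrap_or(0)
}

/// Picks the heuristic update at compile time via the const generic flag.
fn next_heuristic<const IS_PART_2: bool>(
    previous: u32,
    collected_minimum: u32,
    legs: &[u32],
) -> u32 {
    if IS_PART_2 {
        full_recompute(legs)
    } else {
        cheap_update(previous, collected_minimum)
    }
}
```

A caller would select the variant with a turbofish, for example `next_heuristic::<true>(previous, minimum, &legs)` for part 2.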

Pre-filtering the list of keys in the same quadrant as the starting point with a single bitwise op is faster than iterating over all keys and then branching on whether the distance was u32::MAX just to toss out 75% of the iterations.

Likewise, a given state can be queued with more than one distance as different paths percolate to the top of the priority queue. For my input, I traced that part 1 pops a revisited state 6971 times (out of 22762 pops), and part 2 2237 times (out of 31437). It is slightly faster to check the cached distance up front than to repeat next-neighbor checks that will not find any new neighbors (reducing the number of later cache accesses from 101494 to 70304 for part 1, and from 120813 to 113051 for part 2).

While touching this, `just` complained that:

warning: field name starts with the struct's name
  --> src/year2019/day18.rs:80:5
   |
80 |     maze: [[Door; 30]; 30],
   |     ^^^^^^^^^^^^^^^^^^^^^^
   |
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#struct_field_names
   = note: requested on the command line with `-W clippy::struct-field-names`

so I renamed that field to matrix.

On my laptop, part 1 speeds up from 5.1ms to 4.8ms, and part 2 more dramatically, from 12.1ms to 7.4ms.
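The up-front check on popped states can be illustrated on a plain Dijkstra search. This is a sketch using the standard library's `BinaryHeap` and `HashMap` rather than the repository's `MinHeap` and `FastMap`; the function name `shortest` and the adjacency-list input are assumptions made for the example.

```rust
use std::cmp::Reverse;
use std::collections::{BinaryHeap, HashMap};

/// Returns the cost to reach `goal` plus the number of stale queue entries skipped.
/// `edges[node]` lists `(neighbor, weight)` pairs.
fn shortest(edges: &[Vec<(usize, u32)>], start: usize, goal: usize) -> Option<(u32, usize)> {
    let mut best: HashMap<usize, u32> = HashMap::new();
    let mut todo = BinaryHeap::new();
    let mut skipped = 0usize;

    best.insert(start, 0);
    todo.push(Reverse((0u32, start)));

    while let Some(Reverse((cost, node))) = todo.pop() {
        // Up-front check: a cheaper path to `node` was already recorded, so this
        // entry is stale and scanning its neighbors again cannot improve anything.
        if best.get(&node).is_some_and(|&b| b < cost) {
            skipped += 1;
            continue;
        }
        if node == goal {
            return Some((cost, skipped));
        }
        for &(next, weight) in &edges[node] {
            let next_cost = cost + weight;
            let entry = best.entry(next).or_insert(u32::MAX);
            if next_cost < *entry {
                *entry = next_cost;
                todo.push(Reverse((next_cost, next)));
            }
        }
    }
    None
}
```

The single lookup per pop replaces one cache access per neighbor for every stale entry, which is where the reduction in later cache accesses would come from.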

The search can be steered towards the global minimum by pre-computing a heuristic of the absolute minimum possible distance that each key can contribute, then updating that heuristic with a subtraction per key visited, making it a nice lightweight O(1) heuristic. By the time we reach the goal of zero keys remaining, the heuristic also reaches zero; while it often underestimates, it is consistent and never overestimates.

On my laptop with my input, and adding some analysis (although the runtime numbers are a bit noisy), I can see that both parts benefit, although part 2 picks up the best improvement.

                        cache.entry()  todo.push()  todo.pop()  runtime
    part 1 pre-patch      70304         23195        22762       4.8ms
    part 1 post-patch     65546         23019        21410       4.6ms
    part 2 pre-patch     113051         33238        31437       6.2ms
    part 2 post-patch     79301         26608        20196       4.8ms

An O(1) heuristic is always ideal, and for part 1, anything else that I tried cost more in overhead than what it saved in nodes visited. But for part 2, the exponential explosion caused by multiple robots really did benefit from a more responsive O(n) heuristic to prune the focus towards advancing the robot that would unlock the most keys, rather than the robot with the shortest distance.

With this latest patch, my table of results now looks like:

                        cache.entry()  todo.push()  todo.pop()  runtime
    part 1 pre-series    101494         23195        22762       5.2ms
    part 1 pre-patch      65546         23019        21410       4.6ms
    part 1 this patch     65546         23019        21410       4.6ms
    part 2 pre-series    120813         33238        31437      12.1ms
    part 2 pre-patch      79301         26608        20196       4.8ms
    part 2 this patch     33577         13826         6689       2.3ms
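One possible shape for an O(n) per-robot heuristic, combining the "max leg" idea from the earlier alternative patch with a quadrant bitmask pre-filter. This is a sketch under assumptions: `legs[r][k]` is a hypothetical table of distances from robot `r`'s current position to key `k`, and `quadrant[r]` is a hypothetical bitmask of the keys robot `r` can ever reach. It is not claimed to match the final commit.

```rust
/// Sum, over all robots, of the longest leg to a key that robot still has to collect.
fn max_leg_heuristic(legs: &[Vec<u32>], quadrant: &[u32], remaining: u32) -> u32 {
    let mut total = 0u32;
    for (robot, row) in legs.iter().enumerate() {
        // One AND discards every key outside this robot's quadrant.
        let mut todo = remaining & quadrant[robot];
        let mut longest = 0u32;
        while todo != 0 {
            let key = todo.trailing_zeros() as usize;
            longest = longest.max(row[key]);
            todo &= todo - 1; // clear the lowest set bit
        }
        total += longest;
    }
    total
}
```

Each robot has to walk at least as far as its most distant remaining key, and the robots' walks add up in the total, so the sum does not overestimate under these assumptions.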

ebblake · 2026-05-23T11:10:35Z

I'm still investigating if a dynamic programming approach similar to Held-Karp can outperform A*. One benefit of the dynamic approach is that you can solve both parts at once. With A*, your cache is on (keys,positions)->distance, which is two separate caches for the two parts. But with dynamic programming, the cache is (keys,last key)->(distance,positions) which can cram both parts into the same cache for less traversal overhead. The dynamic approach is up to O(n^2) per set visited (for each reachable key in a set, find the minimum distance when appending that key to any of the other reachable keys of the subset - in general n is much smaller than 26 because of the non-viable keys with a nonzero need), but I don't yet have a feel for how many set visits can be pruned from the reachability front to avoid a full-blown 2^26 set visits.
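For reference, a dense Held-Karp style table for a single robot might look like the sketch below. It is deliberately simplified and makes several assumptions: doors are modelled as a per-key `needed` bitmask rather than per pair of keys as in the PR's `matrix`, every subset is visited with no reachability pruning, and the function name `held_karp` and all of its inputs are hypothetical.

```rust
/// `best[set][last]` is the shortest walk that collects exactly the keys in `set`
/// and ends on key `last`. `from_start[k]` and `between[a][b]` are precomputed
/// distances; `needed[k]` is the bitmask of keys required before key `k` opens up.
fn held_karp(from_start: &[u32], between: &[Vec<u32>], needed: &[u32]) -> Option<u32> {
    let n = from_start.len();
    let full = (1usize << n) - 1;
    let mut best = vec![vec![u32::MAX; n]; 1usize << n];

    // Seed the table with keys that are reachable without holding anything.
    for k in 0..n {
        if needed[k] == 0 {
            best[1usize << k][k] = from_start[k];
        }
    }

    // Supersets are numerically larger, so ascending order visits subsets first.
    for set in 1..=full {
        for last in 0..n {
            let so_far = best[set][last];
            if so_far == u32::MAX {
                continue;
            }
            for next in 0..n {
                let bit = 1usize << next;
                // Skip keys already held or still locked behind a missing key.
                if set & bit != 0 || (needed[next] as usize) & !set != 0 {
                    continue;
                }
                let candidate = so_far + between[last][next];
                let slot = &mut best[set | bit][next];
                *slot = (*slot).min(candidate);
            }
        }
    }
    best[full].iter().copied().min().filter(|&d| d != u32::MAX)
}
```

The two inner loops are the O(n^2) work per set mentioned above; the dense outer loop over all `2^n` sets is the part a reachability front would need to prune.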

ebblake · 2026-05-27T12:40:50Z

I'm still investigating if a dynamic programming approach similar to Held-Karp can outperform A*.

Nope - although I did get a dynamic search working, it visits more sets+positions (31272 on my input - but still much better than a full 2^26 sets) than a directed A* search. At this point, I'm happy with the current state of the patch as being the fastest solution I can come up with.

ebblake force-pushed the 2019-18 branch 3 times, most recently from 8060dec to 91dec87 Compare May 21, 2026 15:25

ebblake force-pushed the 2019-18 branch 3 times, most recently from cdfb4a0 to 8a0d745 Compare May 22, 2026 01:41

ebblake added 3 commits May 22, 2026 13:37

ebblake force-pushed the 2019-18 branch from 8a0d745 to 4c124e5 Compare May 22, 2026 19:15

ebblake wants to merge 3 commits into maneatingape:main from ebblake:2019-18
