Use better slope estimate and geometry for less intcode by ebblake · Pull Request #73 · maneatingape/advent-of-code-rust

ebblake · 2026-05-18T03:54:44Z

Description

Cut the number of test() runs of intcode (on my input) from 453 to
just 150, by spending a little bit of time during parse to refine the
slope estimate for fewer false positives in part 1, and then by using
that more accurate slope to start part 2 much closer to the actual
answer.

On my laptop, this cuts parse/part1/part2 runtime from 200/215/540us
to 97/199/42us.

Type of change

Performance improvement
Bug fix
Other

Checklist

Pull request title and commit messages are clear and informative.
Documentation has been updated if necessary.
Code style matches the existing code. This one is somewhat subjective, but try to "fit in" by
using the same naming conventions. Code should be portable, avoiding any
architecture-specific intrinsics.
Tests pass: `cargo test`
Code is formatted: ``cargo fmt -- `find . -name "*.rs"` ``
Code is linted: `cargo clippy --all-targets --all-features`

Formatting and linting can also be executed by running `just`
(if installed) on the command line at the project root.

ebblake · 2026-05-18T11:18:05Z

Eric's official sample shows that the black box is just evaluating a simple inequality: abs(c1*x*x - c2*y*y) < c3*x*y for three constants chosen so that there are no integer hits on y=1. The inequality is homogeneous in x and y, so the beam is a wedge bounded by two straight lines through the origin. The slopes of those lines are irrational, but the better you can approximate them, the fewer times you have to call into intcode: only the cells right next to an estimated edge need a real probe to see which side of the actual boundary they fall on.
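As a rough illustration of the inequality above, here is a minimal stand-in for the black box. The constants 3, 2, 3 are made up for the example (they are not taken from any real input and are not tuned to avoid hits on y=1), and `in_beam` and `edge_slopes` are hypothetical names. Dividing the inequality by x*x and writing t = y/x gives two quadratics in t, whose positive roots are the edge slopes.

```rust
// Hypothetical constants, chosen only for illustration.
const C1: i64 = 3;
const C2: i64 = 2;
const C3: i64 = 3;

/// Stand-in for the intcode probe: abs(c1*x*x - c2*y*y) < c3*x*y.
fn in_beam(x: i64, y: i64) -> bool {
    (C1 * x * x - C2 * y * y).abs() < C3 * x * y
}

/// Slopes (y / x) of the shallow and steep edges of the wedge.
/// With t = y / x the inequality becomes |c1 - c2*t*t| < c3*t, so the edges are
/// the positive roots of c2*t*t + c3*t - c1 = 0 and c2*t*t - c3*t - c1 = 0.
fn edge_slopes() -> (f64, f64) {
    let (c1, c2, c3) = (C1 as f64, C2 as f64, C3 as f64);
    let d = (c3 * c3 + 4.0 * c1 * c2).sqrt();
    ((d - c3) / (2.0 * c2), (d + c3) / (2.0 * c2))
}

fn main() {
    let (shallow, steep) = edge_slopes();
    println!("beam lies between y = {shallow:.4} x and y = {steep:.4} x");

    // Spot-check a few cells in column x = 10 on either side of each edge.
    for y in [5, 7, 21, 22] {
        println!("(10, {y}) -> {}", in_beam(10, y));
    }
}
```

The floating-point slopes are only for inspection; the probe itself stays in integer arithmetic.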

Starting scale at 50 requires about 100 test() calls to find the first estimate. Each later doubling of scale requires between 2 and 6 further probes (possibly between 0 and 4 if the loop logic is improved) to cut the error between the approximate and actual slope in half. With no pre-filtering by approximate slope, crawling the diagonals in part 1 needs around 200 test() calls (better than the 2500 probes of brute force), and part 2, walking along the diagonal until reaching the winning spot, needs around 700-1000 probes. But with pre-filtering and intelligent starting points, part 1 can probe fewer than 100 times, and part 2 around 10 times.
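The doubling step can be sketched in isolation as follows. This is not the patch itself: the `Probe` struct is a hypothetical stand-in for running intcode, using made-up constants 3, 2, 3 in the inequality, and the starting values 1, 1, 5 are only known to be safe for that stand-in. The point is that each doubling reuses the previous estimate, so only a probe or two per edge is needed to recover the extra bit of precision.

```rust
/// Hypothetical stand-in for the intcode probe that also counts calls.
struct Probe {
    calls: u32,
}

impl Probe {
    fn test(&mut self, x: i64, y: i64) -> bool {
        self.calls += 1;
        (3 * x * x - 2 * y * y).abs() < 3 * x * y
    }
}

/// Returns (lower, upper, scale), where lower / scale underestimates x / y on the
/// steep edge and upper / scale underestimates y / x on the shallow edge.
fn refine(probe: &mut Probe, limit: i64) -> (i64, i64, i64) {
    let (mut lower, mut upper, mut scale) = (1, 1, 5);

    while scale < limit {
        // Doubling all three keeps both ratios, and both stay underestimates.
        scale *= 2;
        lower *= 2;
        upper *= 2;
        // Nudge each estimate until the next cell over is inside the beam.
        while !probe.test(lower + 1, scale) {
            lower += 1;
        }
        while !probe.test(scale, upper + 1) {
            upper += 1;
        }
    }

    (lower, upper, scale)
}

fn main() {
    for limit in [50, 1024] {
        let mut probe = Probe { calls: 0 };
        let (lower, upper, scale) = refine(&mut probe, limit);
        println!(
            "limit {limit}: lower {lower}, upper {upper}, scale {scale}, probes {}",
            probe.calls
        );
    }
}
```

Probe counts printed by this sketch apply to the stand-in only and will differ from counts on a real input.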

You could also break the black box and extract the 3 constants to solve this exactly with zero lines of intcode run.

ebblake · 2026-05-18T12:45:02Z

The following hack (post-patch) makes it easier to see the effects of different scale and part 2 starting values:

diff --git c/src/year2019/day19.rs i/src/year2019/day19.rs
index 5d09a42..0a53508 100644
--- c/src/year2019/day19.rs
+++ i/src/year2019/day19.rs
@@ -53,20 +53,20 @@ pub struct Input {
 pub fn parse(input: &str) -> Input {
     // Pick an initial scale large enough to be past the discontinuities for all known inputs.
     let code: Vec<_> = input.iter_signed().collect();
-    let mut lower = 1;
-    let mut upper = 1;
-    let mut scale = 5;
+    let mut lower = 0;
+    let mut upper = 0;
+    let mut scale = 25;

     // Find approximate slope of lower and upper edges, rounding down to prevent false negatives.
     // Scale the boundary for slightly more accuracy.
-    while scale < 1024 {
+    while scale < 50 {
         scale *= 2;
         lower *= 2;
         upper *= 2;
-        while !test(&code, lower + 1, scale) {
+        while !test(&code, lower + 1, scale, 0) {
             lower += 1;
         }
-        while !test(&code, scale, upper + 1) {
+        while !test(&code, scale, upper + 1, 0) {
             upper += 1;
         }
     }
@@ -81,9 +81,9 @@ pub fn part1(input: &Input) -> i64 {

     // Scanning the remaining lines works even around the known discontinuity at y=1, by finding
     // the left and right edges if any.
-    for y in 2..50 {
-        let left = (1..50).find(|&x| precheck(input, x, y) && test(code, x, y));
-        let right = (1..50).rfind(|&x| precheck(input, x, y) && test(code, x, y));
+    for y in 0..50 {
+        let left = (0..50).find(|&x| precheck(input, x, y, 1) && test(code, x, y, 1));
+        let right = (0..50).rfind(|&x| precheck(input, x, y, 1) && test(code, x, y, 1));
         if let (Some(l), Some(r)) = (left, right) {
             result += r - l + 1;
         }
@@ -99,17 +99,18 @@ pub fn part2(input: &Input) -> i64 {
         / (input.scale * input.scale - input.lower * input.upper);
     let mut y = input.scale * x / input.lower - 99;
     let mut moved = true;
+    x = 0; y = 0;

     // Increase the right and bottom edges of our box until they are both inside the beam.
     while moved {
         moved = false;

-        while !precheck(input, x, y + 99) || !test(code, x, y + 99) {
+        while !precheck(input, x, y + 99, 2) || !test(code, x, y + 99, 2) {
             x += 1;
             moved = true;
         }

-        while !precheck(input, x + 99, y) || !test(code, x + 99, y) {
+        while !precheck(input, x + 99, y, 2) || !test(code, x + 99, y, 2) {
             y += 1;
             moved = true;
         }
@@ -119,12 +120,14 @@ pub fn part2(input: &Input) -> i64 {
 }

 /// Quick check with some false positives but no false negatives.
-fn precheck(input: &Input, x: i64, y: i64) -> bool {
+fn precheck(input: &Input, x: i64, y: i64, p: u8) -> bool {
+    println!("precheck{p} {x} {y}");
     input.scale * y > input.upper * x && input.scale * x > input.lower * y
 }

 /// Definitive but slower check.
-fn test(code: &[i64], x: i64, y: i64) -> bool {
+fn test(code: &[i64], x: i64, y: i64, p: u8) -> bool {
+    println!("test{p} {x} {y}");
     let mut computer = Computer::new(code);
     computer.input(x);
     computer.input(y);

when run with

cargo run year2019::day19 | sed 's/ .*//' | sort | uniq -c

With scale = 50 and x,y of 0,0 (the pre-patch condition), on my input, it produces:

   2717 precheck1
   1882 precheck2
     93 test0
     94 test1
    266 test2

With scale starting at 10 and grown past 1024, and x,y determined by geometry, it produces:

   2464 precheck1
     29 precheck2
     39 test0
     86 test1
     20 test2

Additionally, you can inspect how often a given point is being probed more than once, to decide if memoization is worthwhile. Pre-patch:

$ cargo run year2019::day19 | sort | uniq -c | sed 's/\([tk][012]\).*/\1/' | sort | uniq -c
...
   2009       1 precheck1
   1878       1 precheck2
     93       1 test0
     86       1 test1
    262       1 test2
    354       2 precheck1
      2       2 precheck2
      4       2 test1
      2       2 test2

Post-patch:

   1966       1 precheck1
     25       1 precheck2
     39       1 test0
     78       1 test1
     16       1 test2
    249       2 precheck1
      2       2 precheck2
      4       2 test1
      2       2 test2

So no point is being probed 3 times (good), the number of points probed twice is fairly small (unchanged by my patch), and memoization is going to cost more in memory than it saves in time. Further improvements could be made by enhancing part 1 to more closely scan the two lines (advancing left and right independently by 0, 1, or 2) rather than doing a binary search on EACH line. The per-line search wastes several calls to test() when precheck() returns true for a point definitively in the middle of the line, rather than on the 2-cell border that needs actual probing.
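One possible reading of that edge-following idea, as a rough sketch rather than a proposed patch: carry the left and right edge from one row to the next and only probe forward from there. It assumes each row's hits are contiguous and that neither edge ever moves left as y grows. `Probe` is again a hypothetical stand-in with made-up constants, and no precheck() is used, so the probe counts are not comparable to the numbers above.

```rust
const SIZE: i64 = 50;

/// Hypothetical stand-in for the intcode probe that also counts calls.
struct Probe {
    calls: u32,
}

impl Probe {
    fn test(&mut self, x: i64, y: i64) -> bool {
        self.calls += 1;
        (3 * x * x - 2 * y * y).abs() < 3 * x * y
    }
}

/// Count beam cells in the SIZE x SIZE grid by following both edges downward.
fn walk_edges(probe: &mut Probe) -> i64 {
    let (mut left, mut right) = (0, 0);
    let mut total = 0;

    for y in 0..SIZE {
        // Advance the left edge from where the previous row left it.
        // An empty row (such as the gaps near the origin) leaves it untouched.
        let Some(l) = (left..SIZE).find(|&x| probe.test(x, y)) else {
            continue;
        };
        left = l;
        // Resume the right edge from a cell already known to be in the beam.
        right = right.max(left);
        while right + 1 < SIZE && probe.test(right + 1, y) {
            right += 1;
        }
        total += right - left + 1;
    }

    total
}

/// Reference answer: probe every cell.
fn brute_force(probe: &mut Probe) -> i64 {
    let mut total = 0;
    for y in 0..SIZE {
        for x in 0..SIZE {
            if probe.test(x, y) {
                total += 1;
            }
        }
    }
    total
}

fn main() {
    let mut walker = Probe { calls: 0 };
    let mut brute = Probe { calls: 0 };
    println!("walk:  {} cells, {} probes", walk_edges(&mut walker), walker.calls);
    println!("brute: {} cells, {} probes", brute_force(&mut brute), brute.calls);
}
```

An empty row costs a full scan to the grid boundary in this sketch; a real implementation would likely handle the first few rows separately.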

ebblake · 2026-05-18T13:37:22Z

I also tried scale>2048: it cost 4 more test() calls in parse but saved 7 in part2. That is starting to reach the point of diminishing returns.

Cut the number of test() runs of intcode (on my input) from 453 to just 145, by improving parse to come up with a more accurate slope estimate in fewer probes, for fewer false positives in part 1, and then by using that more accurate slope to start part 2 much closer to the actual answer. On my laptop, this cuts parse/part1/part2 runtime from 200/215/540us to 97/199/42us.
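To illustrate how a slope estimate can place the part 2 start near the answer, here is a sketch under stated assumptions. Put the box's top-right corner (x+99, y) on the shallow edge, y = (upper/scale)*(x+99), and its bottom-left corner (x, y+99) on the steep edge, y+99 = (scale/lower)*x, then solve for x. The denominator and the y expression agree with the context lines visible in the diff, but the numerator is my own derivation and may not match the patch exactly. `in_beam` is a hypothetical stand-in for intcode with made-up constants 3, 2, 3.

```rust
/// Hypothetical stand-in for the intcode probe.
fn in_beam(x: i64, y: i64) -> bool {
    (3 * x * x - 2 * y * y).abs() < 3 * x * y
}

/// Slope estimates by repeated doubling, as in parse (starting values fit the stand-in).
fn slopes(limit: i64) -> (i64, i64, i64) {
    let (mut lower, mut upper, mut scale) = (1, 1, 5);
    while scale < limit {
        scale *= 2;
        lower *= 2;
        upper *= 2;
        while !in_beam(lower + 1, scale) {
            lower += 1;
        }
        while !in_beam(scale, upper + 1) {
            upper += 1;
        }
    }
    (lower, upper, scale)
}

/// Starting corner from the two edge equations (numerator is an assumption).
/// Both slopes are underestimates, so this lands at or before the true answer.
fn estimate(lower: i64, upper: i64, scale: i64) -> (i64, i64) {
    let x = 99 * lower * (scale + upper) / (scale * scale - lower * upper);
    let y = scale * x / lower - 99;
    (x, y)
}

/// Grow x and y until both far corners are inside the beam; also returns the probe count.
fn walk(mut x: i64, mut y: i64) -> (i64, i64, u32) {
    let mut probes = 0;
    let mut test = |x: i64, y: i64| {
        probes += 1;
        in_beam(x, y)
    };
    let mut moved = true;

    while moved {
        moved = false;
        while !test(x, y + 99) {
            x += 1;
            moved = true;
        }
        while !test(x + 99, y) {
            y += 1;
            moved = true;
        }
    }

    (x, y, probes)
}

fn main() {
    let (lower, upper, scale) = slopes(1024);
    let (x0, y0) = estimate(lower, upper, scale);
    println!("estimate ({x0}, {y0}) -> {:?}", walk(x0, y0));
    println!("origin   (0, 0) -> {:?}", walk(0, 0));
}
```

Because the walk only ever increases x and y, the estimate must round toward the origin, which is why both slope estimates are kept as underestimates.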