Use better slope estimate and geometry for less intcode #73

Open
ebblake wants to merge 1 commit into
maneatingape:main from
ebblake:2019-19

Conversation

ebblake commented May 18, 2026

Contributor

Description

Cut the number of test() runs of intcode (on my input) from 453 to
just 150, by spending a little bit of time during parse to refine the
slope estimate for fewer false positives in part 1, and then by using
that more accurate slope to start part 2 much closer to the actual
answer.

On my laptop, this cuts parse/part1/part2 runtime from 200/215/540us
to 97/199/42us.

Type of change

  • Performance improvement
  • Bug fix
  • Other

Checklist

  • Pull request title and commit messages are clear and informative.
  • Documentation has been updated if necessary.
  • Code style matches the existing code. This one is somewhat subjective, but try to "fit in" by
    using the same naming conventions. Code should be portable, avoiding any
    architecture-specific intrinsics.
  • Tests pass cargo test
  • Code is formatted cargo fmt -- `find . -name "*.rs"`
  • Code is linted cargo clippy --all-targets --all-features

Formatting and linting can also be executed by running the just command
(if installed) on the command line at the project root.

Comment thread src/year2019/day19.rs Outdated
ebblake commented May 18, 2026

Contributor Author

Eric's official sample shows that the black box is just computing a homogeneous quadratic inequality, abs(c1*x*x - c2*y*y) < c3*x*y, for three constants chosen so that there are no integer hits on y=1. Its solution set is a wedge between two lines through the origin. The slopes of those lines are irrational, but the better you can approximate them, the fewer times you have to call into intcode: only the points in the narrow band between your estimate (kept just outside the real edge) and the actual edge still need a real probe.

Starting scale at 50 requires about 100 test() calls to find the first estimate. Each later doubling of scale requires between 2 and 6 further probes (possibly between 0 and 4 if the loop logic is improved) and cuts the error between the approximate and actual slope in half. With no pre-filtering by approximate slope, crawling the diagonals in part 1 needs around 200 test() calls (better than the 2500 probes of brute force), and part 2, which walks along the diagonal until reaching the winning spot, needs around 700-1000 probes. But with pre-filtering and intelligent starting points, part 1 can probe fewer than 100 times, and part 2 around 10 times.
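
To illustrate the doubling step, here is a minimal sketch that mirrors the shape of the loop in parse but swaps the intcode probe for a hypothetical stand-in. `beam` uses the inequality above with all three constants set to 1, an assumption made only so the example runs on its own; real inputs use different constants. `refine` and its `probes` counter are illustrative names, not part of the patch.

```rust
/// Hypothetical stand-in for the intcode probe: the inequality from the
/// comment above with c1 = c2 = c3 = 1 (assumed values, not a real input).
fn beam(x: i64, y: i64) -> bool {
    (x * x - y * y).abs() < x * y
}

/// Refine the edge estimates by doubling `scale` until it reaches `limit`.
/// Returns (lower, upper, scale, probes), where `lower / scale` and
/// `upper / scale` approximate the two edge slopes, rounded down.
fn refine(limit: i64) -> (i64, i64, i64, u32) {
    let (mut lower, mut upper, mut scale) = (1, 1, 5);
    let mut probes = 0;
    // Count every call so the cost of each doubling is visible.
    let mut test = |x, y| {
        probes += 1;
        beam(x, y)
    };

    while scale < limit {
        // Doubling keeps the old estimate valid: 2 * floor(a) <= floor(2 * a).
        scale *= 2;
        lower *= 2;
        upper *= 2;
        // Advance each edge until the next cell is inside the beam.
        while !test(lower + 1, scale) {
            lower += 1;
        }
        while !test(scale, upper + 1) {
            upper += 1;
        }
    }

    (lower, upper, scale, probes)
}
```

Calling `refine(1024)` on this stand-in stops at a scale of 1280. After the first pass, each doubling only has to decide whether the doubled estimate needs one more increment, which is why the probe count per doubling stays small.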

You could also break the black box and extract the 3 constants to solve this exactly with zero lines of intcode run.
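
A rough sketch of what that could look like, assuming the three constants have already been pulled out of the program by hand. The `Beam` struct, its field names, and the requirement that all three constants are positive are assumptions for illustration; nothing here reads or runs intcode.

```rust
/// Hypothetical container for the three constants extracted from an input.
struct Beam {
    c1: i64,
    c2: i64,
    c3: i64,
}

impl Beam {
    /// Exact membership test in integer arithmetic.
    fn contains(&self, x: i64, y: i64) -> bool {
        (self.c1 * x * x - self.c2 * y * y).abs() < self.c3 * x * y
    }

    /// Slopes (y / x) of the two edges of the wedge, assuming c1, c2, c3 > 0.
    /// Substituting t = y / x turns the inequality into the pair
    /// c2*t^2 - c3*t - c1 < 0 and c2*t^2 + c3*t - c1 > 0,
    /// whose positive roots are the two edges.
    fn slopes(&self) -> (f64, f64) {
        let (c1, c2, c3) = (self.c1 as f64, self.c2 as f64, self.c3 as f64);
        let root = (c3 * c3 + 4.0 * c1 * c2).sqrt();
        ((root - c3) / (2.0 * c2), (root + c3) / (2.0 * c2))
    }
}
```

With the made-up constants `Beam { c1: 1, c2: 1, c3: 1 }`, `slopes()` returns the reciprocal of the golden ratio and the golden ratio itself, which shows how even tiny integer constants yield irrational edges. The floating-point slopes are only a convenience for picking starting points; `contains` stays exact.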

ebblake commented May 18, 2026

Contributor Author

The following hack (post-patch) makes it easier to see the effects of different scale and part 2 starting values:

diff --git c/src/year2019/day19.rs i/src/year2019/day19.rs
index 5d09a42..0a53508 100644
--- c/src/year2019/day19.rs
+++ i/src/year2019/day19.rs
@@ -53,20 +53,20 @@ pub struct Input {
 pub fn parse(input: &str) -> Input {
     // Pick an initial scale large enough to be past the discontinuities for all known inputs.
     let code: Vec<_> = input.iter_signed().collect();
-    let mut lower = 1;
-    let mut upper = 1;
-    let mut scale = 5;
+    let mut lower = 0;
+    let mut upper = 0;
+    let mut scale = 25;

     // Find approximate slope of lower and upper edges, rounding down to prevent false negatives.
     // Scale the boundary for slightly more accuracy.
-    while scale < 1024 {
+    while scale < 50 {
         scale *= 2;
         lower *= 2;
         upper *= 2;
-        while !test(&code, lower + 1, scale) {
+        while !test(&code, lower + 1, scale, 0) {
             lower += 1;
         }
-        while !test(&code, scale, upper + 1) {
+        while !test(&code, scale, upper + 1, 0) {
             upper += 1;
         }
     }
@@ -81,9 +81,9 @@ pub fn part1(input: &Input) -> i64 {

     // Scanning the remaining lines works even around the known discontinuity at y=1, by finding
     // the left and right edges if any.
-    for y in 2..50 {
-        let left = (1..50).find(|&x| precheck(input, x, y) && test(code, x, y));
-        let right = (1..50).rfind(|&x| precheck(input, x, y) && test(code, x, y));
+    for y in 0..50 {
+        let left = (0..50).find(|&x| precheck(input, x, y, 1) && test(code, x, y, 1));
+        let right = (0..50).rfind(|&x| precheck(input, x, y, 1) && test(code, x, y, 1));
         if let (Some(l), Some(r)) = (left, right) {
             result += r - l + 1;
         }
@@ -99,17 +99,18 @@ pub fn part2(input: &Input) -> i64 {
         / (input.scale * input.scale - input.lower * input.upper);
     let mut y = input.scale * x / input.lower - 99;
     let mut moved = true;
+    x = 0; y = 0;

     // Increase the right and bottom edges of our box until they are both inside the beam.
     while moved {
         moved = false;

-        while !precheck(input, x, y + 99) || !test(code, x, y + 99) {
+        while !precheck(input, x, y + 99, 2) || !test(code, x, y + 99, 2) {
             x += 1;
             moved = true;
         }

-        while !precheck(input, x + 99, y) || !test(code, x + 99, y) {
+        while !precheck(input, x + 99, y, 2) || !test(code, x + 99, y, 2) {
             y += 1;
             moved = true;
         }
@@ -119,12 +120,14 @@ pub fn part2(input: &Input) -> i64 {
 }

 /// Quick check with some false positives but no false negatives.
-fn precheck(input: &Input, x: i64, y: i64) -> bool {
+fn precheck(input: &Input, x: i64, y: i64, p: u8) -> bool {
+    println!("precheck{p} {x} {y}");
     input.scale * y > input.upper * x && input.scale * x > input.lower * y
 }

 /// Definitive but slower check.
-fn test(code: &[i64], x: i64, y: i64) -> bool {
+fn test(code: &[i64], x: i64, y: i64, p: u8) -> bool {
+    println!("test{p} {x} {y}");
     let mut computer = Computer::new(code);
     computer.input(x);
     computer.input(y);

when run with

cargo run year2019::day19 | sed 's/ .*//' | sort | uniq -c

With scale = 50 and x,y of 0,0 (the pre-patch condition), on my input, it produces:

   2717 precheck1
   1882 precheck2
     93 test0
     94 test1
    266 test2

With scale starting at 10 and grown past 1024, and x,y determined by geometry, it produces:

   2464 precheck1
     29 precheck2
     39 test0
     86 test1
     20 test2
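
One way to arrive at a geometric starting point, sketched here as an independent derivation rather than a copy of the patch: put the bottom-left corner of the square on the left edge (scale*x = lower*(y + 99)) and the top-right corner on the upper edge (scale*y = upper*(x + 99)), then solve for x. The hunk above only shows the denominator and the y formula, so the numerator below is an assumption. `beam`, `estimate_start`, and `walk` are hypothetical names, and the stand-in beam again uses constants of 1.

```rust
/// Hypothetical stand-in for the intcode probe (all three constants set to 1).
fn beam(x: i64, y: i64) -> bool {
    (x * x - y * y).abs() < x * y
}

/// Top-left corner where a 100x100 square would just fit the estimated wedge.
/// Solving scale*x = lower*(y + 99) and scale*y = upper*(x + 99) for x gives
/// x = 99*lower*(scale + upper) / (scale^2 - lower*upper).
fn estimate_start(lower: i64, upper: i64, scale: i64) -> (i64, i64) {
    let x = 99 * lower * (scale + upper) / (scale * scale - lower * upper);
    let y = scale * x / lower - 99;
    (x, y)
}

/// Same walk as part 2: nudge the square until both corners are inside the
/// beam. Returns the final corner and the number of probes spent.
fn walk(mut x: i64, mut y: i64) -> ((i64, i64), u32) {
    let mut probes = 0;
    let mut test = |x, y| {
        probes += 1;
        beam(x, y)
    };
    let mut moved = true;

    while moved {
        moved = false;
        while !test(x, y + 99) {
            x += 1;
            moved = true;
        }
        while !test(x + 99, y) {
            y += 1;
            moved = true;
        }
    }

    ((x, y), probes)
}
```

For the stand-in, `estimate_start(791, 791, 1280)` lands within two cells of the corner that `walk(0, 0)` eventually reaches, so the walk from the estimate needs only a handful of probes. Because both slopes are rounded down (the estimated wedge is slightly wider than the real one) and the divisions truncate, the estimate should fall at or before the true corner, which the increment-only walk relies on.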

Additionally, you can inspect how often a given point is being probed more than once, to decide if memoization is worthwhile. Pre-patch:

$ cargo run year2019::day19 | sort | uniq -c | sed 's/\([tk][012]\).*/\1/' | sort | uniq -c
...
   2009       1 precheck1
   1878       1 precheck2
     93       1 test0
     86       1 test1
    262       1 test2
    354       2 precheck1
      2       2 precheck2
      4       2 test1
      2       2 test2

Post-patch:

   1966       1 precheck1
     25       1 precheck2
     39       1 test0
     78       1 test1
     16       1 test2
    249       2 precheck1
      2       2 precheck2
      4       2 test1
      2       2 test2

So no point is being probed 3 times (good), the number of points probed twice is fairly small (unchanged by my patch), and memoization is going to cost more memory than time saved. Further improvements could be made by enhancing part 1 to track the two edge lines more closely (advancing left and right independently by 0, 1, or 2) rather than running a separate search on EACH line. The per-line search wastes several calls to test() when precheck() returns true for a point that is definitively in the middle of the line, rather than on the 2-cell border that needs actual probing.
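
The same repeat-probe tally can also be gathered in-process instead of through println! and sort | uniq -c. The sketch below is only an illustration of that idea: `beam` is a hypothetical stand-in for the intcode probe (constants of 1), precheck() is left out for brevity, and `probe_histogram` is a made-up name.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical stand-in for the intcode probe (all three constants set to 1).
fn beam(x: i64, y: i64) -> bool {
    (x * x - y * y).abs() < x * y
}

/// Scan a 50x50 grid the way part 1 does (left edge via find, right edge via
/// rfind) while recording how often each point is probed.
/// Returns the beam area and a map from "times probed" to "number of points".
fn probe_histogram() -> (i64, BTreeMap<u32, usize>) {
    let mut seen: HashMap<(i64, i64), u32> = HashMap::new();
    let mut test = |x: i64, y: i64| {
        *seen.entry((x, y)).or_insert(0) += 1;
        beam(x, y)
    };

    let mut area = 0;
    for y in 0..50 {
        let left = (0..50).find(|&x| test(x, y));
        let right = (0..50).rfind(|&x| test(x, y));
        if let (Some(l), Some(r)) = (left, right) {
            area += r - l + 1;
        }
    }

    // Collapse per-point counts into the same shape as the uniq -c output.
    let mut histogram = BTreeMap::new();
    for &count in seen.values() {
        *histogram.entry(count).or_insert(0) += 1;
    }

    (area, histogram)
}
```

Since each of the two scans visits a point at most once per row, no key above 2 can appear in the histogram, which matches the observation that nothing is probed three times. Without precheck() the stand-in probes far more points twice than the real code does, so the absolute numbers are not comparable to the tables above.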

ebblake force-pushed the 2019-19 branch 3 times, most recently from c46d358 to fbc6cb7 on May 18, 2026 13:10

ebblake commented May 18, 2026

Contributor Author

I also tried scale>2048: it cost 4 more test() calls in parse but saved 7 in part2. That is starting to reach the point of diminishing returns.

Cut the number of test() runs of intcode (on my input) from 453 to
just 145, by improving parse to come up with a more accurate slope
estimate in fewer probes, for fewer false positives in part 1, and
then by using that more accurate slope to start part 2 much closer to
the actual answer.

On my laptop, this cuts parse/part1/part2 runtime from 200/215/540us
to 97/199/42us.