Solve puzzle with less inventory churn #80

Open
ebblake wants to merge 1 commit into maneatingape:main from ebblake:2019-25.1

ebblake commented May 29, 2026

Contributor

Description

Although a Gray code walk can get lucky (one input file I have access to was solved in just 3 takes, 7 drops, and 7 move attempts, which my laptop benchmarked at 900us), other inputs might not attempt to drop the final item until after traversing 128 other Gray codes (another input file I tried took 69 takes, 73 drops, and 74 move attempts, benchmarked at 3.7ms).
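
To make the walk concrete, here is a minimal sketch in C of the reflected binary Gray code and of which single item is toggled at each step. It is an illustration only, not the code in this patch; `gray_code` mirrors the helper in the benchmark program later in this thread, while `toggled_item` and `print_walk` are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdio.h>

/* i-th reflected binary Gray code: consecutive codes differ in one bit. */
static unsigned gray_code(unsigned i)
{
    return i ^ (i >> 1);
}

/* Index of the single item toggled when stepping from code i-1 to code i
 * (hypothetical helper; i must be >= 1 so that exactly one bit changed). */
static int toggled_item(unsigned i)
{
    unsigned changed = gray_code(i) ^ gray_code(i - 1);
    int index = 0;
    while (!(changed & 1u)) {
        changed >>= 1;
        index++;
    }
    return index;
}

/* Print the first n steps of the walk, one inventory change per step. */
static void print_walk(unsigned n)
{
    for (unsigned i = 1; i <= n; i++)
        printf("step %u: code %u, toggle item %d\n",
               i, gray_code(i), toggled_item(i));
}
```

With 8 items, `toggled_item(128)` is 7: the highest-numbered item is not touched until step 128, which is consistent with the unlucky case described above.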

However, Intcode emulation is expensive: roughly 1500 instructions for a drop, 2000 for a take, and 8500 for a move attempt, with some variability depending on the item name length and which in-memory array index the items live at. Anything we can do locally to avoid another Intcode command will therefore likely speed up the overall puzzle. In this case, we can make two improvements. First, instead of checking whether we are a superset or subset of just the previous state (which helps 50% of the time, but only checks one adjacent state in the overall hypercube traversal), we can compare against all other settled states, even if the state we are a superset of was walked 32 or 64 steps earlier in the Gray code path. Minimizing the size of too_heavy and too_light helps keep this check fast, and the better pruning results in fewer move attempts. Second, instead of always changing our current inventory to match the latest Gray code step, we can defer inventory changes until the superset/subset checks cannot settle a state, for fewer calls to take and drop between move attempts.
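
The first improvement can be sketched roughly as follows. This is a C illustration under assumptions of my own, not the Rust code in the patch: sets are stored as 8-bit masks of held items, and the `struct settled` layout and the `outcome_known`, `learn_heavy`, and `learn_light` names are hypothetical. Only `too_heavy` and `too_light` come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Settled outcomes, stored as bitmasks of held items (hypothetical layout). */
struct settled {
    uint8_t too_heavy[256]; int heavy_count;
    uint8_t too_light[256]; int light_count;
};

/* A superset of a too-heavy set is too heavy; a subset of a too-light set
 * is too light.  Either way no move attempt is needed. */
static bool outcome_known(const struct settled *s, uint8_t held)
{
    for (int j = 0; j < s->heavy_count; j++)
        if ((held & s->too_heavy[j]) == s->too_heavy[j])
            return true;
    for (int j = 0; j < s->light_count; j++)
        if ((held & s->too_light[j]) == held)
            return true;
    return false;
}

/* Record a too-heavy set, discarding stored supersets it makes redundant. */
static void learn_heavy(struct settled *s, uint8_t held)
{
    int kept = 0;
    for (int j = 0; j < s->heavy_count; j++)
        if ((s->too_heavy[j] & held) != held)
            s->too_heavy[kept++] = s->too_heavy[j];
    s->too_heavy[kept++] = held;
    s->heavy_count = kept;
}

/* Record a too-light set, discarding stored subsets it makes redundant. */
static void learn_light(struct settled *s, uint8_t held)
{
    int kept = 0;
    for (int j = 0; j < s->light_count; j++)
        if ((s->too_light[j] & held) != s->too_light[j])
            s->too_light[kept++] = s->too_light[j];
    s->too_light[kept++] = held;
    s->light_count = kept;
}
```

Keeping only the smallest too-heavy sets and the largest too-light sets bounds both lists, so each lookup stays a short linear scan.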

For the two input files I tested, the faster one remains unchanged at 900us and 3/7/7 take/drop/move, but the slower one improved to 2.8ms and a smaller 38/42/43 take/drop/move count.
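
As a rough cross-check, the approximate per-command instruction counts quoted above can be turned into an estimated Intcode cost for each set of take/drop/move counts. This is a back-of-the-envelope sketch: `estimated_cost` is a hypothetical helper, and the constants are the approximate figures from the description, ignoring their variability.

```c
#include <assert.h>

/* Approximate Intcode instructions per command, from the estimates above. */
enum { MOVE_COST = 8500, DROP_COST = 1500, TAKE_COST = 2000 };

/* Estimated instructions spent on the pressure-plate phase only. */
static long estimated_cost(int takes, int drops, int moves)
{
    return (long)moves * MOVE_COST + (long)drops * DROP_COST
         + (long)takes * TAKE_COST;
}
```

Under these assumptions, 69/73/74 comes to about 876,500 instructions, 38/42/43 to about 504,500, and 3/7/7 to about 76,000. The estimate covers only these commands, so it need not track the wall-clock figures proportionally.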

Type of change

  • Performance improvement
  • Bug fix
  • Other

Checklist

  • Pull request title and commit messages are clear and informative.
  • Documentation has been updated if necessary.
  • Code style matches the existing code. This one is somewhat subjective, but try to "fit in" by
    using the same naming conventions. Code should be portable, avoiding any
    architecture-specific intrinsics.
  • Tests pass: cargo test
  • Code is formatted: cargo fmt -- `find . -name "*.rs"`
  • Code is linted: cargo clippy --all-targets --all-features

Formatting and linting can also be executed by running "just"
(if installed) on the command line at the project root.

ebblake commented May 29, 2026

Contributor Author

Of course, if we're willing to peek inside the black box, we can do even better: it's relatively easy to determine the in-memory location of each item's power-of-two weight, sort the items in descending weight order, and then solve the puzzle with at most 8 move attempts. And if we decide that peeking in memory is kosher, it's not much harder to also poke memory to make inventory changes instant rather than going through the take/drop commands. Slightly harder, but even faster, would be to reverse-engineer the scoring system that converts the sum of weights into the output number to be printed on success.
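
The "at most 8 move attempts" claim can be illustrated with a small greedy sketch in C. Everything here is a stand-in: `plate` simulates the pressure plate rather than running Intcode, the weights are small powers of two chosen for the example, and `greedy_solve` is a hypothetical name. It counts move attempts only, not the take/drop commands in between.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the pressure plate: >0 too heavy, <0 too light, 0 success. */
static int attempts;
static int plate(unsigned held_weight, unsigned target)
{
    attempts++;
    return (held_weight > target) - (held_weight < target);
}

/* qsort comparator: heaviest item first. */
static int by_weight_desc(const void *a, const void *b)
{
    unsigned x = *(const unsigned *)a, y = *(const unsigned *)b;
    return (x < y) - (x > y);
}

/* Try items heaviest first.  With distinct power-of-two weights, an item
 * that makes the total too heavy cannot be part of the answer, so each
 * item needs only one attempt.  Returns the number of move attempts used,
 * or -1 if no subset matches the target. */
static int greedy_solve(unsigned weights[], int n, unsigned target)
{
    unsigned held = 0;
    attempts = 0;
    qsort(weights, (size_t)n, sizeof weights[0], by_weight_desc);
    for (int i = 0; i < n; i++) {
        int r = plate(held + weights[i], target);
        if (r == 0)
            return attempts;
        if (r < 0)
            held += weights[i];   /* still too light: keep this item */
        /* otherwise too heavy: leave this item on the floor */
    }
    return -1;
}
```

For example, with weights 1 through 128 and a target of 165 the sketch needs all 8 attempts, while a target of 240 is found after 4.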

ebblake commented May 29, 2026

Contributor Author

For reference (as much my own as anyone else's), here's a C benchmark program I wrote to compare various strategies. The pre-patch strategy most closely aligns to local_learning_gray (average estimated cost of 875k intcode instructions across all 8! permutations of initial inventory ordering and all 70 8-choose-4 targets, at the point starting where all 8 items are in inventory before the first move onto the pressure plate), and the post-patch strategy is global_learning_virtual_gray (estimated 152k intcode instructions). I also checked askalski's solution which I benchmarked as deduction; it focuses on fewer moves, but ends up with more take/drop inventory churn and a higher average estimated cost (325k intcode instructions, but smaller standard deviation).

#include <stdio.h>
#include <unistd.h>
#include <assert.h>
#include <strings.h>
#include <stdlib.h>
#include <stdint.h>
#include <inttypes.h>
#include <stdbool.h>

static int moves;
static int drops;
static int takes;
static int weights[8];
static int target;
typedef void (*strategy)(int);
static strategy visit;
#define HSIZE 512
static int histogram[HSIZE][HSIZE][HSIZE]; /* moves, drops, takes */

static int
move(int mask) {
  moves++;
  int value = 0;
  for (int i = 0; i < 8; i++) {
    if (mask & (1<<i))
      value |= 1<<weights[i];
  }
  if (value > target)
    return 1;
  if (value < target)
    return -1;
  return 0;
}

static int
take(int mask, int pos)
{
  assert(!(mask & (1<<pos)));
  takes++;
  return mask | (1<<pos);
}

static int
drop(int mask, int pos)
{
  assert(mask & (1<<pos));
  drops++;
  return mask & ~(1<<pos);
}

static int
score(int moves, int drops, int takes)
{
  return moves*8500 + drops*1500 + takes*2000;
}

static int
gray_code(int i)
{
  return i ^ (i >> 1);
}

static int
index_lsb(int i)
{
  return ffs(i)-1;
}

/* average 1343507 for 4-items (12 seconds), 1314337 for all targets (45s) */
__attribute__((unused))
static void
blind_gray(int mask)
{
  for (int i = 1; i<255; i++) {
    int curr = gray_code(i);
    int prev = gray_code(i-1);
    int changed = curr ^ prev;
    int index = index_lsb(changed);
    if ((curr & changed) == 0)
      mask = take(mask, index);
    else
      mask = drop(mask, index);
    if (move(mask) == 0)
      break;
  }
}

/* average 875648 for 4-items (9 seconds), 849491 for all targets */
__attribute__((unused))
static void
local_learning_gray(int mask)
{
  /* Indexed by Gray code value, which can be any of 0..255. */
  bool light[256] = {0};
  bool heavy[256] = {0};

  for (int i = 1; i<255; i++) {
    int curr = gray_code(i);
    int prev = gray_code(i-1);
    int changed = curr ^ prev;
    int index = index_lsb(changed);
    if ((curr & changed) == 0) {
      mask = take(mask, index);
      if (heavy[prev]) {
        heavy[curr] = true;
        continue;
      }
    } else {
      mask = drop(mask, index);
      if (light[prev]) {
        light[curr] = true;
        continue;
      }
    }
    switch (move(mask)) {
    case 0: return;
    case 1: heavy[curr] = true; break;
    case -1: light[curr] = true; break;
    default: assert(false);
    }
  }
}

/* average 553857 for 4-items (43s), 549643 for all targets (45s) */
__attribute__((unused))
static void
global_learning_gray(int mask)
{
  uint8_t light[255] = {0};
  uint8_t heavy[255] = {0};
  int light_count = 0;
  int heavy_count = 0;

  for (int i = 1; i<255; i++) {
    int curr = gray_code(i);
    int prev = gray_code(i-1);
    int changed = curr ^ prev;
    int index = index_lsb(changed);
    if ((curr & changed) == 0)
      mask = take(mask, index);
    else
      mask = drop(mask, index);
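    /* curr has a bit set for each DROPPED item, so fewer bits in curr
       means a larger inventory: the outcome is known too heavy if curr
       is a subset of a too-heavy code, and known too light if curr is
       a superset of a too-light code. */
    bool skip = false;
    for (int j = 0; j < heavy_count; j++) {
      if ((curr & heavy[j]) == curr) {
        skip = true;
        break;
      }
    }
    for (int j = 0; !skip && j < light_count; j++) {
      if ((curr & light[j]) == light[j]) {
        skip = true;
        break;
      }
    }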
    if (skip)
      continue;
    switch (move(mask)) {
    case 0: return;
    case 1: heavy[heavy_count++] = curr; break;
    case -1: light[light_count++] = curr; break;
    default: assert(false);
    }
  }
}

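/* Bring the real inventory `have` in line with `want`, counting one
   take or drop per differing item; returns the new inventory.  Callers
   pass the current inventory first and the desired one second. */
static int
sync_inv(int have, int want)
{
  for (int i = 0; i < 8; i++)
    if ((want ^ have) & (1<<i)) {
      if (have & (1 << i))
        have = drop(have, i);
      else
        have = take(have, i);
    }
  return have;
}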

/* average 152232 for 4-items (12s), 162767 for all targets (45s) */
__attribute__((unused))
static void
global_learning_virtual_gray(int mask)
{
  uint8_t light[255] = {0};
  uint8_t heavy[255] = {0};
  int light_count = 0;
  int heavy_count = 0;
  int vmask = mask;

  for (int i = 1; i<255; i++) {
    int curr = gray_code(i);
    int prev = gray_code(i-1);
    int changed = curr ^ prev;
    int index = index_lsb(changed);
    if ((curr & changed) == 0)
      vmask |= (1<<index);
    else
      vmask &= ~(1<<index);
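    /* curr has a bit set for each DROPPED item, so fewer bits in curr
       means a larger inventory: the outcome is known too heavy if curr
       is a subset of a too-heavy code, and known too light if curr is
       a superset of a too-light code. */
    bool skip = false;
    for (int j = 0; j < heavy_count; j++) {
      if ((curr & heavy[j]) == curr) {
        skip = true;
        break;
      }
    }
    for (int j = 0; !skip && j < light_count; j++) {
      if ((curr & light[j]) == light[j]) {
        skip = true;
        break;
      }
    }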
    if (skip)
      continue;
    mask = sync_inv(mask, vmask);
    switch (move(mask)) {
    case 0: return;
    case 1: heavy[heavy_count++] = curr; break;
    case -1: light[light_count++] = curr; break;
    default: assert(false);
    }
  }
}

static int
check(int *have, int want, int keep, int weights[])
{
  want |= keep;
  int m = weights[want];
  if (!m) {
    *have = sync_inv(*have, want);
    m = move(want);
    if (m == 1) {
      for (int i = 0; i < 8; i++) {
        if ((1<<i) & (want ^ 255)) {
          weights[want | (1<<i)] = 1;
        }
      }
    }
  }
  return m;
}

/* average 324954 for 4-items (4.5s), 286601 for all targets (13s) */
__attribute__((unused))
static void
deduction(int mask)
{
  int unknown = 255;
  int keep = 0;
  int weights[256] = { 0 };
  do {
    for (int i = 0; i < 8; i++) {
      if ((1<<i) & unknown) {
        int c = check(&mask, 1 << i, keep, weights);
        if (!c) return;
        if (c < 0) continue;
        unknown ^= 1 << i;
      }
    }
    for (int i = 0; i < 8; i++) {
      if ((1<<i) & unknown) {
        int c = check(&mask, unknown ^ (1<<i), keep, weights);
        if (!c) return;
        if (c > 0) continue;
        unknown ^= 1 << i;
        keep |= 1 << i;
      }
    }
  } while (check(&mask, unknown, keep, weights) != 0);
}

static void
doit(int *p)
{
  assert(p == weights);
  //  printf("Trying %d%d%d%d%d%d%d%d\n", p[0],p[1],p[2],p[3],p[4],p[5],p[6],p[7]);
  moves = 0;
  drops = 0;
  takes = 0;
  visit(255);
  if (moves>HSIZE-1) moves=HSIZE-1;
  if (drops>HSIZE-1) drops=HSIZE-1;
  if (takes>HSIZE-1) takes=HSIZE-1;
  histogram[moves][drops][takes]++;
  printf("Score %d\n", score(moves, drops, takes));
}

static void
permute(int *p, int size, int n)
{
  int i, t;
  if (size == 1) {
    doit(p);
    return;
  }
  permute(p, size - 1, n);
  for (i = 0; i < size - 1; i++) {
    if (size & 1) {
      t = p[0];
      p[0] = p[size - 1];
      p[size - 1] = t;
    } else {
      t = p[i];
      p[i] = p[size - 1];
      p[size - 1] = t;
    }
    permute(p, size - 1, n);
  }
}

int
main(int argc, char **argv) {
  strategy strategies[] = {
    blind_gray,
    local_learning_gray,
    global_learning_gray,
    global_learning_virtual_gray,
    deduction,
  };
  int lo = 1;
  int hi = 255;
  int filter = argc > 2 ? atoi(argv[2]) : 1;
  int index = argc > 1 ? atoi(argv[1]) : 0;
  if (index < 0 || index >= (int)(sizeof strategies / sizeof strategies[0]))
    return 1;
  visit = strategies[index];
  if (argc > 3)
    target = lo = hi = atoi(argv[3]);
  for (target = lo; target <= hi; target++) {
    if (filter && __builtin_popcount(target) != 4)
      continue;
    for (int i = 0; i < 8; i++)
      weights[i] = i;
    permute(weights, 8, 8);
  }

  uint64_t total_score = 0;
  int total_runs = 0;
  for (int m = 0; m < HSIZE; m++)
    for (int d = 0; d < HSIZE; d++)
      for (int t = 0; t < HSIZE; t++) {
        total_score += histogram[m][d][t] * (uint64_t) score(m, d, t);
        total_runs += histogram[m][d][t];
      }
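  if (total_runs == 0) {
    /* Avoid dividing by zero when the filter rejected every target. */
    fprintf(stderr, "No runs matched the requested target and filter\n");
    return 1;
  }
  printf("Overall average: %"PRIu64"\n", total_score / total_runs);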

  return 0;
}

AI suggests it might also be possible to write a directed search, rather than using Gray codes to follow an entire Hamiltonian path, while still skipping VM changes for the portion of the walk where the global superset/subset filtering is applicable. A directed search would start by immediately dropping 4 items, and then assign a probability to all 8 items (it suggested using an integer value 0-1024, initially 512 for 50%). Each time a set is too heavy, items in the set are penalized (reduce their probabilities by a percentage, say 25%); each time a set is too light, items on the floor are boosted (increase their probabilities by a percentage). After a set is tried and probabilities are updated, the code then computes all 8 possible Gray alterations to the set, tosses any such set whose outcome is already known, and then proceeds to try the remaining set with highest cumulative probability attached to each member of the set, while also trying to favor moves that reach another set of 4 (since it appears that all valid input files have a 4-out-of-8 target). This potential search algorithm can probably converge more quickly than a Gray code walk, but is more extensive to code, so it may add more overhead than what it can save in avoided Intcode commands. I do hope to give it a try, though.
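
The probability bookkeeping for such a directed search might look like the following C sketch. It is only an illustration of the update rule described above, with several assumptions: the 25% figure is used for both the penalty and the boost, the boost is capped at 1024, and `init_probs`, `update_probs`, and `set_score` are hypothetical names. The candidate generation and filtering steps are not shown.

```c
#include <assert.h>

enum { ITEMS = 8, PROB_MAX = 1024 };

/* Start every item at 50%. */
static void init_probs(int prob[ITEMS])
{
    for (int i = 0; i < ITEMS; i++)
        prob[i] = PROB_MAX / 2;
}

/* outcome > 0: the held set was too heavy, so penalize held items by 25%.
 * outcome < 0: the held set was too light, so boost items on the floor
 * by 25% (assumed percentage), capped at PROB_MAX. */
static void update_probs(int prob[ITEMS], unsigned held, int outcome)
{
    for (int i = 0; i < ITEMS; i++) {
        int in_set = (held >> i) & 1;
        if (outcome > 0 && in_set) {
            prob[i] -= prob[i] / 4;
        } else if (outcome < 0 && !in_set) {
            prob[i] += prob[i] / 4;
            if (prob[i] > PROB_MAX)
                prob[i] = PROB_MAX;
        }
    }
}

/* Cumulative probability of a candidate set; higher would be tried first. */
static int set_score(const int prob[ITEMS], unsigned held)
{
    int total = 0;
    for (int i = 0; i < ITEMS; i++)
        if ((held >> i) & 1)
            total += prob[i];
    return total;
}
```

A search loop would call `update_probs` after each move attempt and then rank the not-yet-settled neighbouring sets by `set_score`.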

ebblake commented Jun 7, 2026

Contributor Author

> This potential search algorithm can probably converge more quickly than a Gray code walk, but is more extensive to code, so it may add more overhead than what it can save in avoided Intcode commands. I do hope to give it a try, though.

I've since tried a probabilistic search, and it added too much complexity and potential for chasing dead ends to be worth it. The deterministic Gray code walk is good enough, especially when coupled with smarter subset learning and minimized inventory changes.
