Skip to content

refactor: nibble-pack relstatus arrays to halve their memory footprint#125

Merged
antLI-dev merged 1 commit into
masterfrom
refactor/snapspec-relstatus-nibble-pack
Jun 23, 2026
Merged

refactor: nibble-pack relstatus arrays to halve their memory footprint#125
antLI-dev merged 1 commit into
masterfrom
refactor/snapspec-relstatus-nibble-pack

Conversation

@antLI-dev

@antLI-dev antLI-dev commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Each relstatus row stored one SDC_STS_* value per byte. Since valid values are 1–6 (easily fitting in 4 bits), two values can be packed per byte. sdcRAAllocRow now allocates (row - col0 + 2) / 2 bytes instead of row - col0 + 1 bytes, and initialises each byte with two SDC_STS_SKIP values packed into its nibbles. All reads and writes use new explicitly static inline helpers: nibble_set and nibble_get.

Tested on the NZ national geodetic adjustment (national-geodetic-adjustment repo, combo branch, commit 3cab048, 122k-station network via nga.snp):

                    ngah (NZGD2000)     ngav (NZVD2016)
  Golden (old):     20m21s  6.54 GB    30m23s  8.09 GB
  single-cholesky:  21m01s  3.96 GB    32m13s  5.51 GB
  nibble-pack:      19m52s  3.38 GB    29m02s  4.78 GB

Memory reduction is ~0.6 GB for both runs. Run times are within system noise (range observed across multiple runs: ngah ~19–21 min, ngav ~29–32 min). Output is identical to the Golden baseline for both runs.

The reduction appears similar for ngah and ngav because peak relstatus memory is determined by the single largest order's relstatus footprint — this same memory is reused across all orders, so cumulative allocation across orders does not contribute to peak. The remaining ~1.4 GB gap between ngah and ngav peaks is not attributable to relstatus.

GSR-955

Each relstatus row stored one SDC_STS_* value per byte. Since valid
values are 1–6 (easily fitting in 4 bits), two values can be packed
per byte. sdcRAAllocRow now allocates (row - col0 + 2) / 2 bytes instead
of row - col0 + 1 bytes, and initialises each byte with two SDC_STS_SKIP
values packed into its nibbles. All reads and writes use new explicitly
static inline helpers: nibble_set and nibble_get.

Tested on the NZ national geodetic adjustment (national-geodetic-adjustment
repo, combo branch, commit 3cab048, 122k-station network via nga.snp):

                    ngah (NZGD2000)     ngav (NZVD2016)
  Golden (old):     20m21s  6.54 GB    30m23s  8.09 GB
  single-cholesky:  21m01s  3.96 GB    32m13s  5.51 GB
  nibble-pack:      19m52s  3.38 GB    29m02s  4.78 GB

Memory reduction is ~0.6 GB for both runs. Run times are within system
noise (range observed across multiple runs: ngah ~19–21 min, ngav ~29–32 min).
Output is identical to the Golden baseline for both runs.

The reduction appears similar for ngah and ngav because peak relstatus memory
is determined by the single largest order's relstatus footprint — this same
memory is reused across all orders, so cumulative allocation across orders
does not contribute to peak. The remaining ~1.4 GB gap between ngah and
ngav peaks is not attributable to relstatus.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@antLI-dev antLI-dev requested a review from ccrook June 23, 2026 20:38
@ccrook

ccrook commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Looks good to me... you weren't tempted to pack into two bytes per value (efficiency consideration?)

@antLI-dev

Copy link
Copy Markdown
Contributor Author

Looks good to me... you weren't tempted to pack into two bytes per value (efficiency consideration?)

This PR packs two values per byte. Are you asking why I didn't go for e.g. 3-bit packing and get 8 values per 3 bytes? If so, it's obviously more complicated and I thought it would be nice to allow room for some expansion of values in future (just in case) :)

@antLI-dev antLI-dev merged commit 7534f2e into master Jun 23, 2026
1 of 2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants