Commit 6eeb195

oshaughn and claude committed
calmarg demo: raise ILE request_memory 8192->16384 (extrinsic sampler spikes)
8192 still held the wide ILE jobs ("over cgroup memory limit of 8192"). The cal draws do NOT double here (they stay at 100); the memory driver is the AV extrinsic sampler spinning toward --n-max 4e6 on pathological low-cal-n_eff points, accumulating sample arrays past 8 GB. Completers peak ~7.3 GB; 16384 (2.2x) covers the hard-point spikes and still matches most GPU nodes (median ~27 GB RAM). ILE_extr auto-gets 2x.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
1 parent 0fa697e commit 6eeb195

1 file changed

Lines changed: 8 additions & 6 deletions

File tree

  • MonteCarloMarginalizeCode/Code/demo/rift/calmarg

MonteCarloMarginalizeCode/Code/demo/rift/calmarg/Makefile

@@ -541,12 +541,14 @@ endif
 # OVERRIDES the CLI, which previously left ILE stuck at 4M while cip/general got bumped.
 PP_DISK_FLAGS := --internal-ile-request-disk $(PP_DISK) --internal-cip-request-disk $(PP_DISK) --internal-general-request-disk $(PP_DISK)
 
-# ILE memory request (Mb). pseudo_pipe defaults to 4096, which is too tight for the
-# FUSED calmarg precompute: it holds N cal realizations and the adaptive draw count
-# DOUBLES (NCAL 100 -> up to 800) per intrinsic point, so peak RSS spikes past 4 GB and
-# the job is held ("memory usage exceeded request_memory"). 8192 matches the historical
-# standard (~2.4x the observed ~3.4 GB peak).
-PP_MEM_ILE ?= 8192
+# ILE memory request (Mb). pseudo_pipe defaults to 4096, far too tight here. Each job
+# does 50 intrinsic points with the fused calmarg + distance-marg extrinsic sampler; the
+# well-behaved points finish at ~7.3 GB, but pathological low-cal-n_eff points spin the AV
+# sampler toward --n-max 4e6 and accumulate sample arrays past 8 GB -> held ("over cgroup
+# memory limit"). Observed: 4096 and 8192 both held; completers peak ~7.3 GB. 16384 (2.2x)
+# covers the hard-point spikes and still matches most GPU nodes (median ~27 GB RAM). Raise
+# to 24576 if any still hold; ILE_extr auto-gets 2x this.
+PP_MEM_ILE ?= 16384
 PP_MEM_FLAGS := --internal-ile-request-memory $(PP_MEM_ILE)
 
 .PHONY: pp-run pp-run-build pp-coinc
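
The headroom figures quoted in the commit message and the Makefile comment can be checked with a few lines of Python. This is only an arithmetic sketch: it assumes the memory request is counted in MiB and that the quoted "GB" peaks are GiB, and the helper name `headroom` is hypothetical, not part of RIFT or this Makefile.

```python
# Arithmetic check of the memory headroom quoted in the commit.
# Assumption: requests are in MiB and the quoted "GB" peaks are GiB.
MB_PER_GB = 1024


def headroom(request_mb: int, peak_gb: float) -> float:
    """Return the ratio of the memory request to the observed peak."""
    return request_mb / MB_PER_GB / peak_gb


PEAK_GB = 7.3  # peak quoted for jobs that completed

for request_mb in (8192, 16384, 24576):
    print(f"{request_mb} MB -> {headroom(request_mb, PEAK_GB):.2f}x of peak")
# 8192 MB -> 1.10x of peak
# 16384 MB -> 2.19x of peak
# 24576 MB -> 3.29x of peak

# The commit says ILE_extr automatically gets twice the ILE request.
ile_extr_mb = 2 * 16384
print(ile_extr_mb)  # 32768
```

The old 8192 request left only about 10% margin over the completers' peak, which is consistent with the hard points being held, while 16384 gives the quoted ~2.2x.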
