
Commit 0327921

hmgaudecker and claude committed
Add n_wage_res_batch_size: splay wage_res shock axis
Wires `RouwenhorstAR1Process.batch_size` through `GridConfig` so the wage-residual stochastic productmap can be split with an inner Python loop. At `n_wage_res_batch_size=1` the per-target Q_and_F intermediate shrinks by a factor of `n_wage_res_gridpoints` (5), bringing the kernel under 80 GB for the ACA-overlay nongroup_nomc_* regimes, where the unsplayed kernel hit 144 GB.

`n_pref_type_batch_size` remains a no-op pending separate splay wiring.

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 77ba8e4 commit 0327921

2 files changed

Lines changed: 9 additions & 0 deletions


src/aca_model/baseline/regimes/_common.py

Lines changed: 1 addition & 0 deletions
@@ -242,6 +242,7 @@ def build_grids(
         rho=_WAGE_RHO,
         sigma=(1.0 - _WAGE_RHO**2) ** 0.5,
         mu=0.0,
+        batch_size=grid_config.n_wage_res_batch_size,
     )
     hcc_persistent = get_hcc_persistent_shock(grid_config=grid_config)
     hcc_transitory = NormalIIDProcess(
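
The wiring in this hunk can be sketched in isolation as follows. This is a minimal illustration, not the repository's code: `GridConfigSketch`, `AR1ProcessSketch`, `build_wage_res_process`, the `n_points` argument, and the value `0.9` for the persistence parameter are all hypothetical stand-ins. Only the `rho`, `sigma`, `mu`, and `batch_size` keyword arguments and the two config field names come from the diff.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GridConfigSketch:
    """Hypothetical stand-in for `GridConfig`; only the fields used here."""

    n_wage_res_gridpoints: int = 5
    n_pref_type_batch_size: int = 0
    n_wage_res_batch_size: int = 0  # 0 = do not split the wage_res axis


@dataclass(frozen=True)
class AR1ProcessSketch:
    """Hypothetical stand-in for the real shock-process class."""

    n_points: int  # assumed argument, not shown in the diff
    rho: float
    sigma: float
    mu: float
    batch_size: int = 0


def build_wage_res_process(
    grid_config: GridConfigSketch, wage_rho: float
) -> AR1ProcessSketch:
    """Forward the config's batch size to the process, as the hunk does."""
    return AR1ProcessSketch(
        n_points=grid_config.n_wage_res_gridpoints,
        rho=wage_rho,
        # Innovation s.d. chosen so the stationary variance is 1.
        sigma=(1.0 - wage_rho**2) ** 0.5,
        mu=0.0,
        batch_size=grid_config.n_wage_res_batch_size,
    )


default_cfg = GridConfigSketch()
# A production-style override: split the axis one gridpoint at a time.
production_cfg = replace(default_cfg, n_wage_res_batch_size=1)

default_process = build_wage_res_process(default_cfg, wage_rho=0.9)
production_process = build_wage_res_process(production_cfg, wage_rho=0.9)
```

Because the default stays at `0`, existing callers that never set the field keep the unsplit behaviour; only configurations that opt in pay for the inner loop.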

src/aca_model/config.py

Lines changed: 8 additions & 0 deletions
@@ -42,6 +42,14 @@ class GridConfig:
     # all pref-types. Defaults to `0` — the production overrides set it
     # to `1` on hardware where the unsplayed kernel doesn't fit.
     n_pref_type_batch_size: int = 0
+    # `batch_size` on the `wage_res` stochastic shock process: chunked
+    # productmap stride along the wage-residual stoch axis inside Q_and_F.
+    # `1` shrinks the per-target Q intermediate by `n_wage_res_gridpoints`
+    # at the cost of an inner Python loop; `0` lets the productmap span
+    # the full axis. Defaults to `0` — production overrides set it to `1`
+    # on hardware where the ACA-overlay per-cell DAG blows the kernel's
+    # compile-time working set past device HBM.
+    n_wage_res_batch_size: int = 0
     # Per-device chunk size for the simulate-side per-subject dispatch,
     # keyed by `log_level`. Empty → 0 (no chunking) for every level.
     # `log_level="off"` skips `validate_V` and its forced host-sync, which
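
The trade-off described in the new comment (smaller intermediate, extra Python loop) can be illustrated with a toy, pure-Python sketch. It assumes the convention stated in the comment, that `0` means "span the full axis"; `chunk_slices`, `expected_value`, and the toy numbers are hypothetical and do not reproduce the library's productmap.

```python
def chunk_slices(n_points: int, batch_size: int) -> list[slice]:
    """Split an axis of length n_points into slices of at most batch_size.

    Assumed convention: batch_size == 0 means one slice over the whole axis.
    """
    if batch_size < 0:
        raise ValueError("batch_size must be non-negative")
    if n_points == 0:
        return []
    step = n_points if batch_size == 0 else batch_size
    return [
        slice(start, min(start + step, n_points))
        for start in range(0, n_points, step)
    ]


def expected_value(
    values: list[float], weights: list[float], batch_size: int
) -> tuple[float, int]:
    """Weighted sum over the shock axis, evaluated chunk by chunk.

    Returns the sum and the largest number of shock nodes held at once,
    a stand-in for the size of the per-target intermediate.
    """
    total = 0.0
    peak = 0
    for sl in chunk_slices(len(values), batch_size):
        # Only this chunk's products exist at any one time.
        block = [v * w for v, w in zip(values[sl], weights[sl])]
        peak = max(peak, len(block))
        total += sum(block)
    return total, peak


values = [1.0, 2.0, 3.0, 4.0, 5.0]   # toy values at 5 wage_res nodes
weights = [0.1, 0.2, 0.4, 0.2, 0.1]  # toy probabilities over those nodes

full, peak_full = expected_value(values, weights, batch_size=0)
split, peak_split = expected_value(values, weights, batch_size=1)
```

With five nodes, `batch_size=0` holds all five products at once while `batch_size=1` holds one, and both calls agree on the sum up to floating-point rounding. The saving on a real device depends on how much of the kernel's working set the split intermediate accounts for.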
