Add target_batch_size to to_dataframe for sharded target eval #389

Open
hmgaudecker wants to merge 1 commit into main from feat/target-batch-eviction

@hmgaudecker


Adds a target_batch_size parameter to SimulationResult.to_dataframe. It chunks the additional_targets evaluation, evicting each chunk to host before the next one is evaluated, independently of the subject_batch_size used for the simulation.

Why

subject_batch_size > 0 cannot be combined with distributed (sharded) grids: the value-function array is sharded across devices and can't be gathered onto one, so pylcm rejects the combination. With a sharded grid, however, the post-simulate additional_targets eval still pulls the (host-resident, per-shard) panel back onto the mesh's device 0 and evaluates the target DAG over the whole population in one pass, which can exhaust that device's memory. This knob bounds the device residency of that eval without touching the sharded solve.

target_batch_size=None (default) falls back to the simulate's subject_batch_size, so current behavior is unchanged.
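
The chunk-and-evict pattern and the None fallback described above can be sketched roughly as follows. This is not pylcm's implementation: `resolve_batch_size`, `evaluate_in_chunks`, `target_fn`, and the dict-of-arrays panel are hypothetical names chosen for illustration, NumPy stands in for device arrays, and treating a batch size of 0 as "single pass" is an assumption based on the description of subject_batch_size.

```python
import numpy as np


def resolve_batch_size(target_batch_size, subject_batch_size):
    """Hypothetical helper: None falls back to the simulation's batch size."""
    return subject_batch_size if target_batch_size is None else target_batch_size


def evaluate_in_chunks(target_fn, panel, batch_size):
    """Evaluate target_fn over the panel in chunks of at most batch_size rows.

    panel is a dict of equal-length 1-D arrays (an assumed layout).
    batch_size <= 0 is assumed to mean "evaluate everything in one pass".
    """
    n_rows = len(next(iter(panel.values())))
    if batch_size <= 0 or batch_size >= n_rows:
        return np.asarray(target_fn(panel))

    host_pieces = []
    for start in range(0, n_rows, batch_size):
        stop = start + batch_size  # slicing clips the last, shorter chunk
        chunk = {name: values[start:stop] for name, values in panel.items()}
        # np.asarray stands in for the device-to-host copy; only one chunk's
        # result has to live on the device at a time.
        host_pieces.append(np.asarray(target_fn(chunk)))
    return np.concatenate(host_pieces)


# Illustrative data: seven subjects, one computed target.
panel = {"wealth": np.arange(7.0), "consumption": np.ones(7)}
savings = evaluate_in_chunks(
    lambda p: p["wealth"] - p["consumption"],
    panel,
    resolve_batch_size(3, 0),
)
```

Because the target function here is applied row by row, concatenating the per-chunk results reproduces the single-pass result exactly; that property is what makes chunking safe for this kind of evaluation.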

Test

tests/simulation/test_subject_batching.py::test_to_dataframe_targets_are_invariant_to_target_batch_size — simulates single-pass, then chunks only the target eval at target_batch_size ∈ {2, 3, 100} (even split, uneven, chunk-larger-than-population), asserting the computed-target column is identical to the single-pass result.
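
To see why those three batch sizes cover distinct cases, here is a small sketch of the chunk boundaries they produce. The helper `chunk_bounds` and the population of 4 subjects are assumptions made for illustration; the actual population size used in the test is not stated here.

```python
def chunk_bounds(n_subjects, batch_size):
    """Return (start, stop) index pairs covering range(n_subjects).

    Hypothetical helper, not part of pylcm. batch_size must be positive.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [
        (start, min(start + batch_size, n_subjects))
        for start in range(0, n_subjects, batch_size)
    ]


n_subjects = 4  # assumed population size
for batch_size in (2, 3, 100):
    print(batch_size, chunk_bounds(n_subjects, batch_size))
# 2   -> [(0, 2), (2, 4)]  even split
# 3   -> [(0, 3), (3, 4)]  uneven: the last chunk is shorter
# 100 -> [(0, 4)]          chunk larger than the population: one pass
```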

🤖 Generated with Claude Code

The additional_targets DAG in to_dataframe materializes the full in-regime
panel on one device. target_batch_size chunks that evaluation and pulls each
chunk to host before the next, bounding the fused-DAG device workspace
independently of the simulate's subject_batch_size — so the target eval can
chunk even when subject_batch_size must stay 0 (a distributed/sharded grid).
Defaults to the simulate's subject_batch_size; values are identical to the
single-pass evaluation.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@github-actions


Benchmark comparison (main → HEAD)

Comparing dc591241 (main) → 2e42d0fc (HEAD)

| Benchmark | Statistic | before | after | Ratio | Alert |
|---|---|---|---|---|---|
| aca-baseline | execution time | 15.217 s | 14.119 s | 0.93 | |
| | peak GPU mem | 581 MB | 581 MB | 1.00 | |
| | compilation time | 338.71 s | 350.44 s | 1.03 | |
| | peak CPU mem | 6.80 GB | 6.93 GB | 1.02 | |
| aca-baseline-debug | execution time | 80.285 s | 78.850 s | 0.98 | |
| | peak GPU mem | 581 MB | 581 MB | 1.00 | |
| | compilation time | 450.65 s | 460.87 s | 1.02 | |
| | peak CPU mem | 7.89 GB | 7.93 GB | 1.01 | |
| Mahler-Yum | execution time | 4.487 s | 4.499 s | 1.00 | |
| | peak GPU mem | 520 MB | 520 MB | 1.00 | |
| | compilation time | 11.22 s | 11.44 s | 1.02 | |
| | peak CPU mem | 1.58 GB | 1.59 GB | 1.01 | |
| Precautionary Savings - Solve | execution time | 24.9 ms | 24.4 ms | 0.98 | |
| | peak GPU mem | 8 MB | 8 MB | 1.00 | |
| | compilation time | 1.59 s | 1.57 s | 0.99 | |
| | peak CPU mem | 1.16 GB | 1.16 GB | 1.00 | |
| Precautionary Savings - Simulate | execution time | 62.4 ms | 63.9 ms | 1.03 | |
| | peak GPU mem | 157 MB | 157 MB | 1.00 | |
| | compilation time | 3.61 s | 3.53 s | 0.98 | |
| | peak CPU mem | 1.33 GB | 1.32 GB | 0.99 | |
| Precautionary Savings - Solve & Simulate | execution time | 101.4 ms | 88.2 ms | 0.87 | |
| | peak GPU mem | 566 MB | 566 MB | 1.00 | |
| | compilation time | 4.78 s | 5.02 s | 1.05 | |
| | peak CPU mem | 1.31 GB | 1.31 GB | 1.00 | |
| Precautionary Savings - Solve & Simulate (irreg) | execution time | 202.7 ms | 202.8 ms | 1.00 | |
| | peak GPU mem | 2.18 GB | 2.18 GB | 1.00 | |
| | compilation time | 5.03 s | 5.18 s | 1.03 | |
| | peak CPU mem | 1.36 GB | 1.36 GB | 1.00 | |
| IskhakovEtAl2017Simulate | execution time | 194.1 ms | 192.0 ms | 0.99 | |
| | compilation time | 4.19 s | 4.23 s | 1.01 | |
| | peak CPU mem | 1.30 GB | 1.29 GB | 1.00 | |
| IskhakovEtAl2017Solve | execution time | 44.3 ms | 44.3 ms | 1.00 | |
| | compilation time | 0.66 s | 0.72 s | 1.08 | |
| | peak CPU mem | 1.15 GB | 1.15 GB | 1.00 | |
| IskhakovEtAl2017SimulateGpuPeakMem | peak GPU mem | 281 MB | 281 MB | 1.00 | |
| IskhakovEtAl2017SolveGpuPeakMem | peak GPU mem | 67 MB | 67 MB | 1.00 | |
