Add target_batch_size to to_dataframe for sharded target eval #389
Open
hmgaudecker wants to merge 1 commit into
The additional_targets DAG in to_dataframe materializes the full in-regime panel on one device. target_batch_size chunks that evaluation and pulls each chunk to host before the next, bounding the fused-DAG device workspace independently of the simulate call's subject_batch_size, so the target eval can chunk even when subject_batch_size must stay 0 (a distributed/sharded grid). Defaults to the simulate call's subject_batch_size; values are identical to the single-pass evaluation.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
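The chunk-then-pull-to-host pattern described in the commit message can be sketched roughly as follows. This is an illustration only, not pylcm's implementation: `eval_targets_chunked` and `toy_target` are hypothetical names, NumPy stands in for device arrays, and treating a batch size of 0 as "single pass" is an assumption based on the wording above.

```python
import numpy as np


def eval_targets_chunked(panel, target_fn, target_batch_size):
    """Evaluate target_fn over the rows of panel in chunks.

    Hypothetical helper, not part of pylcm. A batch size of 0 (assumed to
    mean "no batching") evaluates the whole panel in one pass.
    """
    n_subjects = panel.shape[0]
    if target_batch_size <= 0 or target_batch_size >= n_subjects:
        return np.asarray(target_fn(panel))

    host_chunks = []
    for start in range(0, n_subjects, target_batch_size):
        chunk = panel[start : start + target_batch_size]
        # With real device arrays, this is where each chunk's result would be
        # copied to host memory before the next chunk is evaluated.
        host_chunks.append(np.asarray(target_fn(chunk)))
    return np.concatenate(host_chunks, axis=0)


def toy_target(panel):
    # Row-wise target: each output depends only on that subject's own row,
    # which is what makes chunking safe.
    return panel[:, 0] * 2.0 + panel[:, 1]


panel = np.arange(8, dtype=np.float64).reshape(4, 2)
single_pass = eval_targets_chunked(panel, toy_target, 0)
chunked = eval_targets_chunked(panel, toy_target, 3)
```

Because the toy target is purely row-wise, `chunked` matches `single_pass` element for element; only the peak size of the array passed to `target_fn` changes.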
This was referenced Jun 19, 2026
Adds a target_batch_size parameter to SimulationResult.to_dataframe, chunking the additional_targets evaluation (and evicting each chunk to host) independently of the simulate call's subject_batch_size.

Why

subject_batch_size > 0 cannot be combined with distributed (sharded) grids: the value-function array is sharded across devices and can't be gathered onto one, so pylcm rejects the combination. But under a shard the post-simulate additional_targets eval still pulls the (host-resident, per-shard) panel back onto the mesh's device 0 and evaluates the target DAG over the whole population in one pass, which can exhaust that device's memory. This knob bounds that eval's device residency without touching the sharded solve. target_batch_size=None (default) falls back to the simulate call's subject_batch_size, so current behavior is unchanged.

Test

tests/simulation/test_subject_batching.py::test_to_dataframe_targets_are_invariant_to_target_batch_size simulates single-pass, then chunks only the target eval at target_batch_size ∈ {2, 3, 100} (even split, uneven, chunk-larger-than-population), asserting the computed-target column is identical to the single-pass result.

🤖 Generated with Claude Code
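To make the default fallback and the three tested chunk shapes concrete, here is a small sketch. Both helpers are hypothetical and are not taken from the PR; the population size of 4 is an assumption chosen so that 2 splits evenly, 3 splits unevenly, and 100 exceeds the population.

```python
def resolve_target_batch_size(target_batch_size, subject_batch_size):
    """Hypothetical helper: None falls back to subject_batch_size."""
    return subject_batch_size if target_batch_size is None else target_batch_size


def chunk_bounds(n_subjects, batch_size):
    """Return (start, stop) pairs covering range(n_subjects).

    A batch size of 0 is assumed to mean a single pass over everyone.
    """
    if batch_size <= 0:
        return [(0, n_subjects)]
    return [
        (start, min(start + batch_size, n_subjects))
        for start in range(0, n_subjects, batch_size)
    ]


n_subjects = 4  # assumed population size, for illustration only
bounds_by_size = {size: chunk_bounds(n_subjects, size) for size in (2, 3, 100)}
```

Under these assumptions, a batch size of 2 gives two full chunks, 3 gives one full chunk plus a remainder of one subject, and 100 collapses to a single pass, which is the set of edge cases the test description names.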