refactor: propagate shared config via before validator#2291
Open
mikasenghaas wants to merge 6 commits into main
58da201 to df61dec
Move model name/VLM propagation from auto_setup_model (after validator) to a new _propagate_shared_model (before validator). This ensures sub-config validators see the final model name when they run, which is critical for features like parser auto-resolution that depend on the model name at config construction time. The after validator auto_setup_model is simplified to only validate. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
df61dec to 1f66066
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move output_dir, ckpt, wandb, tokenizer, and seq_len propagation from after-validators into the before-validator auto_setup_shared_configs, using uniform fill-if-absent semantics. Sub-config values now always take precedence; mismatches surface via the merged validate_shared_configs after-validator instead of being silently clobbered. Session header setup moves onto OrchestratorConfig since it only touches orchestrator state. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
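The fill-if-absent semantics described in this commit can be sketched on plain dicts. This is a minimal illustration, not the PR's code: `fill_if_absent`, `propagate_shared`, and `find_mismatches` are hypothetical helper names, and the real validators operate on `RLConfig` input rather than bare dicts.

```python
from copy import deepcopy


def fill_if_absent(sub, key, value):
    """Write value under key only when the sub-config has no entry for it."""
    if key not in sub:
        sub[key] = value


def propagate_shared(data, shared_keys, sub_names):
    """Copy top-level shared values into sub-configs without overwriting them."""
    data = deepcopy(data)  # never mutate the caller's input
    for key in shared_keys:
        value = data.get(key)
        if value is None:  # nothing to propagate for this key
            continue
        for name in sub_names:
            fill_if_absent(data.setdefault(name, {}), key, value)
    return data


def find_mismatches(data, shared_keys, sub_names):
    """Report sub-config values that disagree with the shared value."""
    mismatches = []
    for key in shared_keys:
        value = data.get(key)
        if value is None:
            continue
        for name in sub_names:
            sub_value = data.get(name, {}).get(key)
            if sub_value != value:
                mismatches.append((name, key, sub_value, value))
    return mismatches


raw = {"seq_len": 2048, "trainer": {"seq_len": 4096}, "orchestrator": {}}
filled = propagate_shared(raw, ["seq_len"], ["trainer", "orchestrator"])
mismatches = find_mismatches(filled, ["seq_len"], ["trainer", "orchestrator"])
```

Here the trainer keeps its own `seq_len`, the orchestrator inherits the shared one, and the disagreement is reported afterwards instead of being overwritten.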
… callers
- Extract seq_len consistency check from rl.py into validate_shared_seq_len alongside the other validate_shared_* helpers
- Remove the "skip if value is None" guard from propagate; each caller now explicitly gates on "if val is not None" which makes propagation intent visible at the call site
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit a61e1d4.
When RLConfig is built via `cli(RLConfig, ...)` tyro first realizes sub-configs from the TOML default, then constructs a new RLConfig with CLI overrides. The `mode=before` validator then sees `data["trainer"]` as a TrainerConfig instance rather than a dict, so fill-if-absent silently skips every shared field — `--output-dir` never reached sub-configs, leaving trainer and orchestrator writing to different directories. Dump BaseModel sub-configs with `exclude_defaults=True` before filling, recursively preserving discriminator `type` fields so union variants still resolve after re-validation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
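A minimal sketch of the instance-handling fix, assuming Pydantic v2. `Root` and `Trainer` are hypothetical stand-ins for `RLConfig` and `TrainerConfig`; the recursive handling of discriminator `type` fields mentioned in the commit is not shown.

```python
from pydantic import BaseModel, Field, model_validator


class Trainer(BaseModel):
    output_dir: str = "outputs"
    lr: float = 1e-4


class Root(BaseModel):
    output_dir: str = "outputs"
    trainer: Trainer = Field(default_factory=Trainer)

    @model_validator(mode="before")
    @classmethod
    def fill_shared(cls, data):
        if not isinstance(data, dict):
            return data
        data = dict(data)  # shallow copy so the caller's dict is untouched
        trainer = data.get("trainer", {})
        if isinstance(trainer, BaseModel):
            # An already-built instance has every field set, so dump only the
            # non-default values; defaults must not count as "present".
            trainer = trainer.model_dump(exclude_defaults=True)
        else:
            trainer = dict(trainer)
        if "output_dir" in data:
            trainer.setdefault("output_dir", data["output_dir"])
        data["trainer"] = trainer
        return data


root = Root(output_dir="/tmp/run", trainer=Trainer(lr=3e-4))
kept = Root(output_dir="/tmp/run", trainer=Trainer(output_dir="/tmp/other"))
```

One caveat of this approach: `exclude_defaults=True` also drops a value the user set explicitly if it happens to equal the field default, so such a value is treated as absent and gets filled from the parent.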
mikasenghaas
commented
Apr 20, 2026
data = deepcopy(data)

# tyro may pass already-constructed sub-config instances rather than
@samsja this is the only ugliness in the pr atm which i think we can/should fix at the tyro/pydantic config level. imo we should try to avoid resolving any config more than once

Summary
Reorganize all shared-config propagation on `RLConfig` into a single `mode="before"` validator (`auto_setup_shared_configs`), so sub-config validators see final values at construction time.
- Shared fields covered: `model`, `log`, `ckpt`, `wandb`, `tokenizer`, `seq_len`, `max_steps`, `max_async_level`, `output_dir`
- Collapse the per-field `validate_shared_*` after-validators into one `validate_shared_configs` after-validator
- Remove `auto_setup_tokenizer`, which had to correct state written by sub-config validators running before model propagation
- Move `auto_setup_session_headers` onto `OrchestratorConfig` (it only touches orchestrator state, no cross-sub-config logic)
- Remove `RLConfig.max_model_len`, which was declared as a shared field but never read or propagated anywhere
Why
Pydantic constructs nested models before running parent validators. Under the old design, `auto_setup_model` ran after `ModelConfig` had already been constructed with default values, so any sub-config validator depending on the final shared value (e.g. parser auto-resolution) saw the wrong input. Moving propagation into a `mode="before"` validator lets sub-configs see the final shared values on their first validation pass.
This also unblocks parser auto-resolution at config time (PR #2290).
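The ordering issue can be reproduced in a few lines, assuming Pydantic v2. `Child`, `ParentAfter`, and `ParentBefore` are hypothetical toy models rather than the actual config classes; `seen` records the name the child validator observes when it runs.

```python
from pydantic import BaseModel, model_validator

seen = []  # names observed by the child validator, in order


class Child(BaseModel):
    name: str = "default-model"

    @model_validator(mode="after")
    def record_name(self):
        seen.append(self.name)
        return self


class ParentAfter(BaseModel):
    name: str
    child: Child

    @model_validator(mode="after")
    def propagate(self):
        # Too late: the child was built and validated before this runs.
        self.child.name = self.name
        return self


class ParentBefore(BaseModel):
    name: str
    child: Child

    @model_validator(mode="before")
    @classmethod
    def propagate(cls, data):
        if isinstance(data, dict) and "name" in data:
            data = dict(data)
            child = dict(data.get("child") or {})
            child.setdefault("name", data["name"])  # fill-if-absent
            data["child"] = child
        return data


ParentAfter(name="my-model", child={})
after_saw = list(seen)

seen.clear()
ParentBefore(name="my-model", child={})
before_saw = list(seen)
```

With the after-validator the child validator records `"default-model"`; with the before-validator it records `"my-model"`, which is the behaviour validators like parser auto-resolution rely on.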
🤖 Generated with Claude Code
Note
Medium Risk
Changes Pydantic validation order for `RLConfig` by moving shared-field propagation to a `mode="before"` validator, which can subtly alter how defaults/CLI/TOML merges resolve and what sub-config validators see. Moderate risk of breaking existing config edge cases despite added unit coverage.
Overview
Refactors `RLConfig` shared config propagation to run in a single `mode="before"` validator (`auto_setup_shared_configs`), so `model`/`log`/`ckpt`/`wandb`/`tokenizer`/`seq_len`/`max_steps`/`max_async_level`/`output_dir` are filled into trainer/orchestrator/inference before nested models are constructed (with sub-config values taking precedence).
Collapses multiple per-field after-validators into one `validate_shared_configs` after-validator, extracts the `seq_len` consistency check into `validate_shared_seq_len`, moves `X-Session-ID` header auto-setup onto `OrchestratorConfig`, removes unused `RLConfig.max_model_len`, and adds focused unit tests covering propagation/precedence and CLI merge behavior.
Reviewed by Cursor Bugbot for commit 1077bcc. Bugbot is set up for automated code reviews on this repo.