Add anchor and hyperparameter-init controls to posterior_mean_function #829
Draft
kalama-ai wants to merge 1 commit into
- `anchors`: choose pretrained / new / combined data for the inner GP
- `mean_kernel_init`: freeze, warmstart or discard the inner hyperparameters
- Reject the no-op (new, discard) combo; warn on warmstart with new targets
- Split out `_resolve_anchors` and `_build_inner_gp` as module-level helpers
- Cover the new behaviour in `tests/test_posterior_mean_function.py`
# Add `anchors` and `mean_kernel_init` controls to `posterior_mean_function`

## What this does

The base branch added `GaussianProcessSurrogate.posterior_mean_function`, which turns a trained GP's posterior mean into a mean module you can plug into a second GP. That gave us mean transfer, but with exactly one fixed behavior. This PR opens up that behavior with two keyword arguments, `anchors` and `mean_kernel_init`.

The defaults (`anchors="pretrained"`, `mean_kernel_init="freeze"`) reproduce the existing behavior exactly, so nothing changes unless you opt in.

## The two knobs
`anchors` decides which data the inner GP is conditioned on when computing the transferred mean:

- `"pretrained"`: the source GP's own training data (recovered in raw space). This is the "pure transfer" case.
- `"new"`: the new GP's measurements only.
- `"combined"`: both, concatenated.

`mean_kernel_init` decides what happens to the inner GP's mean/kernel/likelihood once transfer is set up:

- `"freeze"`: deep-copy the pretrained modules and lock them (`requires_grad=False`). The inner mean is a fixed, static prior.
- `"warmstart"`: deep-copy them but leave them trainable, so the outer MLL can keep adjusting them.
- `"discard"`: throw away the pretrained hyperparameters and start the inner modules fresh from the factories, trainable.

Importantly, `discard` resets all three modules (mean, kernel, and likelihood), not just the kernel, because all three feed into the posterior mean.

## Guards and a warning
Not every combination is sound, so the method validates up front instead of letting things fail later:

- `anchors="new"` + `mean_kernel_init="discard"` is rejected with a `ValueError`. It transfers no pretrained information at all, so it would just be a plain GP refit dressed up as transfer learning.
- `mean_kernel_init="warmstart"` with `anchors in {"new", "combined"}` emits a warning. When the inner anchors include the new targets and the inner mean is free to move, the same `y_new` ends up driving both the inner prior mean and the outer marginal likelihood. The flexible inner mean can then interpolate `y_new`, the outer residual collapses, and the MLL drives the outer noise toward zero, which yields overconfident posteriors. `freeze` is always safe here, since a frozen inner mean can't chase `y_new`.
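
For reviewers who want the option handling at a glance, here is a minimal, self-contained sketch of the guards and the anchor selection described above. It is not the code in this PR: `check_transfer_options` and `select_anchor_rows` are hypothetical names, plain Python lists stand in for tensors, and `UserWarning` is an assumed warning category.

```python
import warnings

ANCHOR_MODES = ("pretrained", "new", "combined")
INIT_MODES = ("freeze", "warmstart", "discard")


def check_transfer_options(anchors="pretrained", mean_kernel_init="freeze"):
    """Hypothetical stand-in for the up-front validation (not the PR's code)."""
    if anchors not in ANCHOR_MODES:
        raise ValueError(f"Unknown anchors mode: {anchors!r}")
    if mean_kernel_init not in INIT_MODES:
        raise ValueError(f"Unknown mean_kernel_init mode: {mean_kernel_init!r}")

    # (new, discard) keeps neither the pretrained data nor the pretrained
    # hyperparameters, so nothing would be transferred.
    if anchors == "new" and mean_kernel_init == "discard":
        raise ValueError(
            "anchors='new' with mean_kernel_init='discard' transfers no "
            "pretrained information."
        )

    # A trainable inner mean anchored on the new targets lets y_new drive
    # both the inner prior mean and the outer marginal likelihood.
    if mean_kernel_init == "warmstart" and anchors in ("new", "combined"):
        warnings.warn(
            "warmstart with anchors that include the new targets can lead "
            "to overconfident posteriors; consider mean_kernel_init='freeze'.",
            UserWarning,  # assumed category
            stacklevel=2,
        )


def select_anchor_rows(anchors, pretrained_rows, new_rows):
    """Pick the rows the inner GP would be conditioned on (lists as stand-ins)."""
    if anchors == "pretrained":
        return list(pretrained_rows)
    if anchors == "new":
        return list(new_rows)
    if anchors == "combined":
        return list(pretrained_rows) + list(new_rows)
    raise ValueError(f"Unknown anchors mode: {anchors!r}")
```

With this sketch, `check_transfer_options()` passes silently for the defaults, `check_transfer_options("new", "discard")` raises `ValueError`, and `check_transfer_options("combined", "warmstart")` emits a single warning while still returning normally.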