Speed
Bucketing: same-shape params are batched into one kernel call, which is 10–20% faster. Launch overhead drops, and operations that support it batch better
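
To make the bucketing idea concrete, here is a pure-Python toy, not the library's implementation. `Param`, `LaunchCounter`, and `scale_kernel` are hypothetical stand-ins for a tensor, a profiler, and a GPU kernel. It only shows that grouping by shape turns one launch per param into one launch per distinct shape.

```python
from collections import defaultdict


class Param:
    """Hypothetical stand-in for a tensor: a shape tuple plus flat data."""

    def __init__(self, shape, data):
        self.shape = shape
        self.data = list(data)


class LaunchCounter:
    """Counts simulated kernel launches."""

    def __init__(self):
        self.count = 0


def scale_kernel(batch, factor, counter):
    """Simulated kernel: a single launch processes every row in `batch`."""
    counter.count += 1
    return [[x * factor for x in row] for row in batch]


def update_unbucketed(params, factor, counter):
    # One launch per param.
    for p in params:
        p.data = scale_kernel([p.data], factor, counter)[0]


def update_bucketed(params, factor, counter):
    # Group params by shape, then issue one launch per bucket.
    buckets = defaultdict(list)
    for p in params:
        buckets[p.shape].append(p)
    for group in buckets.values():
        rows = scale_kernel([p.data for p in group], factor, counter)
        for p, row in zip(group, rows):
            p.data = row


def make_params():
    # Three params of shape (2,) and two of shape (3,).
    return [Param((2,), [1.0, 2.0]) for _ in range(3)] + [
        Param((3,), [1.0, 2.0, 3.0]) for _ in range(2)
    ]


plain, bucketed = LaunchCounter(), LaunchCounter()
params_a, params_b = make_params(), make_params()
update_unbucketed(params_a, 0.5, plain)
update_bucketed(params_b, 0.5, bucketed)
print(plain.count, bucketed.count)
```

Both paths produce the same values. Only the number of launches differs, which is where the reduced overhead comes from.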
New Optimizers
HeavyKLSOAP / HeavyKLShampoo use Moore–Penrose pseudo-inverse instead of eps-regularized inverse on the KL eigenvalues, so collapsed directions don't blow up
HeavyKLSOAP / HeavySOAP / HeavySOAPNAdam / HeavySOAPLaProp / HeavySOAPAdEMAMix / HeavySOLP family tracks the eigenbasis with all moments
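
A minimal sketch of why the pseudo-inverse matters for collapsed directions, working directly on a list of eigenvalues. The function names, the `eps` value, and the relative cutoff `rtol` are assumptions for illustration; the optimizers' actual thresholds are not stated in these notes.

```python
def eps_inverse(eigvals, eps=1e-8):
    """Eps-regularized inverse: a zero eigenvalue maps to about 1/eps."""
    return [1.0 / (lam + eps) for lam in eigvals]


def pseudo_inverse(eigvals, rtol=1e-6):
    """Moore-Penrose pseudo-inverse of a diagonal matrix.

    Eigenvalues at or below the cutoff are treated as zero and map to zero
    instead of being inverted. The cutoff rule (rtol times the largest
    magnitude) is an assumed convention.
    """
    cutoff = rtol * max((abs(lam) for lam in eigvals), default=0.0)
    return [1.0 / lam if abs(lam) > cutoff else 0.0 for lam in eigvals]


eigvals = [2.0, 0.5, 0.0]  # the last direction has collapsed

print(eps_inverse(eigvals))     # last entry is on the order of 1e8
print(pseudo_inverse(eigvals))  # last entry is exactly 0.0
```

With the eps-regularized inverse, any gradient component along the collapsed direction is scaled by roughly `1/eps`. With the pseudo-inverse, that direction is simply dropped.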
Bugfixes
ADOPT no longer runs SGD
SAM no longer recompiles per ball_size
Caution flag no longer leaks across multi_tensor=False params
Breaking
Checkpoints now load under weights_only=True. This might cause issues with pickled custom precond schedules
Unknown kwargs are now errors, not warnings, so typos fail at construction. This will cause issues with custom functions that rely on uncaptured kwargs!
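
For code hit by the strict-kwargs change, one possible migration is to filter a shared config against the constructor signature before passing it on. The sketch below uses a hypothetical `StrictOptimizer` stub rather than a real optimizer class, and `filter_kwargs` is an illustrative helper, not part of the library.

```python
import inspect


class StrictOptimizer:
    """Hypothetical optimizer stub whose constructor rejects unknown kwargs."""

    def __init__(self, params, lr=1e-3, weight_decay=0.0):
        self.params = list(params)
        self.lr = lr
        self.weight_decay = weight_decay


def filter_kwargs(cls, kwargs):
    """Split kwargs into those cls.__init__ accepts and those it does not."""
    accepted = inspect.signature(cls.__init__).parameters
    known = {k: v for k, v in kwargs.items() if k in accepted}
    dropped = sorted(set(kwargs) - set(known))
    return known, dropped


# A typo now fails at construction instead of being silently ignored.
try:
    StrictOptimizer([], weigth_decay=0.1)
except TypeError as err:
    print("rejected:", err)

# A shared config carrying keys meant for other components.
config = {"lr": 0.1, "warmup_steps": 10}
known, dropped = filter_kwargs(StrictOptimizer, config)
opt = StrictOptimizer([], **known)
print(known, dropped)
```

Returning the dropped keys rather than discarding them silently lets the caller log them, so genuine typos stay visible.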