[Feature] Add elementwise_affine argument to LigerRMSNorm#989

Merged
Tcc0403 merged 2 commits into linkedin:main from niyunsheng:elementwise_affine_false_rebase
Dec 24, 2025

@niyunsheng
Contributor

Summary

This PR adds the elementwise_affine argument to LigerRMSNorm to align its API with torch.nn.RMSNorm. This allows users to instantiate the normalization layer without learnable parameters.

Details

  • Added elementwise_affine (defaulting to True) to the __init__ method of LigerRMSNorm.
  • When elementwise_affine is set to False, self.weight is registered as None, preventing the allocation of unnecessary parameters.
  • Updated extra_repr to include the elementwise_affine status in the printed model structure.
  • The parameter is added at the end of the argument list to maintain backward compatibility for subclasses (e.g., LigerRMSNormForGemma) that rely on positional arguments during super().__init__ calls.
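The behavior described in the list above can be illustrated with a small, framework-free sketch. It is not Liger's implementation: `SketchRMSNorm`, `SketchSubclassNorm`, and the shortened argument list are hypothetical stand-ins, and plain Python lists take the place of tensors. In a real `torch.nn.Module`, the "no weight" case would typically be expressed with `self.register_parameter("weight", None)`.

```python
import math


class SketchRMSNorm:
    """Toy stand-in for an RMSNorm layer (hypothetical, not Liger's code)."""

    def __init__(self, hidden_size, eps=1e-6, offset=0.0, elementwise_affine=True):
        self.hidden_size = hidden_size
        self.eps = eps
        self.offset = offset
        # New argument, appended last so existing positional calls keep working.
        self.elementwise_affine = elementwise_affine
        # With elementwise_affine=False no weight is allocated at all.
        self.weight = [1.0] * hidden_size if elementwise_affine else None

    def forward(self, x):
        # Normalize by the root mean square of the input.
        rms = math.sqrt(sum(v * v for v in x) / len(x) + self.eps)
        normed = [v / rms for v in x]
        if self.weight is None:
            return normed
        # Learnable scale is applied only when a weight exists.
        return [n * (self.offset + w) for n, w in zip(normed, self.weight)]

    def extra_repr(self):
        return (
            f"{self.hidden_size}, eps={self.eps}, "
            f"elementwise_affine={self.elementwise_affine}"
        )


class SketchSubclassNorm(SketchRMSNorm):
    """Subclass that forwards its arguments positionally, as some subclasses do."""

    def __init__(self, hidden_size, eps=1e-6):
        # Still valid after the change: the new argument falls back to its default.
        super().__init__(hidden_size, eps, 0.0)
```

Had the new argument been inserted earlier in the signature, the positional `super().__init__` call in the subclass would silently bind a value to the wrong parameter; appending it last avoids that.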

Testing Done

  • Hardware Type: NVIDIA A100-SXM4-80GB
  • run make test to ensure correctness
  • run make checkstyle to ensure code style
  • run make test-convergence to ensure convergence

@niyunsheng niyunsheng changed the title from "add elementwise_affine param for rms norm" to "[Feature] Add elementwise_affine argument to LigerRMSNorm" Dec 24, 2025
@niyunsheng niyunsheng requested a review from Tcc0403 December 24, 2025 14:17
Collaborator

@Tcc0403 Tcc0403 left a comment

LGTM, thank you!

@Tcc0403 Tcc0403 merged commit 87d7057 into linkedin:main Dec 24, 2025
3 of 7 checks passed
@niyunsheng niyunsheng deleted the elementwise_affine_false_rebase branch December 25, 2025 00:42
Tcc0403 pushed a commit that referenced this pull request Dec 25, 2025
…atched models (#990)

## Summary

Fixes an `AttributeError` raised by `LigerRMSNorm.extra_repr` when
models are patched in place (e.g., using
`apply_liger_kernel_to_model`).

In my previous PR (#989), I added `elementwise_affine` to `extra_repr`
to improve layer visibility.
However, when layers are replaced via monkey patching, the
`LigerRMSNorm` constructor is typically skipped, leaving the instance
without the `elementwise_affine` attribute.


## Testing Done

I verified the fix locally. The issue is reproducible via
`test/transformers/test_monkey_patch.py`. After applying this fix,
`test/transformers/test_monkey_patch.py` passes successfully.

- Hardware Type: NVIDIA A100-SXM4-80GB
- [x] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [x] run `make test-convergence` to ensure convergence