Enable SwiGLU patching for Qwen3-VL by dongchany · Pull Request #1175 · linkedin/Liger-Kernel

dongchany · 2026-03-31T16:07:05Z

Summary

Fix apply_liger_kernel_to_qwen3_vl(..., swiglu=True) so it is no longer a no-op.

Changes

patch transformers.models.qwen3_vl.modeling_qwen3_vl.Qwen3VLTextMLP to LigerSwiGLUMLP
patch instantiated decoder_layer.mlp modules for existing Qwen3-VL model instances
enable swiglu=True by default for apply_liger_kernel_to_qwen3_vl
add monkey-patch tests covering both default instance patching and explicit swiglu=True
document Qwen3-VL support in the README table
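
The first two items describe two distinct patching levels: swapping the class that future models will instantiate, and rebinding `forward` on MLP modules that already exist. A minimal sketch of that pattern follows. It uses stand-in classes only; `StockMLP`, `FusedMLP`, `DecoderLayer`, `modeling`, and `apply_patch` are hypothetical names for illustration and are not the Liger-Kernel or transformers API.

```python
import types


class StockMLP:
    """Stand-in for the stock MLP class (hypothetical, not Qwen3VLTextMLP)."""

    def forward(self, x):
        return ("stock", x)


class FusedMLP:
    """Stand-in for a fused replacement (hypothetical, not LigerSwiGLUMLP)."""

    def forward(self, x):
        return ("fused", x)


class DecoderLayer:
    """Stand-in decoder layer that owns one MLP instance."""

    def __init__(self, mlp_cls):
        self.mlp = mlp_cls()


# Namespace standing in for a modeling module whose MLP attribute gets swapped.
modeling = types.SimpleNamespace(MLP=StockMLP)


def apply_patch(layers=None, swiglu=True):
    """Patch the class used for new layers and the forward of existing ones."""
    if not swiglu:
        return
    # Class-level patch: layers built after this call use FusedMLP.
    modeling.MLP = FusedMLP
    # Instance-level patch: rebind forward on MLPs that were already created.
    for layer in layers or []:
        layer.mlp.forward = types.MethodType(FusedMLP.forward, layer.mlp)


existing_layers = [DecoderLayer(modeling.MLP) for _ in range(2)]
apply_patch(existing_layers)
new_layer = DecoderLayer(modeling.MLP)
```

In this sketch the existing MLP objects keep their original type and only their `forward` changes, while layers created afterwards are full `FusedMLP` instances.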

Validation

python -m pytest -q test/transformers/test_monkey_patch.py -k 'qwen3_vl and not moe'
python -m pytest -q test/convergence/bf16/test_mini_models.py -k 'mini_qwen3_vl and test_mini_model'
python -m ruff check src/liger_kernel/transformers/monkey_patch.py test/transformers/test_monkey_patch.py README.md

Copilot

Pull request overview

This PR fixes Qwen3-VL SwiGLU patching so apply_liger_kernel_to_qwen3_vl(..., swiglu=True) is effective (including for already-instantiated model instances), adds regression tests, and documents Qwen3-VL support.

Changes:

Enable SwiGLU patching for Qwen3-VL by default and patch Qwen3VLTextMLP / existing decoder_layer.mlp instances.
Add/extend monkey-patch tests to assert MLP forward patching behavior for Qwen3-VL.
Update the README supported-models table to include Qwen3-VL.
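
A regression test for this kind of fix typically asserts that `forward` is unpatched before the call and patched after it. The sketch below shows one way to express that check with stand-in classes; `StockMLP`, `FusedMLP`, `patch_instance`, and `uses_fused_forward` are hypothetical names, and the actual tests in `test/transformers/test_monkey_patch.py` may compare forwards differently.

```python
import types


class StockMLP:
    """Stand-in for the unpatched MLP (hypothetical)."""

    def forward(self, x):
        return x


class FusedMLP:
    """Stand-in for the fused MLP (hypothetical)."""

    def forward(self, x):
        return x


def patch_instance(mlp):
    """Rebind forward on an existing MLP instance to the fused implementation."""
    mlp.forward = types.MethodType(FusedMLP.forward, mlp)


def uses_fused_forward(mlp):
    """Return True if mlp.forward is bound to FusedMLP.forward."""
    # Bound methods expose the underlying function as __func__.
    return getattr(mlp.forward, "__func__", None) is FusedMLP.forward


def test_forward_is_patched():
    mlp = StockMLP()
    assert not uses_fused_forward(mlp)  # no-op guard: must start unpatched
    patch_instance(mlp)
    assert uses_fused_forward(mlp)


test_forward_is_patched()
```

Asserting the "before" state as well as the "after" state is what catches a patch that silently does nothing.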

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `src/liger_kernel/transformers/monkey_patch.py` | Implements Qwen3-VL SwiGLU patching (module class + existing instances) and flips the default `swiglu` behavior. |
| `test/transformers/test_monkey_patch.py` | Adds assertions and a new test covering Qwen3-VL MLP patching via the `swiglu` flag. |
| `README.md` | Adds Qwen3-VL to the support matrix and adjusts Qwen-family rows. |

Comments suppressed due to low confidence (1)

src/liger_kernel/transformers/monkey_patch.py:1799

The apply_liger_kernel_to_qwen3_vl docstring’s Args: section doesn’t document the rope parameter even though it is part of the public signature (and is patched inside the function). Please add a rope (bool): ... entry for clarity and to keep it consistent with the other apply_liger_kernel_to_* docstrings in this module.

    """
    Apply Liger kernels to replace original implementation in HuggingFace Qwen3-VL models.

    Args:
        cross_entropy (bool): Whether to apply Liger's cross entropy loss. Default is False.
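
A gap like this can be caught mechanically by comparing a function's signature against its `Args:` section. The sketch below is one possible check, not something the PR includes; `undocumented_params` and `example_patch_fn` are hypothetical names, and the example signature is a simplified stand-in rather than the real `apply_liger_kernel_to_qwen3_vl` signature.

```python
import inspect
import re


def undocumented_params(func):
    """Return parameter names that have no 'name (type):' entry in the docstring."""
    doc = inspect.getdoc(func) or ""
    missing = []
    for name in inspect.signature(func).parameters:
        # Look for a line that starts with 'name (' as in Google-style Args entries.
        pattern = rf"^\s*{re.escape(name)} \("
        if not re.search(pattern, doc, flags=re.MULTILINE):
            missing.append(name)
    return missing


def example_patch_fn(rope=True, cross_entropy=False):
    """
    Stand-in function with an incomplete Args section (hypothetical).

    Args:
        cross_entropy (bool): Whether to apply the cross entropy kernel.
    """


print(undocumented_params(example_patch_fn))
```

Here the check reports `rope` as undocumented because only `cross_entropy` has an entry.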


Copilot · 2026-03-31T16:13:14Z

README.md

+| Qwen2, Qwen2.5, & QwQ      | `liger_kernel.transformers.apply_liger_kernel_to_qwen2`    | RoPE, RMSNorm, SwiGLU, CrossEntropyLoss, FusedLinearCrossEntropy        |
+| Qwen2-VL, & QVQ       | `liger_kernel.transformers.apply_liger_kernel_to_qwen2_vl`    | RMSNorm, LayerNorm, SwiGLU, CrossEntropyLoss, FusedLinearCrossEntropy        |
+| Qwen2.5-VL       | `liger_kernel.transformers.apply_liger_kernel_to_qwen2_5_vl`    | RMSNorm, SwiGLU, CrossEntropyLoss, FusedLinearCrossEntropy        |
+| Qwen3-VL       | `liger_kernel.transformers.apply_liger_kernel_to_qwen3_vl`    | RoPE, RMSNorm, SwiGLU, CrossEntropyLoss, FusedLinearCrossEntropy        |
+| Qwen3   | `liger_kernel.transformers.apply_liger_kernel_to_qwen3`    |  RoPE, RMSNorm, SwiGLU, CrossEntropyLoss, FusedLinearCrossEntropy       |
+| Qwen3 MoE | `liger_kernel.transformers.apply_liger_kernel_to_qwen3_moe` | RoPE, RMSNorm, SwiGLU, CrossEntropyLoss, FusedLinearCrossEntropy       |


This README table section appears to introduce mixed line endings (surrounding lines are CRLF, while the newly added/edited Qwen rows show as LF-only). Please normalize the line endings for the file (and ideally enforce via .editorconfig) to avoid noisy diffs and potential formatting issues on Windows tooling.
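
One way to confirm whether a file really mixes line endings is to count CRLF and LF-only terminators in its raw bytes. The sketch below is an illustration of that check and of a normalization step; `count_line_endings` and `normalize_to_lf` are hypothetical helper names, and whether the repository should standardize on LF or CRLF is left as an assumption.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF and LF-only line terminators in raw file bytes."""
    crlf = data.count(b"\r\n")
    # Every CRLF also contains an LF, so subtract to get bare LF terminators.
    lf_only = data.count(b"\n") - crlf
    return {"crlf": crlf, "lf": lf_only}


def has_mixed_line_endings(data: bytes) -> bool:
    """Return True if the data contains both CRLF and LF-only terminators."""
    counts = count_line_endings(data)
    return counts["crlf"] > 0 and counts["lf"] > 0


def normalize_to_lf(data: bytes) -> bytes:
    """Convert CRLF terminators to LF (assumes LF is the desired convention)."""
    return data.replace(b"\r\n", b"\n")


sample = b"| row one |\r\n| row two |\n| row three |\r\n"
print(count_line_endings(sample))
print(has_mixed_line_endings(normalize_to_lf(sample)))
```

To run this against a real file, read it in binary mode (for example `open("README.md", "rb").read()`) so Python does not translate the terminators first.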

Enable SwiGLU patching for Qwen3-VL

c36e542

Copilot AI review requested due to automatic review settings March 31, 2026 16:07

Copilot started reviewing on behalf of dongchany March 31, 2026 16:07

dongchany mentioned this pull request Mar 31, 2026

[Feature Request] Add SwiGLU support for Qwen3-VL models #956

Open

Copilot AI reviewed Mar 31, 2026

dongchany wants to merge 1 commit into linkedin:main from dongchany:qwen3-vl-swiglu
