feat: add RainFusion sparse attention for wan2.2. by ethan686 · Pull Request #1867 · jd-opensource/xllm

ethan686 · 2026-07-01T10:10:07Z

Add two new aclnn ops: aclnnRainFusionAttention and aclnnBlockSparseAttention (for future A5; not actually used yet).
Add 4 new DiT configs:
a. rainfusion_enabled, bool: whether to enable RainFusion attention.
b. rainfusion_sparsity, double, 0.5~0.9: a higher value is faster but less precise.
c. rainfusion_pool_size, int: block size, default 128.
d. rainfusion_sparse_start_step, int: the step at which sparsity starts being used; a higher value is faster but less precise.
Comparing outputs with and without sparsity=0.8, start_step=15: the mindiesd precision comparison result is 0.9333, and the xllm precision comparison result is 0.9555.
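To make the four flags concrete, here is a minimal, self-contained sketch of how they might be grouped and checked. It is not the xllm implementation: the struct name, the helper names, and the defaults for sparsity and start step are hypothetical (only the pool size default of 128 and the 0.5~0.9 range come from the description above). It also assumes that steps before rainfusion_sparse_start_step run dense attention, and that sparsity is the fraction of key/value blocks skipped.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical container mirroring the four flags in the description.
struct RainFusionConfig {
  bool enabled = false;                 // rainfusion_enabled
  double sparsity = 0.8;                // rainfusion_sparsity (default is an assumption)
  std::int32_t pool_size = 128;         // rainfusion_pool_size (default 128 per the description)
  std::int32_t sparse_start_step = 15;  // rainfusion_sparse_start_step (default is an assumption)
};

// Assumed validity rules, using the 0.5~0.9 range stated in the description.
inline bool is_valid(const RainFusionConfig& cfg) {
  return cfg.sparsity >= 0.5 && cfg.sparsity <= 0.9 && cfg.pool_size > 0 &&
         cfg.sparse_start_step >= 0;
}

// Assumption: denoising steps before sparse_start_step use dense attention.
inline bool use_sparse_at_step(const RainFusionConfig& cfg, std::int32_t step) {
  return cfg.enabled && step >= cfg.sparse_start_step;
}

// Assumption: sparsity is the fraction of blocks skipped, so roughly
// (1 - sparsity) of the blocks are kept, with at least one block kept.
inline std::int64_t kept_blocks(const RainFusionConfig& cfg, std::int64_t seq_len) {
  const std::int64_t total = (seq_len + cfg.pool_size - 1) / cfg.pool_size;
  if (total <= 0) return 0;
  const auto kept = static_cast<std::int64_t>(
      std::ceil(static_cast<double>(total) * (1.0 - cfg.sparsity)));
  return std::min(total, std::max<std::int64_t>(kept, 1));
}
```

Under these assumptions, a 1280-token sequence with pool size 128 has 10 blocks, and a sparsity of 0.5 keeps 5 of them.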

…r Wan2.2

Replace RainFusion V3 with V2 block sparse attention using frame-pairing approach and aclnnRainFusionAttention kernel.

Key changes:
- Add npu_rain_fusion_attention wrapping aclnnRainFusionAttention
- Add npu_block_sparse_attention support
- Refactor V2 rearrange/inverse using merge B*N approach, split into two stages to avoid >8 dim tensors on NPU
- Cat QKV before rearrange for 1 call instead of 3
- Fix inverse rearrange output order (first_frame before rest)
- Fix transpose selectIdx per aclnn spec, capture N/D dims before reshape
- Perf: explicit contiguous before to_flat/from_flat, use view not reshape
- Simplify RainFusion V2 config, remove V3-only flags
- Add roundtrip check (rearrange + inverse) to verify correctness
- Add dit flags into global_config.h
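The roundtrip check listed in the commit message can be illustrated with a generic sketch: apply a token permutation, apply its inverse, and confirm the original order comes back. This is not the PR's code. It works on a flat `std::vector` rather than NPU tensors, and the function names and the permutation in the usage note are hypothetical stand-ins for the actual frame-pairing layout.

```cpp
#include <cstddef>
#include <vector>

// Gather: y[i] = x[perm[i]]. Assumes perm is a permutation of 0..x.size()-1.
std::vector<float> rearrange(const std::vector<float>& x,
                             const std::vector<std::size_t>& perm) {
  std::vector<float> y(x.size());
  for (std::size_t i = 0; i < perm.size(); ++i) {
    y[i] = x[perm[i]];
  }
  return y;
}

// Build the inverse permutation: inv[perm[i]] = i.
std::vector<std::size_t> invert(const std::vector<std::size_t>& perm) {
  std::vector<std::size_t> inv(perm.size());
  for (std::size_t i = 0; i < perm.size(); ++i) {
    inv[perm[i]] = i;
  }
  return inv;
}

// Roundtrip check: rearranging and then applying the inverse must
// reproduce the input exactly.
bool roundtrip_ok(const std::vector<float>& x,
                  const std::vector<std::size_t>& perm) {
  return rearrange(rearrange(x, perm), invert(perm)) == x;
}
```

For example, with six tokens and the interleaving permutation `{0, 3, 1, 4, 2, 5}`, `roundtrip_ok` returns true. A bug in the inverse's output ordering, like the one the commit message says it fixes, is the kind of error a check like this would surface.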

gemini-code-assist

Code Review

This pull request introduces RainFusionV2 block-wise sparse attention for the Wan video generation model, adding new NPU kernels (npu_block_sparse_attention and npu_rain_fusion_attention), configuration flags, and integration into the transformer and pipeline. Feedback on the changes includes removing a leftover debug statement containing an absolute local path in the pipeline, eliminating redundant optional copies in the block sparse attention kernel, and resolving an anonymous namespace anti-pattern in the rainfusion_attention.h header. Additionally, several style guide violations need to be addressed, specifically replacing at:: namespace usage with torch:: and replacing functional/C-style casts with static_cast or appropriate literals.
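Two of the review points, the anonymous namespace in a header and the functional/C-style casts, can be shown with a small sketch. The namespace and function names are hypothetical and do not come from rainfusion_attention.h; the block only demonstrates the general pattern of a named namespace with an `inline` helper and an explicit `static_cast`.

```cpp
#include <cstdint>

// In a header, an anonymous namespace gives every including translation
// unit its own private copy of the helper. A named namespace with an
// `inline` function avoids that duplication.
namespace rainfusion_detail {  // hypothetical namespace name

// Number of pool_size-sized blocks needed to cover seq_len tokens.
inline std::int64_t num_blocks(std::int64_t seq_len, std::int32_t pool_size) {
  // static_cast instead of a C-style `(int64_t)pool_size` or a
  // functional `int64_t(pool_size)` cast.
  const std::int64_t block = static_cast<std::int64_t>(pool_size);
  return (seq_len + block - 1) / block;
}

}  // namespace rainfusion_detail
```

With a pool size of 128, `rainfusion_detail::num_blocks(1000, 128)` returns 8, and 1025 tokens need 9 blocks.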


ethan686 requested review from Clement-Wang26, DongheJin, DragonFive, JimHsiung, Kang-Meng, RobbieLeung, XuZhang99, liujinguang0125, liutongxuan, walsonyang, xiao-yu-chen, yingxudeng, yq33victor and zhang-minchao as code owners July 1, 2026 10:10

gemini-code-assist Bot reviewed Jul 1, 2026


fix: resolve the gemini issue.

80fa6a9

ethan686 wants to merge 2 commits into jd-opensource:main from ethan686:wan22_dev
