Fix inf grad_norm on Qwen3.5 at seq_len > 65536 without flash-attn #582

Open
danielhanchen wants to merge 1 commit into main from fix/issue-4906-qwen35-sdpa-bool-mask

Commits

Commits on Apr 9, 2026