Enable FusedSDPA slicing if chunked-prefill is enabled or max_model_len > 32k #2272

Merged
czhu15 merged 1 commit into HabanaAI:aice/v1.22.0 from yangulei:slice_long_seq
Jun 5, 2026
Conversation

@yangulei yangulei commented Jun 5, 2026

Automatically enable FusedSDPA slicing when chunked prefill is enabled or max_model_len exceeds 32k, to avoid out-of-memory (OOM) errors in long-context benchmarks.
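
For readers skimming the thread, the enabling condition named in the PR title can be sketched roughly as below. This is only an illustration, not the code merged in this PR: the function name, the parameter names, and the reading of "32k" as 32 * 1024 tokens are all assumptions.

```python
# Hypothetical sketch of the condition in the PR title; not the merged code.
# Assumption: "32k" is interpreted as 32 * 1024 = 32768 tokens.
SLICING_LEN_THRESHOLD = 32 * 1024


def should_enable_fsdpa_slicing(
    chunked_prefill_enabled: bool,
    max_model_len: int,
    threshold: int = SLICING_LEN_THRESHOLD,
) -> bool:
    """Return True if FusedSDPA slicing should be turned on automatically.

    Slicing is enabled when either condition holds:
      * chunked prefill is enabled, or
      * the configured maximum model length is strictly above the threshold.
    """
    return chunked_prefill_enabled or max_model_len > threshold
```

With these assumptions, a 128k-context run gets slicing regardless of chunked prefill, while an 8k-context run gets it only when chunked prefill is on.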

yangulei commented Jun 5, 2026

@Wei-Lin-Intel @czhu15
Please help to review, thanks!

@czhu15 czhu15 left a comment

LGTM

@czhu15 czhu15 merged commit 86e45f6 into HabanaAI:aice/v1.22.0 Jun 5, 2026
2 checks passed
@yangulei yangulei deleted the slice_long_seq branch June 5, 2026 07:16

3 participants