
[TorchToTMTensor] enable broadcasting query- and key_seq_len of attention mask #4464

Open
ziereis wants to merge 2 commits into llvm:main from ziereis:fix_broadcast_attention_mask

Conversation


@ziereis ziereis commented Feb 8, 2026

I ran into a problem where I would get an attention mask of shape [1, 1, 1, 1], which would need to be broadcast along the key_seq_len dimension. That is not possible with the current implementation and led to verification errors down the line.

This change enables broadcasting the query_seq_len and key_seq_len dimensions of the attention mask if required.
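The failure mode described above can be sketched with NumPy-style broadcasting (a minimal illustration, not the torch-mlir lowering itself; the shape names and values are assumptions for the example). A size-1 mask dimension must be expanded to match the attention-score shape along both sequence axes:

```python
import numpy as np

# Hypothetical attention shapes: scores are [batch, heads, q_len, k_len],
# but the mask arrives fully degenerate as [1, 1, 1, 1].
batch, heads, q_len, k_len = 2, 4, 8, 8
scores = np.zeros((batch, heads, q_len, k_len), dtype=np.float32)
mask = np.zeros((1, 1, 1, 1), dtype=np.float32)

# Broadcasting expands every size-1 dimension of the mask -- including the
# query_seq_len and key_seq_len axes -- to the score tensor's shape.
expanded = np.broadcast_to(mask, scores.shape)
masked = scores + expanded

print(masked.shape)  # (2, 4, 8, 8)
```

Without the ability to broadcast the last two axes, a mask like this cannot be applied to the scores, which is the mismatch this change addresses.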

Author

ziereis commented Feb 10, 2026

@Groverkss Hi, could you enable the CI run here? I don't have any permissions.

@Groverkss
Member

> @Groverkss Hi, could you enable the CI run here? I don't have any permissions.

The CI looks like it ran. Do you need any help there? Also happy to get you write/triage access to torch-mlir.

@ziereis ziereis force-pushed the fix_broadcast_attention_mask branch from fff46e5 to ed27d33 Compare March 8, 2026 13:17
Author

ziereis commented Mar 8, 2026

Last time the CI ran, it failed downloading some packages; I'm not sure what that is about. I just rebased onto the latest main and pushed again, and the CI is blocked again, so it would be nice if you could run it once more.

Either way, it would be very nice if I could have access to run the CI myself.
