[TorchToTMTensor] enable broadcasting query- and key_seq_len of attention mask #4464
Groverkss approved these changes on Feb 10, 2026
Author: @Groverkss Hi, could you enable the CI run here? I don't have any permissions.
Member: The CI looks like it ran. Do you need any help there? Also happy to get you write/triage access to torch-mlir.
Force-pushed from fff46e5 to ed27d33
Author: Last time the CI ran, it failed while downloading some packages; not sure what that was about. I just rebased onto latest main and pushed again, and the CI is blocked again, so it would be nice if you could run it once more. Either way, it would be very helpful if I could have access to run the CI myself.
I ran into a problem where I would get an attention mask of shape [1, 1, 1, 1], which would need to be broadcast along the key_seq_len dimension; that is not possible with the current implementation, and it led to verification errors down the line. This change enables broadcasting the query_seq_len and key_seq_len dimensions of the attention mask when required.
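For intuition, the semantics being enabled here can be sketched in NumPy-style broadcasting. The shapes below are hypothetical (not taken from the PR's test cases): a degenerate [1, 1, 1, 1] mask must be expanded along both the query_seq_len and key_seq_len dimensions before it can be combined with the attention scores.

```python
import numpy as np

# Hypothetical shapes for illustration only.
batch, heads, q_len, k_len = 2, 4, 8, 8

# Attention scores, as produced by query @ key^T:
# shape [batch, heads, query_seq_len, key_seq_len].
scores = np.zeros((batch, heads, q_len, k_len))

# A degenerate additive mask of shape [1, 1, 1, 1], as described in the PR.
mask = np.zeros((1, 1, 1, 1))

# Broadcasting expands every size-1 dimension of the mask to match the
# scores, including the last two (query_seq_len and key_seq_len) axes.
expanded = np.broadcast_to(mask, scores.shape)
masked = scores + expanded  # equivalent to scores + mask via implicit broadcasting

print(masked.shape)  # (2, 4, 8, 8)
```

The PR teaches the TorchToTMTensor lowering to insert the equivalent of this broadcast when the mask's trailing dimensions are 1 but the scores' are not, instead of rejecting the shape mismatch.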