[ExecuTorch][WebGPU] Add et_vk.apply_rotary_emb (interleaved RoPE) + ValueList multi-output by JulianCloudNTH · Pull Request #20264 · pytorch/executorch

JulianCloudNTH · 2026-06-13T00:08:51Z

Stack from ghstack (oldest at bottom):

[ExecuTorch][WebGPU] Add et_vk.prepack (constant-tensor packing) for E2E weight loading #20265
-> [ExecuTorch][WebGPU] Add et_vk.apply_rotary_emb (interleaved RoPE) + ValueList multi-output #20264
[ExecuTorch][WebGPU] Add et_vk.embedding_q4gsw (4-bit groupwise-symmetric quantized embedding) #20263
[ExecuTorch][WebGPU] linear_q4gsw test suite: Llama-1B shapes + 4k/8k sweep #20227

Adds the WebGPU backend handler for et_vk.apply_rotary_emb.default (interleaved Llama rotary positional embedding) plus the ValueList graph-value support its multi-output signature requires.

The op rotates the query and key tensors by a shared freqs_cos/freqs_sin pair. It is composed of two dispatches of one WGSL kernel, one writing xq_out and one writing xk_out. Each thread handles one (even, odd) element pair of a head row: out[2i] = x[2i]*cos - x[2i+1]*sin and out[2i+1] = x[2i]*sin + x[2i+1]*cos. This mirrors the Vulkan apply_rotary_emb reference (buffer-only, fp32, the interleaved .default variant).

Each dispatch owns a distinct compute pipeline: the graph destructor releases pipelines per dispatch, so a shared handle would be double-freed. The workgroup size is a wg_size pipeline-override constant clamped to the device limit. Both 1D dispatch counts go through WebGPUUtils::compute_1d_workgroup_count and are validated before any GPU object is allocated. The embedded WGSL header is generated by gen_wgsl_headers.py.
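For reviewers who want a CPU reference to compare against, the per-pair rotation can be sketched in plain Python as below. This is an illustrative sketch, not the WGSL kernel from this PR: the function names `apply_rotary_emb_interleaved` and `workgroup_count_1d` are hypothetical, the row-level freqs indexing is an assumption, and the ceil-division in `workgroup_count_1d` is only a guess at what a 1D workgroup-count helper typically does, not the actual behavior of WebGPUUtils::compute_1d_workgroup_count.

```python
import math


def apply_rotary_emb_interleaved(x, freqs_cos, freqs_sin):
    """Rotate one head row pairwise (interleaved layout).

    x has even length; freqs_cos and freqs_sin each hold len(x) // 2 values
    (assumed layout: one cos/sin value per (even, odd) pair of this row).
    """
    if len(x) % 2 != 0:
        raise ValueError("head row length must be even")
    half = len(x) // 2
    if len(freqs_cos) != half or len(freqs_sin) != half:
        raise ValueError("freqs must have one entry per (even, odd) pair")

    out = [0.0] * len(x)
    for i in range(half):  # on the GPU, one thread would handle one pair
        even, odd = x[2 * i], x[2 * i + 1]
        c, s = freqs_cos[i], freqs_sin[i]
        out[2 * i] = even * c - odd * s
        out[2 * i + 1] = even * s + odd * c
    return out


def workgroup_count_1d(num_pairs, wg_size, max_workgroups):
    """Hypothetical helper: ceil-divide the pair count by the workgroup size
    and fail before any allocation if the count exceeds the device limit."""
    if num_pairs <= 0 or wg_size <= 0:
        raise ValueError("num_pairs and wg_size must be positive")
    count = (num_pairs + wg_size - 1) // wg_size
    if count > max_workgroups:
        raise ValueError("dispatch exceeds the 1D workgroup limit")
    return count


# Two calls with the same freqs mirror the two dispatches (xq_out, xk_out).
theta = [0.1, 0.7]
cos_row = [math.cos(t) for t in theta]
sin_row = [math.sin(t) for t in theta]
xq_out = apply_rotary_emb_interleaved([1.0, 2.0, 3.0, 4.0], cos_row, sin_row)
xk_out = apply_rotary_emb_interleaved([0.5, -1.0, 2.0, 0.0], cos_row, sin_row)
```

Because each pair is a plain 2D rotation, the squared norm of every (even, odd) pair is preserved up to rounding, which gives a cheap sanity check when comparing GPU output against this reference.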

The two outputs (xq_out, xk_out) are serialized by the Vulkan exporter as a single ValueList graph value, which the runtime did not previously model. This PR adds the ValueType::ValueList value kind, a value_lists_ table populated during build(), and a get_value_list accessor that the handler uses to resolve the output ids. It also closes a latent gap in the same code path: a constant tensor whose constant_id is set, but whose constants table is missing or whose id is out of range, now throws (fail-loud) rather than silently leaving the buffer uninitialized.
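The shape of that change can be modeled in a few lines of Python. This is a simplified sketch under stated assumptions, not the C++ runtime code: the `Graph` class, its fields, and `load_constant` are hypothetical stand-ins for the value_lists_ table, the get_value_list accessor, and the fail-loud constant check described above.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class ValueType(Enum):
    TENSOR = auto()
    VALUE_LIST = auto()  # models the new ValueType::ValueList kind


@dataclass
class Graph:
    """Hypothetical stand-in for the runtime graph built in build()."""
    value_types: dict = field(default_factory=dict)  # value id -> ValueType
    value_lists: dict = field(default_factory=dict)  # value id -> member ids

    def add_tensor(self, value_id):
        self.value_types[value_id] = ValueType.TENSOR

    def add_value_list(self, value_id, member_ids):
        self.value_types[value_id] = ValueType.VALUE_LIST
        self.value_lists[value_id] = list(member_ids)

    def get_value_list(self, value_id):
        # Refuse to resolve ids that are not ValueLists.
        if self.value_types.get(value_id) is not ValueType.VALUE_LIST:
            raise TypeError(f"value {value_id} is not a ValueList")
        return self.value_lists[value_id]


def load_constant(constants, constant_id):
    """Fail loudly instead of leaving the destination buffer uninitialized."""
    if constants is None:
        raise RuntimeError("constant_id is set but the constants table is missing")
    if not 0 <= constant_id < len(constants):
        raise RuntimeError(f"constant_id {constant_id} is out of range")
    return constants[constant_id]


# The handler resolves both output tensor ids from the single ValueList arg.
graph = Graph()
graph.add_tensor(3)              # xq_out
graph.add_tensor(4)              # xk_out
graph.add_value_list(5, [3, 4])  # serialized multi-output value
xq_out_id, xk_out_id = graph.get_value_list(5)
```

Keeping the list as its own value kind lets the handler keep a single output argument while still addressing each output tensor by id.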

Differential Revision: D108428756

[ghstack-poisoned]

pytorch-bot · 2026-06-13T00:08:55Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20264

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

[ROCm] MI350 CI jobs will have longer queue times due to CI migration

❌ 22 New Failures, 1 Unrelated Failure

As of commit f2d1ae0 with merge base 5526971:

NEW FAILURES - The following jobs have failed:

pull / test-llama-runner-qnn-linux (fp32, qnn_16a16w, qnn) / linux-job (gh)
RuntimeError: Command docker exec -t b718b2aad5174a079ca73007ee81c447c8d7dbce3e92bb349c74c829a45403a8 /exec failed with exit code 1
pull / test-llama-runner-qnn-linux (fp32, qnn_8a8w, qnn) / linux-job (gh)
RuntimeError: Command docker exec -t 14eb847102330e7eb0327e7925b5b80c324c7686b11d9fdf3eadf9514f0cfb95 /exec failed with exit code 1
pull / test-lora-multimethod-linux / linux-job (gh)
RuntimeError: Command docker exec -t 108c5cddb6ce059fff359e2ad588913aa86991f4da833daa66aceb94e35d3f44 /exec failed with exit code 1
pull / test-qnn-buck-build-linux / linux-job (gh)
RuntimeError: Command docker exec -t 364aeb92d3feebbc88524d586c25780aa388f9f0f5b7188b7bbaced09abc9cc1 /exec failed with exit code 1
pull / test-qnn-delegate-linux / linux-job (gh)
RuntimeError: Command docker exec -t 08a7ce9a1205252bbbcb6c6a059f951d0aa4e85dda5d97394eea9ea4aeff1a1d /exec failed with exit code 1
pull / test-qnn-direct-build-linux / linux-job (gh)
RuntimeError: Command docker exec -t e4cda5a4eae683182346199fc8f140bf728cf35130a85b4f21a9fb89e706b17c /exec failed with exit code 1
pull / test-qnn-models-linux (dl3) / linux-job (gh)
RuntimeError: Command docker exec -t 26f0fd649cee76a6dd3e7ab77b422b5c74c2a131132c0a58008df5e64dfecb5c /exec failed with exit code 1
pull / test-qnn-models-linux (mv2) / linux-job (gh)
RuntimeError: Command docker exec -t a6c087fa755a781eeb85007527dc9f6fb99a58f6fd768c39266e93751fdb60d4 /exec failed with exit code 1
pull / test-qnn-models-linux (mv3) / linux-job (gh)
RuntimeError: Command docker exec -t ea5964d84d90807186320539f041f5e8ed9b53a33deca01da2e72514973e816a /exec failed with exit code 1
pull / test-qnn-passes-linux / linux-job (gh)
RuntimeError: Command docker exec -t 960664baf4b9d1303b8a083f5090beb904f71a8c52358740f5981e4ad63c05d4 /exec failed with exit code 1
pull / test-qnn-python-imports-linux / linux-job (gh)
RuntimeError: Command docker exec -t 4f22b68e75a298bb8fa2703e6c4cb6f2a7a96aaa1e30f500dc060e17aa02106c /exec failed with exit code 1
pull / test-qnn-testsuite-linux / test-backend-linux (qnn, models) / linux-job (gh)
RuntimeError: Command docker exec -t 1308f81b237f0164e88836a3f9436f13e28d6a7667f043dab42796963133783b /exec failed with exit code 1
pull / test-qnn-testsuite-linux / test-backend-linux (qnn, operators) / linux-job (gh)
RuntimeError: Command docker exec -t 8c5821165f79b07243a362277a42ad07b92f71198c2fb06885b5fe6d09bf4469 /exec failed with exit code 1
pull / test-qnn-wheel-packages-linux (3.10) / linux-job (gh)
RuntimeError: Command docker exec -t ff887dc6bf69a181fb18a70773fd1c3082e94bae311deefadb4fc612a2ab68b7 /exec failed with exit code 1
pull / test-qnn-wheel-packages-linux (3.11) / linux-job (gh)
RuntimeError: Command docker exec -t da5e0679ded2fed7206bb251b164b7ef7e462c0efd236105c4e3c51e599fcd65 /exec failed with exit code 1
pull / test-qnn-wheel-packages-linux (3.12) / linux-job (gh)
RuntimeError: Command docker exec -t 2ae489a1cfbe045b8dbf7eb59208e94528f05579751bc8ce2578f11bbec48c35 /exec failed with exit code 1
pull / test-qnn-wheel-packages-linux (3.13) / linux-job (gh)
RuntimeError: Command docker exec -t 1cd1c63e51cf8e73b5e555bd060c9a1c86d641cf47563d25976e7effb935b23f /exec failed with exit code 1
pull / test-sqnr-static-llm-qnn-linux (smollm2_135m) / linux-job (gh)
RuntimeError: Command docker exec -t fcd74fb78df959233a2fb0c78aacd800f029d697568abce82705538ff5b8ed6d /exec failed with exit code 1
pull / test-static-llama-qnn-linux (stories_110m) / linux-job (gh)
RuntimeError: Command docker exec -t 91d412d1dda82680e0c7c1426c628f37d4fc48e4c5b8912d02ac70ebdace121b /exec failed with exit code 1
pull / test-static-llama-qnn-linux (stories_260k_bc) / linux-job (gh)
RuntimeError: Command docker exec -t 4e37c548dfbe26568956966c56a1ba50bbe5cfbb99b1aa139115b46eaf1a29bb /exec failed with exit code 1
Test QNN Backend / test-qnn / test-backend-linux (qnn, models) / linux-job (gh)
RuntimeError: Command docker exec -t d4b4adefa9f0652de1c2b4ae9e69c582472d04777f90e2696068914f82fde9c1 /exec failed with exit code 1
Test QNN Backend / test-qnn / test-backend-linux (qnn, operators) / linux-job (gh)
RuntimeError: Command docker exec -t 8f659bb60fee78783db3dc37dfcadafb35ed25c3c5c91bdb58672a237e4e6119 /exec failed with exit code 1

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / android / build-android (gh) (trunk failure)
Process completed with exit code 1.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions · 2026-06-13T00:10:07Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Update

f2d1ae0

[ghstack-poisoned]

JulianCloudNTH requested review from kirklandsign and larryliu0820 as code owners June 13, 2026 00:08

meta-cla Bot added the CLA Signed label Jun 13, 2026

JulianCloudNTH wants to merge 1 commit into gh/JulianCloudNTH/26/base from gh/JulianCloudNTH/26/head
