Shared Memory GEMM by elvingerpaul · Pull Request #14 · eth-easl/vllm_profile

elvingerpaul · 2025-11-17T08:23:43Z

Add the paper's shared memory experiment for GEMM.

github-actions · 2025-11-17T08:23:53Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

* intra_sm/ipc v1 release
* intra_sm/ipc v2 release
* intra_sm/shared_mem v1 release
* intra_sm/shared_mem v2 release
* intra_sm/ipc v3 release
* intra_sm/tb_scheduler v1 release
* clean up duplicated files
* inter_sm/l2_cache v1
* inter_sm/mem_bw
* delete unused files in inter_sm
* cleanup interference kernels
* improve logs
* verified intra_sm/ipc scripts
* verified inter_sm/shared_mem
* specify GPU type for IPC experiment
* verified intra_sm/tb_scheduler
* update README
* inter_sm/l2_cache v2
* bug fix: use cuda.synchronize in case there is no interference kernel
* README v1
* bug fixes
* Shared Memory GEMM (#14)
* add scripts for gemm shared memory interference
* split up shared_mem into llm and gemm subrepo
* intra_sm/shared_mem/gemm v1
* Merge gcontext_test.py and main.py (#15)
* membw to use main.py instead of gcontext_test.py
* fix bugs related to num_requests and num_threads_per_tb
* inter_sm/mem_bw verified
* remove gcontext_test.py, new universal entrypoint is main.py
* fix inter_sm/l2 to use L2Kernel
* fix, missing set_percentage arg
* fix num_warmup vs num_request
* minor adjustment README ipc
* add requirements.txt file
* remove custom profiling folder
* main README
* vllm/interference
* inter_sm/l2_cache final
* inter_sm/mem_bw final
* intra_sm/ipc final
* intra_sm shared mem gemm
* intra-sm shared mem llm
* intra sm tb scheduler final

elvingerpaul added 3 commits November 16, 2025 18:14

add scripts for gemm shared memory interference

6813a17

split up shared_mem into llm and gemm subrepo

7226ef6

intra_sm/shared_mem/gemm v1

f4ad1ab

elvingerpaul merged commit e73032e into release-repo-socc25 Nov 17, 2025
1 of 4 checks passed

elvingerpaul merged 3 commits into release-repo-socc25 from shared-memory-gemm
