Merge gcontext_test.py and main.py#15

Merged
elvingerpaul merged 8 commits into release-repo-socc25 from merge-inter-intra-sm-entrypoints on Nov 19, 2025

@elvingerpaul

Collaborator

FILL IN THE PR DESCRIPTION HERE

FIX #xxxx (link existing issues this PR will resolve)

BEFORE SUBMITTING, PLEASE READ https://docs.vllm.ai/en/latest/contributing/overview.html (anything written below this line will be removed by GitHub Actions)

@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@elvingerpaul merged commit b17519d into release-repo-socc25 on Nov 19, 2025
1 of 4 checks passed
elvingerpaul added a commit that referenced this pull request Nov 20, 2025
* intra_sm/ipc v1 release

* intra_sm/ipc v2 release

* intra_sm/shared_mem v1 release

* intra_sm/shared_mem v2 release

* intra_sm/ipc v3 release

* intra_sm/tb_scheduler v1 release

* clean up duplicated files

* inter_sm/l2_cache v1

* inter_sm/mem_bw

* delete unused files in inter_sm

* cleanup interference kernels

* improve logs

* verified intra_sm/ipc scripts

* verified inter_sm/shared_mem

* specify GPU type for IPC experiment

* verified intra_sm/tb_scheduler

* update README

* inter_sm/l2_cache v2

* bug fix: use cuda.synchronize in case there is no interference kernel

* README v1

* bug fixes

* Shared Memory GEMM (#14)

* add scripts for gemm shared memory interference

* split up shared_mem into llm and gemm subrepo

* intra_sm/shared_mem/gemm v1

* Merge gcontext_test.py and main.py (#15)

* membw to use main.py instead of gcontext_test.py

* fix bugs related to num_requests and num_threads_per_tb

* inter_sm/mem_bw verified

* remove gcontext_test.py, new universal entrypoint is main.py

* fix inter_sm/l2 to use L2Kernel

* fix, missing set_percentage arg

* fix num_warmup vs num_request

* minor adjustment README ipc

* add requirements.txt file

* remove custom profiling folder

* main README

* vllm/interference

* inter_sm/l2_cache final

* inter_sm/mem_bw final

* intra_sm/ipc final

* intra_sm shared mem gemm

* intra-sm shared mem llm

* intra sm tb scheduler final