[Feat] DeepSeek V4 Rebased #40860

Merged
ywang96 merged 25 commits into vllm-project:main from ivanium:feat/dsv4-support on Apr 27, 2026

@ivanium ivanium commented Apr 25, 2026

Collaborator

Purpose

Rebased version of #40760

Roadmap: #40902

Co-authored by: Bugen Zhao, Giancarlo Delfin, Jie Li, Kaichao You, Roy Wang, Woosuk Kwon, Yifan Qiao, Yongye Zhu, Zhewen Li, Zijing Liu, Zixi Qi

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.

@claude claude Bot left a comment

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

return Mxfp4MoeBackend.NONE, None


def select_mxfp4_moe_backend(

Contributor

imo we shouldn't create separate select_gpt_oss_mxfp4_moe_backend and select_mxfp4_moe_backend

what's the reason that these two can't be merged?

cc @mgoin , @robertgshaw2-redhat
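For readers following the question above, here is a rough, hypothetical sketch of what a single merged selector could look like. It is not the PR's code: only `Mxfp4MoeBackend`, `NONE`, and `TRITON_UNFUSED` appear in this thread, while `HYPOTHETICAL_FUSED`, the `family` argument, the priority table, and `select_backend` are assumptions made purely for illustration. The real function appears to return a tuple; this sketch returns only the backend to stay short.

```python
from enum import Enum, auto


class Mxfp4MoeBackend(Enum):
    # NONE and TRITON_UNFUSED are named in the thread.
    # HYPOTHETICAL_FUSED is a placeholder invented for this sketch.
    NONE = auto()
    TRITON_UNFUSED = auto()
    HYPOTHETICAL_FUSED = auto()


# Hypothetical per-model-family priority lists. A single selector could
# consult one table instead of living in two separate functions.
_PRIORITY = {
    "gpt_oss": [
        Mxfp4MoeBackend.HYPOTHETICAL_FUSED,
        Mxfp4MoeBackend.TRITON_UNFUSED,
    ],
    # TRITON_UNFUSED is left out here, mirroring the idea of disabling
    # a backend for one model family until a kernel issue is fixed.
    "default": [Mxfp4MoeBackend.HYPOTHETICAL_FUSED],
}


def select_backend(family, available):
    """Return the first backend in the family's priority list that is available."""
    for backend in _PRIORITY.get(family, _PRIORITY["default"]):
        if backend in available:
            return backend
    return Mxfp4MoeBackend.NONE
```

With a table like this, enabling or disabling a backend for one family is a one-line change to its list, which is one possible argument for merging the two selectors. Whether that fits the actual constraints of the two code paths is a question for the authors.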

This was referenced Apr 28, 2026
Mxfp4MoeBackend.TRITON_UNFUSED,
# TRITON_UNFUSED has bug with MTP support
# TODO re-enable after kernel is fixed
# TRITON_UNFUSED

@fxmarty-amd fxmarty-amd Jun 24, 2026

Contributor

Any context on this @ivanium ? Which test specifically dispatched on TRITON_UNFUSED and caused failures?

Collaborator Author

Good questions. cc @zyongye and @jeejeelee who have more details

Labels

  • ci/build
  • cpu: Related to CPU backends
  • deepseek: Related to DeepSeek models
  • documentation: Improvements or additions to documentation
  • frontend
  • gpt-oss: Related to GPT-OSS models
  • kv-connector
  • new-model: Requests to new models
  • nvidia
  • ready-run-all-tests: Trigger CI with all tests for wide-ranging PRs
  • speculative-decoding
  • tool-calling
  • v1
  • verified: Run pre-commit for new contributors without triggering other tests

Projects

Status: Done
Status: Done
Status: Done
