[XPU] Remove v0 cache schedule and use InputBatch #7603

Open
cmcamdy wants to merge 4 commits into PaddlePaddle:develop from cmcamdy:rm_v0_cache_schedule

Conversation

@cmcamdy
Collaborator

@cmcamdy cmcamdy commented Apr 23, 2026

Motivation

💡 If this PR is a cherry-pick, the PR title must follow the convention: add the [Cherry-Pick] label at the very beginning and append the original PR ID at the end, e.g. [Cherry-Pick][CI] Add check trigger and logic(#5191)

Modifications

Usage or Command

Accuracy Tests

Checklist

  • Add at least one tag in the PR title.
    • Tag list: [[FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but their meaning must be clear.
  • Format your code and run pre-commit before committing.
  • Add unit tests. If no unit tests are added, explain why in this PR.
  • Provide accuracy results.
  • If this PR targets a release branch, make sure it has first been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

@paddle-bot

paddle-bot Bot commented Apr 23, 2026

Thanks for your contribution!

PaddlePaddle-bot

This comment was marked as outdated.

@cmcamdy cmcamdy changed the title [XPU] rm v0 cache schedule [XPU] rm v0 cache schedule and use InputBatch Apr 23, 2026
@cmcamdy cmcamdy changed the title [XPU] rm v0 cache schedule and use InputBatch [XPU] Remove v0 cache schedule and use InputBatch Apr 23, 2026

@PaddlePaddle-bot PaddlePaddle-bot left a comment

🤖 AI Code Review | 2026-04-24 02:03:18

📋 Review Summary

PR overview: removes the V0 KVCache scheduler path on the XPU platform, unifies on the V1 KVCache scheduler, and correspondingly replaces the pre_ids buffer with token_ids_all.
Scope of changes: xpu_model_runner, xpu_pre_and_post_process, apply_penalty_multi_scores, sampler, input_batch, spec_decode/mtp
Impact tags: XPU, KVCache Scheduler, Speculative Decoding


📝 PR Convention Check

The PR title carries a valid [XPU] tag and follows the convention. However, the Motivation and Modifications sections are both empty; please fill them in so the change can be traced later.

Suggested description (ready to copy directly):

## Motivation
Remove support for the V0 KVCache Scheduler on the XPU platform. From FastDeploy 2.6 onward, XPU uses the V1 KVCache Scheduler exclusively, which simplifies the code path and reduces the maintenance burden.

## Modifications
- `xpu_model_runner.py`: delete the `_init_share_inputs` method and use `InputBatch` for unified initialization; replace `insert_prefill_inputs` with a `NotImplementedError` to guard against V0 misuse; remove the `ENABLE_V1_KVCACHE_SCHEDULER` condition in `_prepare_inputs`
- `xpu_pre_and_post_process.py`: delete the `step_xpu` function (the V0 scheduling entry point); unify the `update_inputs` branch into `update_inputs_v1`
- `apply_penalty_multi_scores.py` / `sampler.py`: on the XPU path, replace `pre_token_ids` with `token_ids_all`
- `input_batch.py`: put the XPU platform on the same branch as CUDA, deriving `pre_ids` uniformly from `token_ids_all`
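The buffer change described above can be pictured with a minimal, hypothetical sketch (this is not FastDeploy source; the class shape, method signatures, and bodies are invented for illustration, and only the names `insert_prefill_inputs`, `pre_ids`, and `token_ids_all` come from the review):

```python
# Hypothetical sketch, not FastDeploy source: illustrates the two patterns
# the review describes -- guarding the removed V0 entry point with
# NotImplementedError, and deriving pre_ids from a unified token_ids_all
# buffer instead of maintaining a separate pre_ids tensor.

class XPUModelRunnerSketch:
    def __init__(self, max_num_seqs: int, max_model_len: int):
        # Unified per-sequence token buffer (stand-in for token_ids_all);
        # -1 marks unused slots.
        self.token_ids_all = [[-1] * max_model_len for _ in range(max_num_seqs)]
        self.seq_lens = [0] * max_num_seqs

    def insert_prefill_inputs(self, *args, **kwargs):
        # The V0 cache-schedule entry point was removed; fail loudly if
        # any leftover caller (e.g. a stale branch) still reaches it.
        raise NotImplementedError(
            "V0 cache schedule was removed; use the V1 KVCache scheduler path."
        )

    def append_token(self, seq_idx: int, token_id: int) -> None:
        self.token_ids_all[seq_idx][self.seq_lens[seq_idx]] = token_id
        self.seq_lens[seq_idx] += 1

    def pre_ids(self, seq_idx: int) -> list:
        # pre_ids is derived on demand from token_ids_all rather than
        # stored in its own buffer.
        return self.token_ids_all[seq_idx][: self.seq_lens[seq_idx]]


runner = XPUModelRunnerSketch(max_num_seqs=2, max_model_len=8)
for tok in (101, 7, 42):
    runner.append_token(0, tok)
print(runner.pre_ids(0))  # -> [101, 7, 42]
```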

Issues

Level | File | Summary
🟡 Suggestion | xpu_model_runner.py:751 | xpu_worker.py was not cleaned up in sync, leaving a dead-code branch
🟡 Suggestion | (none) | No unit tests or regression tests were added

Overall assessment

The core logic of the PR is clear and correct: initialization is unified through InputBatch, the step_xpu function is deleted, and the pre_ids path converges onto token_ids_all; the overall direction is sound. The main omissions are that xpu_worker.py was not updated in sync (the else branch in preprocess_new_task that still calls insert_prefill_inputs is now dead code) and that the PR description is empty. Recommend completing the description and cleaning up the leftover branch in xpu_worker.py before merging.
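The branch collapse the review mentions for `update_inputs` can be sketched as follows (function names follow the review; bodies and signatures are hypothetical placeholders, not FastDeploy source):

```python
# Hypothetical sketch: before this PR, post-processing dispatched on the
# ENABLE_V1_KVCACHE_SCHEDULER flag between a V0 update_inputs and
# update_inputs_v1; with the V0 path removed, the dispatch collapses to a
# direct call. Bodies and signatures here are placeholders.

def update_inputs_v1(state: dict, new_tokens: list) -> dict:
    # Placeholder standing in for the real V1 post-processing update.
    out = dict(state)
    out["tokens"] = out.get("tokens", []) + new_tokens
    return out

def post_process(state: dict, new_tokens: list) -> dict:
    # Previously: if ENABLE_V1_KVCACHE_SCHEDULER: update_inputs_v1(...)
    #             else:                           update_inputs(...)  # V0 path
    # Now the V1 path is the only path, so the conditional disappears.
    return update_inputs_v1(state, new_tokens)

print(post_process({}, [3, 4])["tokens"])  # -> [3, 4]
```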

Comment thread fastdeploy/worker/xpu_model_runner.py
@codecov-commenter

Codecov Report

❌ Patch coverage is 0% with 1 line in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@029c0d6). Learn more about missing BASE report.

Files with missing lines | Patch % | Lines
fastdeploy/worker/input_batch.py | 0.00% | 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #7603   +/-   ##
==========================================
  Coverage           ?   71.73%           
==========================================
  Files              ?      419           
  Lines              ?    57783           
  Branches           ?     9060           
==========================================
  Hits               ?    41448           
  Misses             ?    13512           
  Partials           ?     2823           
Flag | Coverage Δ
GPU | 71.73% <0.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
