Skip to content

support string stop#7587

Open
chang-wenbin wants to merge 1 commit intoPaddlePaddle:developfrom
chang-wenbin:input_stop
Open

support string stop#7587
chang-wenbin wants to merge 1 commit intoPaddlePaddle:developfrom
chang-wenbin:input_stop

Conversation

@chang-wenbin
Copy link
Copy Markdown
Collaborator

Motivation

💡 If this PR is a Cherry Pick, the PR title needs to follow the format by adding the [Cherry-Pick] label at the very beginning and appending the original PR ID at the end. For example, [Cherry-Pick][CI] Add check trigger and logic(#5191)

💡 如若此PR是Cherry Pick,PR标题需遵循格式,在最开始加上[Cherry-Pick]标签,以及最后面加上原PR ID,例如[Cherry-Pick][CI] Add check trigger and logic(#5191)

Modifications

Usage or Command

Accuracy Tests

Checklist

  • Add at least a tag in the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code, run pre-commit before commit.
  • Add unit tests. Please write the reason in this PR if no unit tests.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

@paddle-bot
Copy link
Copy Markdown

paddle-bot Bot commented Apr 23, 2026

Thanks for your contribution!

Copy link
Copy Markdown

@PaddlePaddle-bot PaddlePaddle-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Code Review | 2026-04-23 16:52:49

📋 Review 摘要

PR 概述:新增字符串级别的 stop string 检查功能,支持在生成结束时截断输出文本并控制是否保留 stop string。
变更范围input/base_processor.pyengine/sampling_params.pymodel_executor/stop_string_checker.py
影响面 Tag[DataProcessor] [Engine] [APIServer]


📝 PR 规范检查

PR 标题 support string stop 缺少官方 [Tag],且 Motivation/Modifications 未填写。

标题建议(可直接复制):

  • [Feature] Support string-level stop sequence matching in generation output

描述建议(可直接复制):

## Motivation
当前 FastDeploy 仅支持 token-level 的 stop 检测(stop_token_ids),存在 BPE 边界问题导致 stop string 无法精确匹配。本 PR 新增字符串级别的 stop sequence 检查,并支持通过 `include_stop_str_in_output` 控制是否在输出中保留 stop string。

## Modifications
- `sampling_params.py`:新增 `include_stop_str_in_output` 字段
- `stop_string_checker.py`:新增 `check_stop_strings` 函数及 `StopStringChecker` 类
- `base_processor.py`:在非流式和流式响应处理中集成 stop string 截断逻辑

问题

级别 文件 概述
🔴 Bug base_processor.py:268 include_stop_str_in_output 未从 request 中读取,导致用户设置不生效
🔴 Bug base_processor.py:353 流式模式下截断计算可能产生负数偏移,导致 delta_text 截断结果错误
🟡 建议 stop_string_checker.py:68 StopStringChecker 类为死代码,未在任何地方被使用
🟡 建议 新增 210 行核心逻辑,缺少对应单元测试

🔴 Bug 1:include_stop_str_in_output 未从 request 读取

位置base_processor.pyprocess_response_dict_normalprocess_response_dict_streaming

问题

# PR 新增代码
stop_strs = getattr(request, "stop", None)        # ✅ 从 request 读取
include_stop_str_in_output = kwargs.get("include_stop_str_in_output", False)  # ❌ 只从 kwargs 读取

stop_strsrequest.stop 中读取,但 include_stop_str_in_output 却仅从 kwargs 获取,而调用方(engine.py:787async_llm.py:544)传入的是硬编码的 include_stop_str_in_output=False。这导致:即使用户在请求体中设置了 include_stop_str_in_output=True,该标志对 stop string 截断逻辑不生效,stop string 始终会被从输出中删除。

建议修复

# 优先从 request 中读取,fallback 到 kwargs
if request is not None:
    include_stop_str_in_output = getattr(request, "include_stop_str_in_output", None)
    if include_stop_str_in_output is None:
        include_stop_str_in_output = kwargs.get("include_stop_str_in_output", False)
else:
    include_stop_str_in_output = kwargs.get("include_stop_str_in_output", False)

🔴 Bug 2:流式模式截断偏移可能为负数

位置base_processor.pyprocess_response_dict_streaming 新增代码块

问题

if truncate_to != -1:
    delta_text = delta_text[: truncate_to - len(previous_texts)]
    if delta_text:
        pass  # Keep token_ids as-is, just truncate text
    else:
        delta_text = ""

当 stop string 横跨 previous_textsdelta_text 的边界,且截断点落在 previous_texts 内部时,truncate_to - len(previous_texts)负数。Python 中对字符串做负数切片(如 s[:-3])不会报错,但语义是"从末尾往前数",与预期的"截到某个绝对位置"完全不同,会导致输出文本错误。

另外,此时已经流式发出的 previous_texts 部分无法被客户端回撤,这是流式 stop string 检测的固有局限,应在注释或文档中明确说明。

建议修复

if truncate_to != -1:
    truncate_in_delta = truncate_to - len(previous_texts)
    if truncate_in_delta <= 0:
        # Stop point is in already-sent text; best effort: send empty delta
        delta_text = ""
    else:
        delta_text = delta_text[:truncate_in_delta]

🟡 建议 1:StopStringChecker 类为死代码

位置stop_string_checker.py 第 68–210 行

StopStringChecker 类定义了约 140 行代码,但在本 PR 的所有调用点(base_processor.py)均直接使用了独立函数 check_stop_strings,该类完全没有被引用。建议:

  • 若后续有使用计划,保留并在 docstring 中注明预期用途;
  • 若无使用计划,删除以避免维护成本。

🟡 建议 2:缺少单元测试

新增了 stop_string_checker.py 约 210 行核心逻辑,包含边界检查、增量匹配等复杂逻辑,但 PR 中未包含任何单元测试。建议在 tests/ 下补充:

  • check_stop_strings 的基础功能测试(精确匹配、前缀匹配、多 stop string、空输入)
  • include_in_output=True/False 两种模式的截断位置验证
  • 跨边界检测场景(stop string 横跨 previous_textsdelta_text

总体评价

PR 方向正确,字符串级别的 stop string 检测能有效规避 BPE 边界问题。但存在两处 P0 Bug(include_stop_str_in_output 未从 request 读取、流式截断偏移计算错误),可能导致用户配置不生效或输出文本异常,需修复后再合入。

@CLAassistant
Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


root seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@codecov-commenter
Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 18.51852% with 88 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@5e87930). Learn more about missing BASE report.

Files with missing lines Patch % Lines
fastdeploy/model_executor/stop_string_checker.py 14.28% 60 Missing ⚠️
fastdeploy/input/base_processor.py 24.32% 23 Missing and 5 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #7587   +/-   ##
==========================================
  Coverage           ?   72.40%           
==========================================
  Files              ?      420           
  Lines              ?    57848           
  Branches           ?     9082           
==========================================
  Hits               ?    41885           
  Misses             ?    13128           
  Partials           ?     2835           
Flag Coverage Δ
GPU 72.40% <18.51%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants