
[FDConfig] Enable FD_ENABLE_E2W_TENSOR_CONVERT and FD_ENGINE_TASK_QUEUE_WITH_SHM by default #7572

Open
sunlei1024 wants to merge 1 commit into PaddlePaddle:develop from sunlei1024:feat/default-enable-shm-tensor

@sunlei1024
Collaborator

Motivation

Enable FD_ENABLE_E2W_TENSOR_CONVERT and FD_ENGINE_TASK_QUEUE_WITH_SHM by default.

Modifications

Usage or Command

Accuracy Tests

Checklist

  • Add at least a tag in the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code, run pre-commit before commit.
  • Add unit tests. Please write the reason in this PR if no unit tests.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

@paddle-bot

paddle-bot Bot commented Apr 22, 2026

Thanks for your contribution!

@PaddlePaddle-bot PaddlePaddle-bot left a comment

🤖 AI Code Review | 2026-04-22 22:32:03

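📋 Review Summary

PR overview: Changes the default values of two environment variables, FD_ENABLE_E2W_TENSOR_CONVERT and FD_ENGINE_TASK_QUEUE_WITH_SHM, from 0 (disabled) to 1 (enabled), so that both take effect when not explicitly configured.
Scope of change: fastdeploy/envs.py (environment variable defaults)
Affected tag: [FDConfig]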
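
The default change summarized above can be sketched as follows. This is a minimal illustration, not the actual contents of fastdeploy/envs.py: the two lambda entries are copied from the diff in this PR, while the dict name `environment_variables` and the helper `read_flag` are assumptions introduced here for the example.

```python
import os

# Hypothetical container for the two entries shown in the PR diff.
# Each value is a lambda, so the environment is read at call time,
# not when the module is imported.
environment_variables = {
    "FD_ENABLE_E2W_TENSOR_CONVERT": lambda: int(os.getenv("FD_ENABLE_E2W_TENSOR_CONVERT", "1")),
    "FD_ENGINE_TASK_QUEUE_WITH_SHM": lambda: int(os.getenv("FD_ENGINE_TASK_QUEUE_WITH_SHM", "1")),
}


def read_flag(name: str) -> int:
    """Hypothetical helper: evaluate the lambda registered under `name`."""
    return environment_variables[name]()
```

Under this sketch, a process that never sets FD_ENGINE_TASK_QUEUE_WITH_SHM reads 1 (the new default), while exporting FD_ENGINE_TASK_QUEUE_WITH_SHM=0 before launch restores the previous behavior.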


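📝 PR Convention Check

The PR title is compliant, but the Modifications section of the description is empty and the Motivation is a single sentence with no background or impact statement. Suggested additions (can be copied as-is):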

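## Motivation
Enable FD_ENABLE_E2W_TENSOR_CONVERT and FD_ENGINE_TASK_QUEUE_WITH_SHM by default,
to improve Engine-to-Worker tensor transfer efficiency and the shared-memory communication performance of the engine task queue.

## Modifications
- `fastdeploy/envs.py`: change the default of `FD_ENABLE_E2W_TENSOR_CONVERT` from `0` to `1`
- `fastdeploy/envs.py`: change the default of `FD_ENGINE_TASK_QUEUE_WITH_SHM` from `0` to `1`

**Behavior change notes**
- After upgrading, users who have not explicitly set these environment variables automatically get the new default behavior
- To keep the old behavior, manually set `FD_ENABLE_E2W_TENSOR_CONVERT=0` / `FD_ENGINE_TASK_QUEUE_WITH_SHM=0`
- Container deployments must ensure `/dev/shm` has sufficient capacity (recommended ≥ XXX MB)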

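Issues

Severity | File | Summary
🟡 Suggestion | fastdeploy/envs.py:180 | Enabling SHM by default carries a compatibility risk in container environments; the resource requirements should be documented

Overall Assessment

The change itself is simple and clear: it adjusts only two default values. However, it is a change in default behavior that affects every user without an explicit configuration. The PR description should add the reason for enabling these flags, the performance gain, and notes for container environments, to ease future maintenance and user troubleshooting.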

Comment thread fastdeploy/envs.py
- "FD_ENABLE_E2W_TENSOR_CONVERT": lambda: int(os.getenv("FD_ENABLE_E2W_TENSOR_CONVERT", "0")),
- "FD_ENGINE_TASK_QUEUE_WITH_SHM": lambda: int(os.getenv("FD_ENGINE_TASK_QUEUE_WITH_SHM", "0")),
+ "FD_ENABLE_E2W_TENSOR_CONVERT": lambda: int(os.getenv("FD_ENABLE_E2W_TENSOR_CONVERT", "1")),
+ "FD_ENGINE_TASK_QUEUE_WITH_SHM": lambda: int(os.getenv("FD_ENGINE_TASK_QUEUE_WITH_SHM", "1")),

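🟡 Suggestion: Enabling SHM by default carries a compatibility risk in container environments

With FD_ENGINE_TASK_QUEUE_WITH_SHM enabled by default, the Engine Task Queue uses shared memory (/dev/shm). In a Docker container, /dev/shm usually defaults to only 64MB, so high-concurrency workloads may fail when shared memory runs out (the fix is to enlarge it with the --shm-size option at container startup).

The PR description or a code comment should state:

  • The minimum /dev/shm capacity this feature needs (e.g. ≥ XXX MB)
  • How to fall back in environments that lack SHM support or are resource-constrained (set FD_ENGINE_TASK_QUEUE_WITH_SHM=0)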
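
One way a deployment script might act on both points is sketched below. It is an illustration under stated assumptions, not part of FastDeploy: the helpers `shm_free_bytes` and `choose_shm_flag` are hypothetical, and the required capacity is a caller-supplied parameter because this review leaves the actual minimum unspecified.

```python
import os
import shutil
from typing import Optional


def shm_free_bytes(path: str = "/dev/shm") -> Optional[int]:
    """Return the free bytes on the shared-memory mount, or None if it is absent."""
    if not os.path.isdir(path):
        return None
    return shutil.disk_usage(path).free


def choose_shm_flag(required_bytes: int, path: str = "/dev/shm") -> str:
    """Return "1" when `path` has at least `required_bytes` free, else "0".

    `required_bytes` is chosen by the caller; no official minimum is assumed here.
    """
    free = shm_free_bytes(path)
    if free is not None and free >= required_bytes:
        return "1"
    return "0"
```

A launcher could then call `os.environ.setdefault("FD_ENGINE_TASK_QUEUE_WITH_SHM", choose_shm_flag(required_bytes))` before starting the engine, with `required_bytes` set from the operator's own measurements. Using `setdefault` keeps any value the user exported explicitly.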



2 participants