[CI] replace deprecated pynvml with nvidia-ml-py #7290
ooooo-create wants to merge 7 commits into PaddlePaddle:develop
Thanks for your contribution!
Pull request overview
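This PR replaces the pynvml dependency, which has been marked as deprecated, with its official migration target nvidia-ml-py. It also updates the environment collection script and the related documentation examples so that installation and collect_env output reflect the new distribution name.
Changes:
- Replace the pynvml dependency with nvidia-ml-py in several requirements files
- Update the pip/conda filter patterns in fastdeploy/collect_env.py to match nvidia-ml-py
- Update the sample output entries in the Chinese and English collect-env docs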
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
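| requirements.txt | Main dependency list: pynvml replaced with nvidia-ml-py |
| requirements_metaxgpu.txt | Same replacement for the metaxgpu environment |
| requirements_iluvatar.txt | Same replacement for the iluvatar environment |
| requirements_dcu.txt | Same replacement for the dcu environment |
| fastdeploy/collect_env.py | collect-env pip/conda package match patterns updated to nvidia-ml-py |
| docs/cli/collect-env.md | English docs sample output updated |
| docs/zh/cli/collect-env.md | Chinese docs sample output updated |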
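A replacement spread over several requirements files is easy to leave half done, so a small check can confirm that no file still names the old distribution. The sketch below is a hypothetical helper, not part of FastDeploy; the sample file content is made up for illustration, and the name normalization follows the usual pip convention of treating `-`, `_`, and `.` as equivalent.

```python
import re

# Hypothetical helper (not from the PR): find requirement lines naming a distribution.
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def _normalize(name: str) -> str:
    """Lowercase the name and collapse runs of '-', '_' and '.' into a single '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def find_requirement(text: str, dist: str) -> list[int]:
    """Return 1-based line numbers in requirements text that require `dist`."""
    wanted = _normalize(dist)
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        requirement = line.split("#", 1)[0]  # drop trailing comments
        match = _NAME.match(requirement)
        if match and _normalize(match.group(1)) == wanted:
            hits.append(lineno)
    return hits


# Made-up file content, for illustration only.
sample = "numpy\nnvidia-ml-py\n# pynvml was here\n"
print(find_requirement(sample, "pynvml"))        # []
print(find_requirement(sample, "nvidia-ml-py"))  # [2]
```

Running the same function over each of the four requirements files listed in the table would show whether any of them still requires pynvml.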
The PR description template has been filled in, and an empty commit was pushed to re-trigger the pipeline.
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## develop #7290 +/- ##
==========================================
Coverage ? 71.70%
==========================================
Files ? 419
Lines ? 57857
Branches ? 9077
==========================================
Hits ? 41484
Misses ? 13543
Partials ? 2830
/re-run all-failed
PaddlePaddle-bot left a comment
🤖 AI Code Review | 2026-04-21 19:09:21
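📋 Review summary
PR overview: replaces the deprecated pynvml with nvidia-ml-py in all requirements files and updates the documentation sample output to match.
Scope of changes: CI dependency configuration, docs
Impact tags: CI Docs
Issues
| Severity | File | Summary |
|---|---|---|
| 🟡 Suggestion | fastdeploy/collect_env.py:95,110 | The pip/conda package-name search patterns in collect_env.py were not updated |
🟡 Suggestion: the package-name search patterns in fastdeploy/collect_env.py need the same update
DEFAULT_CONDA_PATTERNS (line 95) and DEFAULT_PIP_PATTERNS (line 110) in fastdeploy/collect_env.py still use "pynvml" as a search pattern: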
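# fastdeploy/collect_env.py (lines 92–111, unmodified)
DEFAULT_CONDA_PATTERNS = {
    ...
    "pynvml",  # ← line 95
}
DEFAULT_PIP_PATTERNS = {
    ...
    "pynvml",  # ← line 110
}
Impact: after switching to nvidia-ml-py, pip list / conda list shows the package as nvidia-ml-py rather than pynvml (although import pynvml still works). The collect-env tool will fail to detect the package, so the nvidia-ml-py version will be missing from the environment diagnostics. The documentation diff already removes the pynvml==12.0.0 sample line, which shows the name change was anticipated, but collect_env.py was not updated to match.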
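Suggested fix:
DEFAULT_CONDA_PATTERNS = {
    ...
    "nvidia-ml-py",  # replaces pynvml
}
DEFAULT_PIP_PATTERNS = {
    ...
    "nvidia-ml-py",  # replaces pynvml
}
Overall assessment
The dependency replacement itself is sound and the change is concise, and the note that import pynvml calls need no modification is clear. The only follow-up needed is the missed package-name search pattern update in fastdeploy/collect_env.py; otherwise the collect-env command will not show the nvidia-ml-py version in new environments.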
@PaddlePaddle-bot The existing nvidia pattern already covers nvidia-ml-py.
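The disagreement above turns on how the patterns are applied. A minimal sketch, assuming collect_env.py keeps a package line whenever any pattern occurs in it as a plain substring (an assumption drawn from this discussion, not verified against the file), shows why an `nvidia` pattern would still pick up the new package while `pynvml` alone would not. The function name and the package versions are made up for illustration.

```python
# Assumption: patterns are plain substring tests against each "name==version" line.
def filter_packages(lines, patterns):
    """Keep only the lines that contain at least one of the patterns."""
    return [line for line in lines if any(p in line for p in patterns)]


# Made-up pip output; the versions are illustrative only.
pip_lines = ["numpy==1.0.0", "nvidia-ml-py==1.0.0", "requests==1.0.0"]

print(filter_packages(pip_lines, {"pynvml"}))  # []
print(filter_packages(pip_lines, {"nvidia"}))  # ['nvidia-ml-py==1.0.0']
```

If the real matching were exact-name rather than substring based, the reviewer's suggested pattern change would be needed after all.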
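CI report generated from the following code (updated every 30 minutes):
1 Task overview
All Required tasks have passed ✅, so the PR can be merged. One optional task failed; it does not block merging and is listed for reference only.
2 Task status summary
2.1 Required tasks: 10/10 passed
2.2 Optional tasks: 26/27 passed
3 Failure details (required only)
No required tasks failed.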
Motivation
https://github.com/gpuopenanalytics/pynvml has been marked as deprecated by the official repository, and now relies on nvidia-ml-py.
Modifications
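- nvidia-ml-py is used via import pynvml, so the import statements need no changes.
- The existing nvidia pattern already matches nvidia-ml-py, so fastdeploy/collect_env.py was not updated. For compatibility, pynvml is kept in the dependency display so that it is possible to tell whether nvidia-ml-py was pulled in by pynvml.
Usage or Command
No new commands added. FastDeploy installation will pull nvidia-ml-py from pip instead of pynvml.
Accuracy Tests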
N/A. This is a dependency replacement affecting the CI/CD environment only; no core model inference behavior changed.
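Because the distribution name (nvidia-ml-py) and the import name (pynvml) differ, it can help to check which distribution is actually installed in a given environment. The following is a minimal sketch using only the standard library; the function name is hypothetical and the check is not part of FastDeploy.

```python
from importlib import metadata


def nvml_distributions():
    """Map each candidate distribution name to its installed version, or None."""
    found = {}
    for dist in ("nvidia-ml-py", "pynvml"):
        try:
            found[dist] = metadata.version(dist)
        except metadata.PackageNotFoundError:
            found[dist] = None  # distribution not installed
    return found


print(nvml_distributions())
```

If both entries have a version, the environment still has the legacy pynvml distribution installed alongside nvidia-ml-py, which is the case the compatibility note in the Modifications list is meant to expose.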
Checklist
pre-commit before commit.
release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.