
[CI] replace deprecated pynvml with nvidia-ml-py #7290

Open
ooooo-create wants to merge 7 commits into PaddlePaddle:develop from ooooo-create:fix-pynvml-deprecated
Conversation

ooooo-create (Contributor) commented Apr 10, 2026

Motivation

The pynvml package at https://github.com/gpuopenanalytics/pynvml has been marked as deprecated by its upstream repository and now relies on nvidia-ml-py.

Modifications

  • Replace pynvml with nvidia-ml-py in all requirements files
  • Update documentation to reflect the new package name

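nvidia-ml-py is used via import pynvml, so the import statements do not need to change.
The existing nvidia pattern already matches nvidia-ml-py, so fastdeploy/collect_env.py was not updated. For compatibility, pynvml is still kept among the displayed dependencies, which makes it possible to tell whether nvidia-ml-py was pulled in by pynvml.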
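
The distinction between the distribution name (nvidia-ml-py) and the import name (pynvml) can be checked with the standard library. The following is a minimal sketch and not part of this PR; the helper name `distributions_providing` is hypothetical, and `importlib.metadata.packages_distributions()` requires Python 3.10 or later.

```python
from importlib import metadata


def distributions_providing(module_name, mapping=None):
    """Return the installed distribution names that ship a top-level module.

    `mapping` can be supplied for testing; by default it is read from the
    current environment (Python 3.10+).
    """
    if mapping is None:
        # Maps top-level import names to the distributions that provide them.
        mapping = metadata.packages_distributions()
    return sorted(set(mapping.get(module_name, [])))
```

Calling `distributions_providing("pynvml")` in an environment installed from the updated requirements would be expected to list nvidia-ml-py, and it returns an empty list when no installed distribution provides the module.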

Usage or Command

No new commands added. FastDeploy installation will pull nvidia-ml-py from pip instead of pynvml.

Accuracy Tests

N/A - this is a dependency replacement for the CI/CD environment only. No core model inference behavior changed.

Checklist

  • Add at least a tag in the PR title.
  • Format your code, run pre-commit before commit.
  • Add unit tests. Please write the reason in this PR if no unit tests.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

Copilot AI review requested due to automatic review settings April 10, 2026 01:13
paddle-bot (Bot) commented Apr 10, 2026

Thanks for your contribution!

Copilot AI (Contributor) left a comment

Pull request overview

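This PR replaces the pynvml dependency, which has been marked as deprecated, with its official migration target nvidia-ml-py. It also updates the environment collection script and the related documentation examples so that installation and collect_env output reflect the new distribution package name.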

Changes:

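  • Replace the pynvml dependency with nvidia-ml-py in several requirements files
  • Update the pip/conda filter patterns in fastdeploy/collect_env.py to match nvidia-ml-py
  • Update the sample output entries in the Chinese and English collect-env documentation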

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 4 comments.

Show a summary per file
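File                         Description
requirements.txt             Main dependency list: pynvml replaced with nvidia-ml-py
requirements_metaxgpu.txt    Same replacement for the metaxgpu environment
requirements_iluvatar.txt    Same replacement for the iluvatar environment
requirements_dcu.txt         Same replacement for the dcu environment
fastdeploy/collect_env.py    collect-env pip/conda package matching patterns updated to nvidia-ml-py
docs/cli/collect-env.md      Sample output updated in the English documentation
docs/zh/cli/collect-env.md   Sample output updated in the Chinese documentation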

Comment thread docs/cli/collect-env.md
Comment thread docs/cli/collect-env.md
Comment thread docs/zh/cli/collect-env.md
Comment thread docs/zh/cli/collect-env.md
Comment thread docs/cli/collect-env.md Outdated
PaddlePaddle-bot

This comment was marked as outdated.

ooooo-create changed the title from "chore: replace deprecated pynvml with nvidia-ml-py" to "[CI] replace deprecated pynvml with nvidia-ml-py" Apr 10, 2026
PaddlePaddle-bot

This comment was marked as outdated.

ooooo-create (Contributor, Author) commented

The PR description template has been filled in, and an empty commit was pushed to re-trigger the pipeline.

PaddlePaddle-bot

This comment was marked as outdated.

codecov-commenter commented Apr 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (develop@dae246e). Learn more about missing BASE report.

Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #7290   +/-   ##
==========================================
  Coverage           ?   71.70%           
==========================================
  Files              ?      419           
  Lines              ?    57857           
  Branches           ?     9077           
==========================================
  Hits               ?    41484           
  Misses             ?    13543           
  Partials           ?     2830           
Flag Coverage Δ
GPU 71.70% <ø> (?)
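
As a quick consistency check on the figures above, the sketch below recomputes the reported percentage. It assumes coverage is computed as hits divided by the sum of hits, misses, and partials; the function name `coverage_percent` is illustrative only.

```python
def coverage_percent(hits, misses, partials):
    """Return (total tracked lines, coverage percentage) from the report counts."""
    total = hits + misses + partials
    return total, 100 * hits / total


# Figures copied from the coverage diff in this report.
total_lines, percent = coverage_percent(hits=41484, misses=13543, partials=2830)
print(total_lines, f"{percent:.2f}%")
```

Under that assumption the three counts add up to the reported Lines value, and the ratio rounds to the reported coverage.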

Flags with carried forward coverage won't be shown. Click here to find out more.


luotao1 added the HappyOpenSource (Happy Open Source campaign issues and PRs) label Apr 14, 2026
Copilot AI review requested due to automatic review settings April 15, 2026 12:59
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 6 out of 7 changed files in this pull request and generated 2 comments.

Comment thread fastdeploy/collect_env.py
Comment thread fastdeploy/collect_env.py
PaddlePaddle-bot

This comment was marked as outdated.

ooooo-create (Contributor, Author) commented

/re-run all-failed

Copilot AI review requested due to automatic review settings April 21, 2026 11:03
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 5 out of 6 changed files in this pull request and generated no new comments.

PaddlePaddle-bot left a comment

🤖 AI Code Review | 2026-04-21 19:09:21

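📋 Review summary

PR overview: replaces the deprecated pynvml with nvidia-ml-py in all requirements files and updates the sample output in the documentation accordingly.
Scope of change: CI dependency configuration, documentation
Impact tags: CI, Docs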


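Issues

Level           File                                 Summary
🟡 Suggestion   fastdeploy/collect_env.py:95,110     The pip/conda package name search patterns in collect_env.py were not updated

🟡 Suggestion: the package name search patterns in fastdeploy/collect_env.py should be updated as well

In fastdeploy/collect_env.py, DEFAULT_CONDA_PATTERNS (line 95) and DEFAULT_PIP_PATTERNS (line 110) still use "pynvml" as a search pattern: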

# fastdeploy/collect_env.py (lines 92–111, unchanged)
DEFAULT_CONDA_PATTERNS = {
    ...
    "pynvml",   # ← line 95
}

DEFAULT_PIP_PATTERNS = {
    ...
    "pynvml",   # ← line 110
}

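Impact: after switching to nvidia-ml-py, the package name shown by pip list / conda list is nvidia-ml-py rather than pynvml (even though import pynvml still works). The collect-env tool will not detect the package, so the environment diagnostics will lack the nvidia-ml-py version. The docs diff already removes the pynvml==12.0.0 sample line, which shows the package name change was anticipated, but collect_env.py was not updated to match.

Suggested fix: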

DEFAULT_CONDA_PATTERNS = {
    ...
    "nvidia-ml-py",   # replaces pynvml
}

DEFAULT_PIP_PATTERNS = {
    ...
    "nvidia-ml-py",   # replaces pynvml
}

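Overall assessment

The dependency replacement itself is the right direction and the change is concise, and the note that import pynvml calls need no modification is clear. The only follow-up needed is the missed package name search pattern update in fastdeploy/collect_env.py; otherwise the collect-env command will not show the nvidia-ml-py version in new environments.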

ooooo-create (Contributor, Author) commented

@PaddlePaddle-bot The nvidia pattern already covers nvidia-ml-py.
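
The reply above relies on the patterns being matched as substrings of each listed package line. The actual matching code in fastdeploy/collect_env.py is not shown in this thread, so the following is only a sketch under that assumption; `filter_packages`, the sample listing, and the version numbers are hypothetical.

```python
def filter_packages(listing, patterns):
    """Keep lines of a `pip list`-style listing that contain any pattern as a substring."""
    return [
        line
        for line in listing.splitlines()
        if any(pattern in line for pattern in patterns)
    ]


# Hypothetical listing for illustration; not taken from a real environment.
SAMPLE_LISTING = "numpy==1.26.4\nnvidia-ml-py==12.570.86\nrequests==2.32.3"

# A broad "nvidia" pattern also selects the nvidia-ml-py line.
matched_by_nvidia = filter_packages(SAMPLE_LISTING, {"nvidia"})
# A "pynvml" pattern selects nothing once only nvidia-ml-py is installed.
matched_by_pynvml = filter_packages(SAMPLE_LISTING, {"pynvml"})
print(matched_by_nvidia, matched_by_pynvml)
```

If matching were done on exact package names instead, the "nvidia" pattern would not select nvidia-ml-py and the bot's suggestion would apply.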

@PaddlePaddle-bot

🤖 Paddle-CI-Agent | ci_status_monitor | 2026-04-27 11:06:30

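CI report generated from the following code (updated every 30 minutes):


1 Task overview

All required tasks have passed ✅, so the PR can be merged. 1 optional task failed; it does not block merging and is listed for reference only.

Total runs (incl. rerun)   Total tasks   ✅ Passed   ❌ Failed   ⏳ Running   ⏸️ Waiting   Skipped
0                          37            36          1           0            0             0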

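2 Task status summary

2.1 Required tasks: 10/10 passed

Required tasks block merging; failures should be handled first.

Status   Task                                 Duration   Rerun   Root cause   Suggested fix   Log
         Remaining 10 required tasks passed   -          -       -            -               -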

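2.2 Optional tasks: 26/27 passed

Optional tasks do not block merging; failures are listed for reference only.

Status   Task                                 Duration   Rerun   Log
         CI_HPU                               1h4m       -       View
         Remaining 26 optional tasks passed   -          -       -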

3 Failure details (required only)

No required tasks failed.

6 participants