[Feature] Support auto nsys capture skill #7621

Open
rain7996 wants to merge 1 commit into PaddlePaddle:develop from rain7996:develop
@rain7996
Summary
Add a skill that automates nsys capture. This PR:

  • Adds a new Claude Code skill (ernie5-nsys-capture) that automates Nsight Systems (nsys) GPU profiling for FastDeploy inference
    services.
  • Gives the skill a 5-step workflow: (1) collect parameters from the user's launch script, (2) automatically inject
    nvprof_start/nvprof_stop into gpu_model_runner.py, (3) generate an nsys-wrapped launch script, (4) ask the user for
    confirmation, and (5) run the capture automatically.
  • Includes helper shell scripts (nsys_capture.sh for service health checks and file management, ernie5_nsys_capture.sh for
    request timing) and a default OpenAI-compatible streaming test client, and supports both coarse and detailed nsys profiling
    modes.
  • Adds a lightweight NVTX range utility (fastdeploy/usage/nvtx.py) that gracefully degrades to a no-op when nvtx is not installed.

@paddle-bot
paddle-bot Bot commented Apr 26, 2026

Thanks for your contribution!

@CLAassistant
CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

songyuxing seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.
