Add GPU testing workflows and add gpu tox configuration #635
AVHopp
left a comment
First round of comments.
Pull Request Overview
This PR adds GPU testing capabilities to the project by introducing a dedicated Tox environment for GPU tests and workflow configurations. The changes include manual workflow triggers for GPU testing, system information outputs for benchmarking, and modernization of the benchmark installation process.
- Adds a new `gputest` Tox environment for GPU-specific testing
- Introduces manual GPU testing workflows that require core tests to pass first
- Modernizes benchmark installation by switching from pip to uv package manager
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tox.ini | Adds gputest environment configuration with GPU availability check |
| .github/workflows/regular.yml | Adds GPU testing job triggered by manual workflow dispatch |
| .github/workflows/gpu_tests.yml | New dedicated workflow for GPU tests with AWS Lambda runner provisioning |
| .github/workflows/ci.yml | Adds GPU testing job to CI workflow with manual trigger |
| .github/workflows/benchmark.yml | Adds system information output and migrates to uv package manager |
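The GPU availability check listed for tox.ini is not shown in this conversation. As a rough sketch of what such a check might look like, assuming it relies on PyTorch (the thread only notes that Torch can already detect the GPU), the hypothetical helper `gpu_available` below reports whether a CUDA device is usable and returns False when PyTorch is not installed. It is an illustration, not the code from this PR.

```python
import importlib.util


def gpu_available() -> bool:
    """Return True if PyTorch is installed and reports a usable CUDA device.

    Hypothetical helper for illustration; the actual check in tox.ini
    may be implemented differently.
    """
    # Avoid an ImportError on machines where PyTorch is not installed.
    if importlib.util.find_spec("torch") is None:
        return False

    import torch

    return torch.cuda.is_available()


if __name__ == "__main__":
    # Exit non-zero so a calling test environment can stop early on CPU-only runners.
    raise SystemExit(0 if gpu_available() else 1)
```

Run as a script, the non-zero exit code on CPU-only machines lets a test environment fail fast before any GPU tests are collected.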
AVHopp
left a comment
LGTM, only thing to do is to remove the single precision env variables, but there is already a comment for that.
AdrianSosic
left a comment
Thanks, @fabianliebig, for taking the initiative here. I've gone through the changes and comments by others and there's no obvious reason to block anything from my side. However, this is one of those typical "Fabian knows this stuff so much better than me" PRs, so I'm more or less trusting you blindly here and will only potentially complain once I actually start using it and encounter issues, i.e. once our GPU code is ready 😄
Thanks for the review and trust :). Please don't hesitate to reach out anytime if you run into problems or even just have questions. I'm happy to help troubleshoot. Working with GPUs can be quite messy, and there are nearly always issues and synergies that weren't foreseeable :D I think if the pipeline code and Tox environment work for you, and since Torch can already detect the GPU, that's a good starting point to explore GPU usage :) Feel free to merge once you think it fits and, as said, ping me if you need me :D
…s into a single job
…dundant info from GPU tests
Update the concurrency group name for the GPU Test Workflow by adding an extension. The calling workflow is in the same concurrency group and blocks this one, which leads to a deadlock and causes the pipeline to skip this call.
Co-authored-by: Martin Fitzner <martin.fitzner@merckgroup.com>
6e656c7 to c5a60c2
This PR adds a dedicated Tox environment for GPU-related tests and one manual workflow, as well as additional steps in the regular and CI workflows for installing and starting those tests. At the moment, they are only executed if triggered manually and if the core tests have passed. Support for other triggers is WIP. Additionally, I've switched the benchmark installation to uv and added terminal output from standard resource commands for CPU, RAM, and so on.