Add ROCm test suite (companion to #4720) by danielhanchen · Pull Request #4824 · unslothai/unsloth

danielhanchen · 2026-04-03T11:36:52Z

Summary

95 Python tests (tests/studio/install/test_rocm_support.py) and 23 shell tests (tests/sh/test_get_torch_index_url.sh) covering AMD ROCm detection, torch index URL selection, hardware flags, prebuilt asset selection, and install pathway logic
All tests use mocks -- no AMD hardware required
Split out from Add AMD ROCm/HIP support across installer and hardware detection #4720 to keep the main implementation PR focused

Test plan

python -m pytest tests/studio/install/test_rocm_support.py -v
bash tests/sh/test_get_torch_index_url.sh

95 Python tests and 23 shell tests covering ROCm detection, torch index URL selection, hardware flags, prebuilt asset selection, and install pathway logic. All tests use mocks -- no AMD hardware required. Companion to #4720 (AMD ROCm/HIP support).

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 53e782f7b9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-03T11:40:36Z

tests/studio/install/test_rocm_support.py

+sys.modules[_STACK_SPEC.name] = stack_mod
+_STACK_SPEC.loader.exec_module(stack_mod)
+
+_detect_rocm_version = stack_mod._detect_rocm_version


Defer binding ROCm symbols until existence is verified

These module-level assignments unconditionally access private ROCm helpers from studio/install_python_stack.py; on the parent revision (8981e6c) those attributes are absent, so pytest aborts during collection with AttributeError before any test can run. This makes the commit non-runnable unless the companion implementation lands first, so the test module should guard missing symbols (for example via getattr/hasattr + pytest.skip) or be merged together with the implementation.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-04-03T11:40:36Z

tests/sh/test_get_torch_index_url.sh

+# 9) ROCm 6.3 (no nvidia-smi) -> rocm6.3
+_dir=$(make_mock_amd_smi "6.3")
+_result=$(run_func "$_dir")
+assert_eq "ROCm 6.3 -> rocm6.3" "https://download.pytorch.org/whl/rocm6.3" "$_result"


Avoid ROCm URL assertions before install.sh supports ROCm

This new expectation (and the related ROCm cases below it) currently fails deterministically against the base install.sh logic, which returns the CPU index whenever nvidia-smi is missing; running bash tests/sh/test_get_torch_index_url.sh yields multiple ROCm failures and exits non-zero. As a result, merging this test-only change ahead of the matching installer implementation will break CI.

Useful? React with 👍 / 👎.

gemini-code-assist

Code Review

This pull request introduces comprehensive testing for AMD ROCm support, including shell script tests for torch index URL resolution and a new Python test suite covering various installation pathways. The feedback identifies a critical issue where the new tests reference a missing amd.py file, which will likely cause CI failures. Additionally, there are recommendations to avoid global state leakage by using proper mocking for sys.modules and to replace broad, silent exception handlers with more descriptive error reporting in test setup blocks.

gemini-code-assist · 2026-04-03T11:43:20Z

tests/studio/install/test_rocm_support.py

+    def test_amd_py_exists(self):
+        """amd.py should exist in the hardware directory."""
+        amd_path = PACKAGE_ROOT / "studio" / "backend" / "utils" / "hardware" / "amd.py"
+        assert amd_path.exists()


This test asserts the existence of amd.py, which is not included in this pull request. While this PR is a companion to #4720, including tests that are guaranteed to fail against the current target branch will break CI. Consider either including the necessary file changes in this PR or marking these tests as expected to fail/skipped until the implementation is merged.

gemini-code-assist · 2026-04-03T11:43:20Z

tests/studio/install/test_rocm_support.py

+        sys.modules["structlog"] = MagicMock()
+        sys.modules["loggers"] = MagicMock()
+        sys.modules["loggers"].get_logger = MagicMock(return_value = MagicMock())
+        sys.modules["utils"] = MagicMock()
+        sys.modules["utils.hardware"] = MagicMock()


Modifying sys.modules directly within a test method causes global state leakage that can affect other tests in the suite or subsequent test files. It is recommended to use unittest.mock.patch.dict to ensure that these modifications are scoped to the test and automatically reverted upon completion. This pattern is repeated in several other tests in this file (e.g., lines 1093, 1131, 1166, 1186).

gemini-code-assist · 2026-04-03T11:43:21Z

tests/studio/install/test_rocm_support.py

+        try:
+            _worker_spec.loader.exec_module(worker_mod)
+        except Exception:
+            pytest.skip("Could not load worker module in test environment")


This broad except Exception block silently skips the test on any failure during module execution. This violates the general rule against broad, silent exception handlers and makes it difficult to distinguish between environment issues and actual bugs in the code under test. Please catch more specific exceptions or at least include the exception details in the skip message.

Suggested change

try:

_worker_spec.loader.exec_module(worker_mod)

except Exception:

pytest.skip("Could not load worker module in test environment")

try:

_worker_spec.loader.exec_module(worker_mod)

except Exception as e:

pytest.skip(f"Could not load worker module in test environment: {e}")

References

Avoid using broad, silent exception handlers like except Exception: pass. Instead, log the exception, even if at a debug level, to aid in future debugging.

Add ROCm test suite for PR #4720

53e782f

95 Python tests and 23 shell tests covering ROCm detection, torch index URL selection, hardware flags, prebuilt asset selection, and install pathway logic. All tests use mocks -- no AMD hardware required. Companion to #4720 (AMD ROCm/HIP support).

danielhanchen requested a review from rolandtannous as a code owner April 3, 2026 11:36

chatgpt-codex-connector bot reviewed Apr 3, 2026

View reviewed changes

gemini-code-assist bot reviewed Apr 3, 2026

View reviewed changes

danielhanchen mentioned this pull request Apr 3, 2026

Add AMD ROCm/HIP support across installer and hardware detection #4720

Open

7 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add ROCm test suite (companion to #4720)#4824

Add ROCm test suite (companion to #4720)#4824
danielhanchen wants to merge 1 commit intomainfrom
feature/rocm-tests

danielhanchen commented Apr 3, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Apr 3, 2026

Uh oh!

chatgpt-codex-connector bot Apr 3, 2026

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

gemini-code-assist bot Apr 3, 2026

Uh oh!

gemini-code-assist bot Apr 3, 2026

Uh oh!

gemini-code-assist bot Apr 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

danielhanchen commented Apr 3, 2026

Summary

Test plan

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Apr 3, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot Apr 3, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Apr 3, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Apr 3, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Apr 3, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant