Skip to content

Fix macOS py3.11 xdist flake in remote platform test#4270

Open
KevinSailema wants to merge 1 commit intoNVIDIA:mainfrom
KevinSailema:fix/issue-4259-xdist-group
Open

Fix macOS py3.11 xdist flake in remote platform test#4270
KevinSailema wants to merge 1 commit intoNVIDIA:mainfrom
KevinSailema:fix/issue-4259-xdist-group

Conversation

@KevinSailema
Copy link
Copy Markdown

Summary

This PR fixes the flaky macOS pytest-xdist failure reported in #4259, where the test suite could crash with worker termination during remote-mqpu execution.

Root Cause

The remote platform test module could be executed concurrently by multiple xdist workers.
Each worker auto-launches local remote-mqpu REST server processes, which can race and destabilize the worker process on macOS (especially Python 3.11).

Change

  • Added a module-level xdist grouping marker in test_remote_platform.py:
    • pytest.mark.xdist_group("remote_mqpu_platform")
  • This forces all tests in that module to run on a single xdist worker.
  • The previously flaky test continues to run under xdist (no skip-based suppression).

Why this is the right fix

  • It addresses the concurrency source of the flake instead of masking it.
  • It preserves coverage of the failing test path.
  • It is minimal and isolated to one test module.

Validation

Validated on macOS with multiple Python versions:

  • Python 3.11:
    • test_multi_qpus_kernel under xdist: 20/20 passes
    • full module under xdist: 10/10 runs, all passing (21 passed each run)
  • Python 3.12:
    • full module under xdist: passing
  • Python 3.13:
    • full module under xdist: passing (existing deprecation warnings only)

Impact

  • Stabilizes CI behavior for the exact failure mode in #4259.
  • No runtime or production behavior changes.
  • Test scheduling change only, scoped to the remote platform test module.

Fixes #4259

@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot bot commented Apr 7, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Signed-off-by: Kevin <kevin@Kevins-MacBook-Pro-3.local>
@mitchdz
Copy link
Copy Markdown
Collaborator

mitchdz commented Apr 7, 2026

Thanks for this Kevin! I'll run the CI workflow now to see how it works.

@mitchdz
Copy link
Copy Markdown
Collaborator

mitchdz commented Apr 7, 2026

/ok to test 836ec6f

Command Bot: Processing...

github-actions bot pushed a commit that referenced this pull request Apr 7, 2026
@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 7, 2026

CUDA Quantum Docs Bot: A preview of the documentation can be found here.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 7, 2026

CUDA Quantum Docs Bot: A preview of the documentation can be found here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[CI] - macos validate failure test_remote_platform.py::test_multi_qpus_kernel

2 participants