Metal: fix CPU readback race under concurrent command submission #3595
Merged
CPU readbacks (MetalStorage::to_cpu, QMetalStorage::{data, dequantize})
encoded a blit into the shared rotating command buffer and then called
wait_until_completed, which commits the current buffer and waits on the
last in-flight one. With concurrent submissions from other threads, the
in-flight list can be taken by another thread's flush_and_wait between
the blit encode and the wait, causing the reader to return before its
blit has executed and to read stale or unwritten destination memory.
Fix: add Commands::flush_and_wait_current, which commits the current
command buffer while holding the state lock and waits on that specific
buffer. Command queues execute buffers in commit order, so completion
of this buffer also covers anything committed before it by other
threads. Completed buffers are drained from the in-flight list to keep
it bounded under readback-heavy workloads; errored buffers are kept for
reporting. The three readback sites now use it.
The new concurrent readback tests fail on every thread within a few
iterations against the previous behavior.
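
The fix described above can be sketched with a simulated queue. This is a minimal illustration under stated assumptions, not the code from this PR: `Buffer`, `State`, `Status`, and `run_readers` are hypothetical stand-ins, and a worker thread that completes buffers in commit order plays the role of the Metal command queue. Only the name `flush_and_wait_current` and its described behavior (commit under the state lock, wait on that specific buffer, drain completed buffers, keep errored ones) come from the description.

```rust
use std::sync::{mpsc, Arc, Condvar, Mutex};
use std::thread;

#[allow(dead_code)] // Error is never produced by this simulated queue.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Status {
    Encoding,
    Committed,
    Completed,
    Error,
}

// Hypothetical stand-in for a Metal command buffer.
struct Buffer {
    status: Mutex<Status>,
    changed: Condvar,
}

impl Buffer {
    fn new() -> Arc<Self> {
        Arc::new(Buffer { status: Mutex::new(Status::Encoding), changed: Condvar::new() })
    }
    fn status(&self) -> Status {
        *self.status.lock().unwrap()
    }
    fn set(&self, s: Status) {
        *self.status.lock().unwrap() = s;
        self.changed.notify_all();
    }
    // Blocks until this specific buffer has completed or errored.
    fn wait_until_completed(&self) -> Status {
        let mut s = self.status.lock().unwrap();
        while !matches!(*s, Status::Completed | Status::Error) {
            s = self.changed.wait(s).unwrap();
        }
        *s
    }
}

// Everything protected by the state lock (assumed layout, for illustration).
struct State {
    current: Arc<Buffer>,
    in_flight: Vec<Arc<Buffer>>,
    queue: mpsc::Sender<Arc<Buffer>>,
}

struct Commands {
    state: Mutex<State>,
}

impl Commands {
    fn new() -> Self {
        let (queue, rx) = mpsc::channel::<Arc<Buffer>>();
        // Simulated GPU queue: buffers complete in the order they were committed.
        thread::spawn(move || {
            for buf in rx {
                buf.set(Status::Completed);
            }
        });
        Commands { state: Mutex::new(State { current: Buffer::new(), in_flight: Vec::new(), queue }) }
    }

    fn flush_and_wait_current(&self) -> Status {
        let mine = {
            let mut st = self.state.lock().unwrap();
            // Rotate and commit while holding the lock, so no other thread
            // can take or replace the buffer that holds our blit.
            let mine = std::mem::replace(&mut st.current, Buffer::new());
            mine.set(Status::Committed);
            st.queue.send(Arc::clone(&mine)).unwrap();
            // Drain completed buffers to keep the list bounded; errored ones stay.
            st.in_flight.retain(|b| b.status() != Status::Completed);
            st.in_flight.push(Arc::clone(&mine));
            mine
        };
        // Wait outside the lock, on the buffer we committed ourselves.
        mine.wait_until_completed()
    }
}

// Runs `threads` concurrent readers; returns whether every wait saw a
// completed buffer, and the final length of the in-flight list.
fn run_readers(threads: usize, iters: usize) -> (bool, usize) {
    let cmds = Arc::new(Commands::new());
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let cmds = Arc::clone(&cmds);
            thread::spawn(move || (0..iters).all(|_| cmds.flush_and_wait_current() == Status::Completed))
        })
        .collect();
    let all_ok = handles.into_iter().map(|h| h.join().unwrap()).fold(true, |a, b| a && b);
    let in_flight = cmds.state.lock().unwrap().in_flight.len();
    (all_ok, in_flight)
}
```

Because each caller keeps its own handle to the buffer it committed, another thread draining or taking the in-flight list cannot cause it to return early. In this model the list never holds more than one uncompleted buffer per thread, so `run_readers(4, 50)` should leave at most four entries.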
ivarflakstad approved these changes on Jun 10, 2026