io_uring: bound injected completions per dispatcher tick#44665
io_uring: bound injected completions per dispatcher tick#44665aburan28 wants to merge 1 commit intoenvoyproxy:mainfrom
Conversation
The injected-completion drain loop in IoUringImpl::forEveryCompletion was unbounded. A steady stream of injected completions, or completion callbacks that inject more completions, could stall the dispatcher thread indefinitely and starve other event sources sharing the same loop. The TODO at io_uring_impl.cc:78 called this out explicitly. Cap the per-tick drain at a configurable bound (default 1024). When the cap is hit with completions still queued, write to the registered eventfd to re-arm the file event so processing resumes on the next dispatcher tick. The eventfd is non-blocking and is fully drained at the top of the method, so the self-poke is safe. Add a unit test that injects more completions than the cap, asserts the drain is split across multiple ticks, and verifies all completions still fire (proving the eventfd re-arm path works without an explicit activate()). Signed-off-by: Adam Buran <aburan28@gmail.com>
|
Hi @aburan28, welcome and thank you for your contribution. We will try to review your Pull Request as quickly as possible. In the meantime, please take a look at the contribution guidelines if you have not done so already. |
ravenblackx
left a comment
There was a problem hiding this comment.
Thanks, I like this improvement. One nitty request about the tests, but I'm marking it approved by me early since it's going to need a review from a senior maintainer anyway.
| // Each tick after the first must have been driven by the eventfd self-poke since we never | ||
| // called activate() again — proving the re-arm path works. | ||
| EXPECT_EQ(7, total_completions); | ||
| EXPECT_GE(event_callback_invocations, 3u); |
There was a problem hiding this comment.
Why isn't this EXPECT_EQ? I thought for a moment it might be because other unrelated events could increment it, but it's only incremented by this specific event, so it seems like it would always be exactly 3.
If there is a reason, please add a comment explaining it; if there is not, please make it EXPECT_EQ.
(Also for consistency either total_completions should be uint32_t or event_callback_invocations should be int32_t, they're both just counting up from zero so it's weird that they're different types.)
|
/assign-from @envoyproxy/senior-maintainers |
|
@envoyproxy/senior-maintainers assignee is @zuercher |
Summary
The injected-completion drain loop in
IoUringImpl::forEveryCompletion(io_uring_impl.cc:78) is unbounded. A steady stream of injected completions — or completion callbacks that themselves inject more completions — can stall the dispatcher thread indefinitely and starve other event sources sharing the same loop. The TODO at that line called this out explicitly.This PR caps the per-tick drain at a configurable bound (default 1024). When the cap is reached with completions still queued, it writes to the registered eventfd to re-arm the file event so processing resumes on the next dispatcher tick. The eventfd is non-blocking and is fully drained at the top of
forEveryCompletion, so the self-poke is safe and idempotent.Risk
IoUringWorkerImpl) are unchanged.Test plan
IoUringImplTest.BoundedInjectedCompletionsPerEvent: injects 7 completions with cap=3 and asserts the drain splits as 3/3/1 across three dispatcher ticks, and that all 7 fire without any explicitactivate()call beyond the first — verifying the eventfd self-poke path.Related
Part of a small series of io_uring stability/observability fixes spun out of an internal review of the io_uring path. See also a write-buffer watermark PR coming next.