gdb/testsuite/gdb.rocm: extract reusable multi-inferior driver helpers #166
spatrang wants to merge 1 commit into
lancesix
left a comment
I have not really thought deeply about it, but the thing which tickles me here is that this helper implicitly relies on properties of the source file (the markers where to insert breakpoints).
If the functions built around those source assumptions are common, I'd expect the source file to be common as well. If we get to a point where we have multiple tests using those helpers, it will get harder to keep the source / tcl bits in sync.
It really feels like the .cpp of multi-inferior-gpu should also be made generic if we go this way.
Done. Rather than add a separate program, I generalized
f0c9558 to 5a0b205
Rebased on the latest
Factor the shared non-stop multi-inferior driver logic into two helper procs and a single driver program, so a follow-up stress test can reuse them instead of duplicating the logic.

Add multi-inferior.exp.tcl with rocm_multi_inferior_run_to_kernels (set up the session, run the parent to the pre-fork breakpoint, resume, and collect one kernel stop per child) and rocm_multi_inferior_drain (continue each child to a clean exit and run the parent to completion). Tests source it via $srcdir/$subdir/multi-inferior.exp.tcl.

Generalize multi-inferior-gpu.cpp into the shared driver program: the child count is taken from argv when given and otherwise defaults to the number of GPU devices found at runtime, and each child re-execs itself through a "child" argv dispatch.

Convert multi-inferior-gpu.exp to source the helpers and use the shared program.
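For readers following the thread, the argv handling described in the commit message might look roughly like the C++ sketch below. It is not the code from multi-inferior-gpu.cpp. The names `Invocation`, `parse_invocation`, `spawn_child` and `detected_gpu_count` are hypothetical, the device query is stubbed out so the sketch builds without ROCm, and the `child <index>` argument layout is an assumption.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical stand-in for a runtime device query (for example
// hipGetDeviceCount); hard-coded so the sketch builds without ROCm.
int detected_gpu_count() { return 1; }

enum class Role { Parent, Child };

struct Invocation {
  Role role;
  int child_count;  // Number of children to spawn (parent only).
  int child_index;  // Which child this process is (child only).
};

// Assumed command-line layout (not confirmed by the PR):
//   prog              -> parent, one child per detected GPU
//   prog N            -> parent, N children
//   prog child INDEX  -> child number INDEX
// No validation is done; a non-numeric N parses as 0.
Invocation parse_invocation(int argc, char **argv) {
  if (argc >= 3 && std::strcmp(argv[1], "child") == 0)
    return {Role::Child, 0, std::atoi(argv[2])};
  if (argc >= 2)
    return {Role::Parent, std::atoi(argv[1]), 0};
  return {Role::Parent, detected_gpu_count(), 0};
}

// Fork once and re-exec this same binary as "<self> child <index>".
// Returns the child's pid in the parent, or -1 if fork failed.
pid_t spawn_child(const char *self, int index) {
  pid_t pid = fork();
  if (pid == 0) {
    std::string idx = std::to_string(index);
    execl(self, self, "child", idx.c_str(), static_cast<char *>(nullptr));
    _exit(127);  // Only reached if execl failed.
  }
  return pid;
}
```

In a driver built this way, `main` would call `parse_invocation`, run the GPU work directly for `Role::Child`, and otherwise loop over `spawn_child(argv[0], i)` before waiting on the children with `waitpid`.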
5a0b205 to 4e2f05a
Updated to address the review and squashed the branch into a single commit (force-pushed). Changes since the last push:
Any reason why this wasn't kept with @czidev-amd?
Since you’ve already reviewed this PR and have the context, I assumed it would be easier for you to continue as the reviewer.
I'm worried that this is being pushed as general reusable infrastructure for testing multi-inferior things when we shouldn't be relying on fork for that. See: "Usage of fork in tests should be restricted to testing fork-specific things."
@palves @lumachad — thanks, that's a fair concern and I'd rather settle the mechanism before building more on top of it. To make sure I take the right direction, which of these do you prefer for the multi-inferior driver?
My read of the thread is that (1) is the minimal portable replacement and (2) gives the most realistic stress coverage; happy to start with (1) only if (2) is too much for a first cut. A couple of things I'd like your steer on:
I'll hold further changes until we agree on the approach.
I still think this should be sourced as a separate generic/template .tcl file instead of things living in lib/*.exp.
The goal is to expand test coverage for multi-inferior debugging. Right now these tests rely on fork, so this PR merely refactors that code. I think the Windows testsuite run (if/when it runs for this case) will still skip these tests due to the use of fork, right? We should document that this is fork-specific/*nix-specific though, if there is a worry someone might misunderstand the restriction.
If we do want to refactor these fork-based tests into something else not using fork, I think that's fair. But then we need to check that against our extra test coverage efforts.
Why this PR
This is preparatory refactoring split out of #131 at reviewer request.
While reviewing #131 (which adds a new multi-inferior stress test), it
was noted that the new test shares most of its driver logic with the
existing `gdb.rocm/multi-inferior-gpu.exp`. Rather than duplicate that
logic, the common parts are extracted here into shared helpers first, so
that #131 can reuse them and its diff reduces to just what is genuinely
new.
The tests are kept separate (only the driver logic is shared).
Dependent PR
depends on this PR and will be rebased on top of it once this merges.
Summary
Extract the shared non-stop multi-inferior driver logic out of
`gdb.rocm/multi-inferior-gpu.exp` into two reusable helper procs in
`gdb/testsuite/lib/rocm.exp`, and convert the existing test to use them.
Helpers added (`lib/rocm.exp`)
- `rocm_multi_inferior_run_to_kernels {args_list expected}`: load the
  program, enable non-stop with `detach-on-fork off` and
  `follow-fork-mode parent`, plant the breakpoints, run the parent to its
  pre-fork breakpoint, resume in the background, and collect one kernel
  breakpoint stop per child inferior. Returns the list of stopped GPU
  thread ids. The child count can be passed explicitly or discovered at
  runtime.
- `rocm_multi_inferior_drain {threads}`: continue each stopped GPU
  inferior to a clean exit, wait for the parent to reach its
  post-`waitpid` breakpoint, and run the parent to completion.
Behavior note
The extracted driver is intentionally stricter than the inlined
original: it deduplicates GPU stops by inferior, uses literal-matched
regexes, and fails loudly on timeout or a non-zero child exit instead of
hanging. Coverage of the converted test is otherwise unchanged.
Files changed
- `gdb/testsuite/lib/rocm.exp`: add the two helpers.
- `gdb/testsuite/gdb.rocm/multi-inferior-gpu.exp`: convert to use them.