gdb/testsuite: support optimized code in gdb.rocm testsuite#103
gdb/testsuite: support optimized code in gdb.rocm testsuite#103spatrang wants to merge 2 commits into
Conversation
|
Hi @lancesix This PR is a re-submission of the optimized-code series that was originally
The corresponding fix-up commit is on top of this branch. Please take cc: @lumachad |
|
pre-commit seems to be grumpy. |
e05abcd to
12346d2
Compare
|
As stated elsewhere, the series only handles GDB_TESTCASE_OPTIONS. If someone passes CFLAGS directly to the compiler the optimization level checks won't work properly. Do we want to fix that as well? Maybe as a follow up if not in this one? |
|
There's a pre-commit failure. Other than that, I'm still seeing FAIL's for gdb.rocm/watchpoint-basic.exp when running the testsuite with -O1 or -O3. It passes when running with -O0. Could you please check that you're running the testsuite correctly? |
7c21341 to
a912e0e
Compare
|
Did the latest force-pushes provide anything that would fix gdb.rocm/watchpoint-basic.exp? I can't tell. |
7320b08 to
c729dec
Compare
|
For -lto runs, it looks like gdb.rocm/hip-builtin-variables.exp and gdb.rocm/hip-lang-detect.exp are still failing. |
Both hip-builtin-variables.exp and hip-lang-detect.exp now require no_lto. LTO can re-inline across TUs (defeating noinline/optnone boundaries) and rewrite the per-CU DW_LANG attribute, so the test premises don't hold at -flto. Matches the existing require no_lto pattern in watchpoint-at-end-of-shader.exp and mi-aspace.exp. |
c729dec to
65a4a86
Compare
65a4a86 to
24d7933
Compare
24d7933 to
d415a28
Compare
|
Things look clean from -O1 through -O3. I do see a FAIL with -flto though: FAIL: gdb.rocm/hip-builtin-completions.exp: language detection (dwarfdump execution) Since we claim to be clean on -flto runs as well, we should fix that or skip it. |
d415a28 to
ea1067d
Compare
|
Confirmed a clean run for -flto as well. Thanks. |
ea1067d to
42d183e
Compare
There was a problem hiding this comment.
Since we're doing a new pass. Please cleanup the following:
Inconsistent spacing around attribute across the PR.
- gdb/testsuite/gdb.rocm/hip-builtin-variables.cpp:37,45,52,71 — attribute((optnone))
- gdb/testsuite/gdb.rocm/lane-info.cpp:38,49,53 — attribute((noinline)), attribute((noinline, disable_tail_calls))
- gdb/testsuite/gdb.rocm/line-breakpoint-in-kernel.cpp:34 — attribute((optnone))
- gdb/testsuite/gdb.rocm/multi-inferior-fork.cpp:39 — attribute((optnone))
- gdb/testsuite/gdb.rocm/multi-inferior-gpu.cpp:35 — attribute((optnone))
- gdb/testsuite/gdb.rocm/names.cpp:127 — attribute((optnone))
- gdb/testsuite/gdb.rocm/ocp_mx.cpp — five occurrences of attribute((optnone))
- gdb/testsuite/gdb.rocm/scheduler-locking.cpp:36 — attribute((optnone))
- gdb/testsuite/gdb.rocm/static-global.cpp:40 — attribute((optnone))
- gdb/testsuite/gdb.rocm/step-schedlock-spurious-waves.cpp:27 — attribute((optnone))
- gdb/testsuite/gdb.rocm/unaligned-memory-access.cpp:52,57 — attribute((noinline)), attribute((optnone))
- gdb/testsuite/gdb.rocm/watch-gpu-global-from-host.cpp:27 — attribute((optnone))
- gdb/testsuite/gdb.rocm/finish.cpp:225 — new attribute((optnone)) (still missing space, only on the kernel declaration)
- Fix: normalise to attribute ((...)) everywhere added by the patch.
Typos / Comments
- Spelling: inconsistent British/American mix introduced by the PR. New comments use optimiser, optimisation, initialisation, optimised in some files (alu-exceptions.cpp, fork-exec-.cpp, precise-memory-.c, deref-scoped-pointer.exp, hip-builtin-variables.exp,
watchpoint-basic.exp) and optimization/optimized in others (the new lib/rocm.exp predicates, static-global.cpp, several require no_optimized_code skips). Existing gdb tree leans American (optimize); pick one for the whole PR. - gdb/testsuite/gdb.rocm/static-global.cpp:37–39 — the new comment explains why optnone is used but doesn't explain the unrelated change on the same line from static to non-static. The companion file deref-scoped-pointer.cpp:35–37 does call out "Non-static so a line
breakpoint…". Add the same rationale here for consistency.
Bugs
- gdb/testsuite/gdb.rocm/watch-gpu-global-from-host.cpp:34 — silently downgrades a C++ cast to a C-style cast:
- int *devGlobal;
- if (hipGetSymbolAddress (reinterpret_cast<void **> (&devGlobal), global))
+ volatile int *devGlobal;
+ if (hipGetSymbolAddress ((void **) &devGlobal, global))
- The cast change is presumably because reinterpret_cast<void**> won't strip the new volatile qualifier — but the C-style cast hides exactly that fact. This is the kind of cast that the GNU/ROCgdb style explicitly discourages in C++ code. Better: keep an explicit const_cast/reinterpret_cast pair so the volatile strip is visible, or store devGlobal as non-volatile and only mark the access path volatile.
- gdb/testsuite/gdb.rocm/lane-info.cpp:51–55 — the trailing asm volatile ("" : : : "memory"); in bar() is unreachable because sleep_forever() is an endless loop (volatile bool keep_going = true; while (keep_going) …). The author's stated rationale ("volatile asm prevents the call from being constant-folded away at the call site") refers to foo(), where it makes sense. In bar() the disable_tail_calls attribute is the load-bearing piece; the asm is dead. Either drop the asm in bar() and rely on disable_tail_calls, or annotate sleep_forever as attribute ((noreturn)) and remove the dead asm.
- gdb/testsuite/lib/rocm.exp check_for_any_flags (lines ~165–200) — uses lsearch -exact $src $flag, which relies on Tcl's string→list coercion of $src. If a future caller puts a flag with {} or unbalanced quotes into GDB_TESTCASE_OPTIONS/CFLAGS_FOR_TARGET, the implicit list conversion throws. Defensive form: if {[lsearch -exact [split $src] $flag] != -1} — split with no separators yields whitespace-tokenised words and avoids the auto-list coercion. Same logic, harder to break later.
83529fb to
9157ed9
Compare
|
Sent offline comments. |
lumachad
left a comment
There was a problem hiding this comment.
Just one nit this time:
- gdb/testsuite/gdb.rocm/until-tests.exp:76-79 — the test was: until lands on exactly breakpoint_loc+1. After this patch it accepts +1 or +2 unconditionally, even at -O0. That weakens the assertion for the non-optimized path it used to cover. Gate the loose pattern on
[check_for_any_flags {-O1 -O2 -O3}] and keep the strict +1 check for the unoptimized build.
I'd give Lancelot a chance to go through it again.
Thanks! Pushed 86c2e23: gated the (+1|+2) alternation on check_for_any_flags {-O1 -O2 -O3}, restored the strict +1 check at -O0. until-tests.exp 28/28 PASS at -O0 and -O2. |
lancesix
left a comment
There was a problem hiding this comment.
Note that this time I only looked at the overall diff, but having individual commits is still important, I'll check this on next round.
c0fae00 to
3aba2e8
Compare
aktemur
left a comment
There was a problem hiding this comment.
I'm having some difficulty following the changes introduced by patches. Some patches should've been fixups -- they modify previous changes introduced by patches in the same PR. Can we please split, re-order, re-organize the patches so that the changes are focused?
lancesix
left a comment
There was a problem hiding this comment.
Hi,
I did not really re-read evertything, but it seems you push commits to this branch to fix previous commits of the branch still under review (gdb/testsuite/gdb.rocm: drop redundant test decorations for example).
We do not squash commits in GDB, but rather craft series and refine patches in the series as needed. The bits you change in that last patch have to be integrated into previous patches.
3aba2e8 to
41a9d39
Compare
|
Commit refactor (force-pushed). Squashed the series into two single-purpose commits:
Folded in this review round's feedback:
The numerous "this hunk should've been a fixup of the originating patch" comments are addressed by the squash itself -- the intra-file churn from the old multi-commit history is collapsed into these two logical commits. Verified One open question: should the |
Introduce the necessary helper functions into the gdb.rocm
framework so that support for optimized code can be implemented.
- check_for_any_flags: detects whether GDB_TESTCASE_OPTIONS,
CFLAGS_FOR_TARGET or CXXFLAGS_FOR_TARGET contains any of the
given flags. Designed for whitespace-free flags such as
-O2, -flto.
- gdb_caching_proc no_optimized_code: predicate for -O1/-O2/-O3.
- gdb_caching_proc no_lto: predicate for -flto.
Co-Authored-By: Andrey Kasaurov <andrey.kasaurov@amd.com>
Co-Authored-By: Luis Machado <luis.machado@amd.com>
Change-Id: I750d64e842413429e281bb8f7d85c4f46319a627
Signed-off-by: Sarang Patrange <spatrang@amd.com>
Using the framework helpers added in the previous commit, adjust the
gdb.rocm tests so they behave correctly when the test programs are
built with optimization (-O1/-O2/-O3, -flto):
- Adapt test sources and expectations so the tests pass under
optimization (optnone/noinline on the relevant functions,
relaxed regex and line-number expectations, etc.).
- Gate the tests whose semantics cannot be reconciled with
optimized code behind "require no_optimized_code" / "require
no_lto" so they are cleanly reported as unsupported.
Co-Authored-By: Andrey Kasaurov <andrey.kasaurov@amd.com>
Co-Authored-By: Luis Machado <luis.machado@amd.com>
Change-Id: I750d64e842413429e281bb8f7d85c4f46319a627
Signed-off-by: Sarang Patrange <spatrang@amd.com>
|
Rebased onto the latest The only conflicts were in the Re-tested the full
The lower pass count at -O1+ is the expected effect of the |
41a9d39 to
d7d84ec
Compare
Summary
This series makes the
gdb.rocmtestsuite pass when the test programs arecompiled with optimization (
-O1/-O2/-O3,-flto) passed viaGDB_TESTCASE_OPTIONS, rather than the implicit-O0the suite assumes today.Three commits, split by concern:
Update
gdb.rocmframework to support optimized code — adds helpers ingdb/testsuite/lib/rocm.exp:check_for_any_flags {flags}— detects whetherGDB_TESTCASE_OPTIONScontains any of the listed flags.
gdb_caching_proc no_optimized_code— predicate for-O1/-O2/-O3.gdb_caching_proc no_lto— predicate for-flto.Update some
gdb.rocmtests to support optimized code (22 files) —two patterns applied surgically:
.cpp/.c: addvolatileon locals the test inspects, and__attribute__((optnone))on helper functions the test breakpoints into.Add a trivial
volatile intstatement to empty "break here" functionsso their body is not elided.
.exp: relax line-number arithmetic, backtrace PC formatting,expected source strings after
step, and add a bounded retry loop forextra ticks under optimization.
Skip unsupported tests in case of optimized code (19 files) — for tests
whose semantics cannot be reconciled with optimized code, add
require no_optimized_codeat the top;mi-aspace.expalso getsrequire no_lto.register-watchpoint.expsoft-skips the tail viacheck_for_any_flagsafter the earlier portion has run.Diffstat: 42 files changed, +146 / -46.
Context
This continues the work started in internal PR
AMD-ROCm-Internal/ROCgdb#6,
rebased onto the current
amd-stagingof the public repo. The commits arecontent-identical to the internal PR, preserving the original authors
(Aleksandar Rikalo, Andrey Kasaurov) with their
Co-Authored-By:trailersand adding my
Signed-off-by.