uv/pixi CLI Version
0.70.2
Mojo Version
1.0.0b2.dev2026060906 (26da54e2)
GPU Information
NVIDIA GeForce RTX 5070 Ti (12.0)
Operating System
Arch Linux
Related Puzzle
Puzzle 9: GPU Debugging Workflow
What happened?
- Run the 'Detective Work Puzzles' (three methods):
a. pixi run -e nvidia p09 --first-case doesn't crash (it just outputs garbage): result: HostBuffer([-8.103731e+22, 216419.02, 148412.05, 10.0])
b. The JIT method fails with a ptx error (see below): pixi run -e nvidia mojo debug --cuda-gdb --break-on-launch problems/p09/p09.mojo --first-case (running with run/r in cuda-gdb)
c. Building with debug info fails with the same ptx error (see below): pixi run -e nvidia mojo build -O0 -g problems/p09/p09.mojo -o problems/p09/p09_debug
- For methods b & c, the error output:
/home/tgale/dev/mojo-gpu-puzzles/problems/p09/p09.mojo:1:1: error: ptxas application ptx input, line 96; error : Output parameter cannot be an incomplete array.
ptxas application ptx input, line 125; error : Output parameter cannot be an incomplete array.
ptxas application ptx input, line 600; error : Call has wrong number of parameters
ptxas application ptx input, line 649; error : Call has wrong number of parameters
ptxas fatal : Ptx assembly aborted due to errors
# ===----------------------------------------------------------------------=== #
^
/home/tgale/dev/mojo-gpu-puzzles/.pixi/envs/nvidia/bin/mojo: error: failed to run the pass manager
- I would expect JIT/build with debug symbols to compile so that we can perform the debugging/detective exercises/steps outlined in the three tutorial cases.
Note: I was able to build with the default optimization level (pixi run -e nvidia mojo build -g problems/p09/p09.mojo -o problems/p09/p09_debug), but that meant I was debugging with symbols optimized away, which I appreciate was not the intention of the tutorial.
E.g.
pixi run -e nvidia mojo build -g problems/p09/p09.mojo -o problems/p09/p09_debug
pixi run -e nvidia cuda-gdb problems/p09/p09_debug
set cuda break_on_launch all
run --first-case
Able to see the initialized input memory (the intention of the tutorial)
(cuda-gdb) set cuda break_on_launch all
(cuda-gdb) run --first-case
Starting program: /home/tgale/dev/mojo-gpu-puzzles/problems/p09/p09_debug --first-case
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
[New Thread 0x7ffff31ff6c0 (LWP 25043)]
[New Thread 0x7ffff21fe6c0 (LWP 25044)]
[New Thread 0x7ffff11fd6c0 (LWP 25045)]
[New Thread 0x7ffff01fc6c0 (LWP 25046)]
[New Thread 0x7fffef1fb6c0 (LWP 25047)]
[New Thread 0x7fffee1fa6c0 (LWP 25048)]
[New Thread 0x7fffed1f96c0 (LWP 25049)]
[New Thread 0x7fffec1f86c0 (LWP 25050)]
First Case: Try to identify what's wrong without looking at the code!
[New Thread 0x7fffe47ff6c0 (LWP 25052)]
[New Thread 0x7fffe29ff6c0 (LWP 25053)]
[Detaching after fork from child process 25054]
[New Thread 0x7fffdbfff6c0 (LWP 25065)]
[New Thread 0x7fffdaffe6c0 (LWP 25066)]
[New Thread 0x7ffdbbfff6c0 (LWP 25067)]
[Switching focus to CUDA kernel 0, grid 1, block (0,0,0), thread (0,0,0), device 0, sm 0, warp 0, lane 0]
CUDA thread hit application kernel entry function breakpoint, p09_add_10_UnsafePointer_Bo6A6AoA6A6AcB6A6A_0f8da7b69379d471<<<(1,1,1),(4,1,1)>>> (output=0x7ffa84000000, a=0x7fffffff93b0) at /home/tgale/dev/mojo-gpu-puzzles/problems/p09/p09.mojo:31
31 var i = thread_idx.x
(cuda-gdb) print a[0]@4
$1 = {{2.25695233e-39}, {0}, {1.40129846e-45}, {0}}
Steps to Reproduce
- Puzzle 9: GPU Debugging Workflow
- Run, JIT: pixi run -e nvidia mojo debug --cuda-gdb --break-on-launch problems/p09/p09.mojo --first-case
- Or attempt to build with debug: pixi run -e nvidia mojo build -O0 -g problems/p09/p09.mojo -o problems/p09/p09_debug
- Observe in output:
/home/tgale/dev/mojo-gpu-puzzles/problems/p09/p09.mojo:1:1: error: ptxas application ptx input, line 96; error : Output parameter cannot be an incomplete array.
ptxas application ptx input, line 125; error : Output parameter cannot be an incomplete array.
ptxas application ptx input, line 600; error : Call has wrong number of parameters
ptxas application ptx input, line 649; error : Call has wrong number of parameters
ptxas fatal : Ptx assembly aborted due to errors
Relevant Log Output
Additional Context
I assume this issue is related to modularbot bumping the Mojo version in pixi.lock to the latest nightly, with no regression tests covering Puzzle 9? (https://github.com/modular/mojo-gpu-puzzles/commits?author=modularbot)
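For what it's worth, here is a rough sketch of the kind of regression check I have in mind. It is not taken from the repository: the function names (parse_ptxas_errors, check_debug_build) are hypothetical, and the regular expression is only an assumption based on the error lines quoted above. The build command is the -O0 -g one from this report.

```python
import re
import subprocess

# Build command copied from this report; -O0 -g is the combination that fails.
BUILD_CMD = [
    "pixi", "run", "-e", "nvidia",
    "mojo", "build", "-O0", "-g",
    "problems/p09/p09.mojo", "-o", "problems/p09/p09_debug",
]

# Assumed shape of a ptxas error line, based only on the log in this report, e.g.
# "ptxas application ptx input, line 96; error : Output parameter cannot be ..."
PTXAS_ERROR = re.compile(r"ptxas application ptx input, line (\d+); error\s*: (.+)")


def parse_ptxas_errors(log: str) -> list[tuple[int, str]]:
    """Return (ptx_line, message) pairs for every ptxas error found in the log."""
    return [(int(m.group(1)), m.group(2).strip()) for m in PTXAS_ERROR.finditer(log)]


def check_debug_build() -> list[tuple[int, str]]:
    """Run the debug build and return any ptxas errors it reports (hypothetical helper)."""
    proc = subprocess.run(BUILD_CMD, capture_output=True, text=True)
    # The compiler may write diagnostics to either stream, so scan both.
    return parse_ptxas_errors(proc.stdout + proc.stderr)
```

Calling check_debug_build() from the repository root in CI and failing the job when the returned list is non-empty would flag this regression whenever the lock file is bumped.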
Code of Conduct