[bench] wvSplitK skinny GEMM: capture timed iters into a CUDA graph #928

Draft
mgehre-amd wants to merge 4 commits into gfx11 from matthias.bench-skinny-gemm-cudagraph

[bench] wvSplitK: per-kernel timing inside captured CUDA graph

c52ff9e