Collect kernel artifacts and append-mode autotune telemetry with run_id by IshanAryendu · Pull Request #2737 · pytorch/helion

IshanAryendu · 2026-06-10T00:18:28Z

Collect kernel artifacts from real autotuning runs in CI

Turn Helion's autotune telemetry into a reliable, joinable dataset that can be collected from CI and used to build a cost-model / kernel-artifact corpus. This builds on the existing kernel_id / sample_id / per-config CSV and identity sidecar.

Changes:

Convert the autotune-log sink to append mode so that all kernels and input shapes from a CI job are captured in a single file instead of overwriting one another.
Change the sidecar from a single truncating .meta.json to an appended .meta.jsonl that collects one record per autotune run.
Add run_id, a content-derived foreign key that uniquely identifies one autotune invocation. It is stamped on every CSV row and meta record so per-config measurements can be attributed to the exact shape/dtype/hardware they were measured on.
Record the decorator string (@helion.kernel(config=…)) as a structured CSV column.
Wire the benchmark CI workflow to emit these artifacts per kernel and upload them with the existing benchmark artifact.
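
As an illustration of how a consumer might use the collected files, here is a minimal sketch that joins per-config CSV rows to sidecar records on run_id. Only run_id and kernel_id come from this PR; the other column names (config, time_ms, shape, dtype) and the in-memory sample data are assumptions for illustration, not the actual Helion schema.

```python
import csv
import io
import json

# Hypothetical sample data standing in for <base>.csv and <base>.meta.jsonl.
CSV_TEXT = """run_id,kernel_id,config,time_ms
r1,k1,block=64,0.42
r1,k1,block=128,0.37
r2,k1,block=64,1.10
"""
META_JSONL = """{"run_id": "r1", "kernel_id": "k1", "shape": [1024, 1024], "dtype": "float16"}
{"run_id": "r2", "kernel_id": "k1", "shape": [4096, 4096], "dtype": "float16"}
"""


def join_rows_to_meta(csv_text: str, meta_jsonl: str) -> list[dict]:
    """Attach the matching meta record to every CSV row (many-to-one on run_id)."""
    meta_by_run = {}
    for line in meta_jsonl.splitlines():
        if line.strip():
            record = json.loads(line)
            meta_by_run[record["run_id"]] = record
    joined = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Rows without a matching meta record keep only their CSV fields.
        meta = meta_by_run.get(row["run_id"], {})
        joined.append({**meta, **row})
    return joined


rows = join_rows_to_meta(CSV_TEXT, META_JSONL)
```

Both sample runs share kernel_id k1 but differ in run_id, so each measurement is attributed to the shape it was taken on rather than to the kernel as a whole.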


IshanAryendu · 2026-06-10T18:44:43Z

@claude

claude · 2026-06-10T18:45:28Z

Claude finished @IshanAryendu's task in 3m 47s

Code review

Read the diff vs origin/main
Review changed files for bugs / correctness / style
Post review feedback

Reviewed the full diff (helion/autotuner/{metrics,logger,base_search,benchmark_provider}.py, the benchmark workflow, docs, and tests). Overall this is a clean, well-documented change: the run_id join key, append-mode sidecars, and the decorator artifact column all hang together, and the test coverage (test_run_id_is_derived_and_distinguishes_shapes, test_sink_appends_across_runs_at_same_base_path) is good — it exercises both the derive-vs-pinned run_id paths and the cross-run accumulation case. No correctness bugs that would break the CI collection path.

A few things worth considering:

1. CSV append mode is fragile to schema drift across versions

AutotuneLogSink.open() (helion/autotuner/logger.py:323) writes the header only when the file is new/empty:

write_header = not self.csv_path.exists() or self.csv_path.stat().st_size == 0
self._csv_file = self.csv_path.open("a", ...)

If a .csv written by an older Helion (the pre-PR 9-column schema, no run_id/decorator) already exists at the base path, the new 11-column rows get appended under the stale header with no indication, silently misaligning every column. In CI this is safe (fresh TEST_REPORTS_DIR each run), but for a persistent HELION_AUTOTUNE_LOG it produces a corrupt file. Consider validating the existing header against the expected columns when not writing one, and either erroring or rotating on mismatch.
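
One way the suggested header check could look is sketched below. This is not the code in logger.py: the function name check_or_rotate, the .stale suffix, and the shortened column list are all hypothetical, chosen only to show the validate-then-rotate idea.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical column list; the real 11-column schema lives in the autotuner.
EXPECTED_COLUMNS = ["run_id", "kernel_id", "config", "time_ms"]


def check_or_rotate(csv_path: Path, expected: list[str]) -> bool:
    """Return True if a header must be written before appending.

    An existing file whose header differs from `expected` is renamed to
    <name>.stale so new rows never land under a mismatched header.
    """
    if not csv_path.exists() or csv_path.stat().st_size == 0:
        return True
    with csv_path.open(newline="") as f:
        existing = next(csv.reader(f), [])
    if existing == expected:
        return False
    csv_path.rename(csv_path.with_name(csv_path.name + ".stale"))
    return True


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "autotune.csv"
    # Simulate a file left behind by an older schema without run_id.
    path.write_text("kernel_id,config,time_ms\nk1,block=64,0.42\n")
    needs_header = check_or_rotate(path, EXPECTED_COLUMNS)
    rotated = (Path(tmp) / "autotune.csv.stale").exists()
```

Rotating keeps the old data readable under its own header; raising an error instead would be the stricter alternative mentioned above.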

2. Concurrent writers to a shared base path can interleave / double-header

Append mode assumes a single writer. If two processes autotune into the same HELION_AUTOTUNE_LOG, the write_header check is a TOCTOU race (both can see an empty file and both write a header), and CSV/JSONL line writes from different processes can interleave. The benchmark workflow sidesteps this with a per-kernel path (autotune-$kernel) in a sequential loop, so it's fine as used — but it's an implicit constraint of the design. A one-line note in the autotune_log docs that the base path must not be shared by concurrent processes would save a future debugging session.
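
If sharing a base path ever needed to be supported rather than just documented as unsupported, one option is an advisory file lock around the header check and the append. The sketch below assumes a POSIX system (fcntl is unavailable on Windows) and uses a hypothetical helper, append_rows_locked; it is not part of the PR.

```python
import csv
import fcntl  # POSIX-only
import os
import tempfile
from pathlib import Path


def append_rows_locked(csv_path: Path, header: list[str], rows: list[list[str]]) -> None:
    """Append rows under an exclusive advisory lock, writing the header once."""
    with csv_path.open("a", newline="") as f:
        fcntl.flock(f.fileno(), fcntl.LOCK_EX)
        try:
            # The size is checked while holding the lock, so only one
            # cooperating writer can observe an empty file.
            if os.fstat(f.fileno()).st_size == 0:
                csv.writer(f).writerow(header)
            csv.writer(f).writerows(rows)
            f.flush()  # push buffered data to the OS before unlocking
        finally:
            fcntl.flock(f.fileno(), fcntl.LOCK_UN)


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "autotune.csv"
    append_rows_locked(path, ["run_id", "time_ms"], [["r1", "0.42"]])
    append_rows_locked(path, ["run_id", "time_ms"], [["r2", "1.10"]])
    lines = path.read_text().splitlines()
```

The lock is advisory, so it only helps if every writer goes through the same helper; the per-kernel paths used by the benchmark workflow avoid the problem without any locking.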

3. Stale comment: join key is `run_id`, not `kernel_id`

The comment in open() (helion/autotuner/logger.py:317-318) says:

# ... CSV rows join back to
# these records via kernel_id.

But the rest of the PR (and record() at logger.py:382-384) correctly describes the precise join as run_id (many-to-one), with kernel_id being the coarser cross-shape grouping key. Worth aligning this comment to avoid contradicting the new model.

4. Minor: `run_id` for an empty/unidentified kernel is a fixed nonzero hash

With default/empty KernelMetadata, __post_init__ still computes run_id = sha256("\x00\x00\x00") — a constant non-empty value, while kernel_id stays "". Through the sink that's masked (run_id is only emitted when metadata is present), so it's harmless, but it means "unidentified" runs all collapse to one shared run_id rather than an empty one. If that matters for downstream dedup, you may want to short-circuit to "" when kernel_id is empty. test_default_metadata_has_empty_identity currently doesn't assert on run_id, so this behavior is untested either way.
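
A minimal sketch of the short-circuit follows. KernelMetadataSketch is a hypothetical stand-in: the field names shapes, dtypes, and hardware, and the choice of joining four fields with "\x00", are assumptions made to reproduce the sha256("\x00\x00\x00") behavior described above, not the real KernelMetadata definition.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class KernelMetadataSketch:
    """Hypothetical stand-in for KernelMetadata; field names are assumptions."""

    kernel_id: str = ""
    shapes: str = ""
    dtypes: str = ""
    hardware: str = ""
    run_id: str = field(default="", init=False)

    def __post_init__(self) -> None:
        if not self.kernel_id:
            # Unidentified kernel: keep run_id empty instead of hashing
            # the bare separators into one shared constant.
            self.run_id = ""
            return
        payload = "\x00".join(
            [self.kernel_id, self.shapes, self.dtypes, self.hardware]
        )
        self.run_id = hashlib.sha256(payload.encode()).hexdigest()


empty = KernelMetadataSketch()
a = KernelMetadataSketch(kernel_id="k1", shapes="1024x1024")
b = KernelMetadataSketch(kernel_id="k1", shapes="4096x4096")
```

An assertion along the lines of `empty.run_id == ""` in test_default_metadata_has_empty_identity would pin whichever behavior is chosen.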

None of these block the PR for its CI use case (#1/#2 are guarded by the per-kernel fresh-dir setup); #3 is a trivial doc fix and #1 is the most worth hardening if autotune_log is ever pointed at a long-lived path.

meta-cla Bot added the CLA Signed label Jun 10, 2026

IshanAryendu added 2 commits June 10, 2026 10:32

Collect kernel artifacts: append-mode autotune telemetry with run_id join key

1295959

telemetry: collect decorator string as a structured CSV column

Reuse the decorator already computed for sample_id and record it per row, completing the kernel-artifact set (source, input shapes, decorator).

44334ef

IshanAryendu force-pushed the collect-kernel-artifacts branch from 6db0d5c to 44334ef Compare June 10, 2026 17:46

IshanAryendu requested review from angelayi and karthickai June 10, 2026 20:04

IshanAryendu mentioned this pull request Jun 11, 2026

Collect kernel artifacts: device-IR node-link dump (.ir.jsonl) #2750

Draft

IshanAryendu added 2 commits June 10, 2026 18:54

Merge branch 'pytorch:main' into collect-kernel-artifacts

bc91d52

Merge branch 'pytorch:main' into collect-kernel-artifacts

4c83a72

IshanAryendu changed the title ~~Collect kernel artifacts~~ Collect kernel artifacts and append-mode autotune telemetry with run_id Jun 11, 2026

IshanAryendu marked this pull request as draft June 11, 2026 23:39

IshanAryendu wants to merge 4 commits into pytorch:main from IshanAryendu:collect-kernel-artifacts
