Skip to content

[AutoDiff] Autodiff 7: Fix header size in AdStackAllocaStmt to match u64 runtime layout#534

Open
duburcqa wants to merge 1 commit intoduburcqa/fix_ad_correctnessfrom
duburcqa/split_llvm_adstack_header_size
Open

[AutoDiff] Autodiff 7: Fix header size in AdStackAllocaStmt to match u64 runtime layout#534
duburcqa wants to merge 1 commit intoduburcqa/fix_ad_correctnessfrom
duburcqa/split_llvm_adstack_header_size

Conversation

@duburcqa
Copy link
Copy Markdown
Contributor

@duburcqa duburcqa commented Apr 21, 2026

Fix header size in AdStackAllocaStmt to match u64 runtime layout

One-line mechanical fix. AdStackAllocaStmt::size_in_bytes() used sizeof(int32) for the header slot, but the runtime lays out the header as a u64. The LLVM alloca was 4 bytes smaller than the runtime expected, so the header write could clobber the first primal slot.

TL;DR

 std::size_t size_in_bytes() const {
-  return sizeof(int32) + entry_size_in_bytes() * max_size;
+  // Header is a `u64` (see `stack_init`/`stack_push`/`stack_top_primal` in runtime.cpp), so use
+  // `sizeof(int64)` - not `sizeof(int32)` - to size the LLVM alloca matching the runtime layout.
+  return sizeof(int64) + entry_size_in_bytes() * max_size;
 }

Why

The runtime code reads and writes the header as a u64:

void stack_init(Ptr stack) {
  *(u64 *)stack = 0;  // <-- 8 bytes
}
void stack_push(LLVMRuntime *runtime, Ptr stack, size_t max_num_elements, std::size_t element_size) {
  u64 &n = *(u64 *)stack;
  // ...
}
Ptr stack_top_primal(Ptr stack, std::size_t element_size) {
  auto n = *(u64 *)stack;
  return stack + sizeof(u64) + idx * 2 * element_size;  // <-- 8-byte offset
}

codegen_llvm.cpp's visit(AdStackAllocaStmt) creates an alloca of stmt->size_in_bytes() bytes at 8-byte alignment. With the old sizeof(int32) in the size computation, the alloca was 4 + max_size * entry_bytes bytes, but the runtime's stack_init writes 8 bytes — overwriting the start of what should be the first primal slot, or adjacent stack memory if max_size == 0. At 8-byte alignment the allocator may round up to the next 8-byte boundary, which happens to cover for small max_size; at larger max_size it does not, and subsequent push/pop indexing then treats the primal area as 4 bytes offset from where it actually starts.

The regression is extremely narrow: every push / pop / load-top site in the runtime agrees with the sizeof(u64) header width; only the alloca-sizing helper was stale. Fixing it is a 1-line change plus a comment.

Why no dedicated test

A dedicated Python test for this specific bug would need to observe the 4-byte mis-sizing, which only manifests as silent corruption of neighboring stack memory or mis-indexed primal reads — both are deterministic in principle but highly sensitive to max_size, alignment, and the allocator's rounding behaviour at the LLVM frame-layout level. The downstream tests in this stack (every adstack grad test in Autodiff 1 / Autodiff 6 / Autodiff 8, and the more intensive ones in Autodiff 10+) all exercise the alloca correctness indirectly: a mis-sized header would mis-index every primal slot and the gradient comparison against PyTorch / the analytical expected value would fail.

Stack

Autodiff 7 of 13. First commit of the "LLVM adstack safety" triplet split. Based on #491 (regression tests). Followed by #535 (runtime overflow).

@duburcqa duburcqa changed the title [AutoDiff] Fix header size in AdStackAllocaStmt to match u64 runtime layout [AutoDiff] Autodiff 7: Fix header size in AdStackAllocaStmt to match u64 runtime layout Apr 21, 2026
Copy link
Copy Markdown

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — correct one-line fix aligning the alloca size with the u64 runtime header layout.

Extended reasoning...

Overview

Single-line change in AdStackAllocaStmt::size_in_bytes() in quadrants/ir/statements.h. The fix corrects the header slot size from sizeof(int32) (4 bytes) to sizeof(int64) (8 bytes) to match the u64 layout used by stack_init/stack_push/stack_top_primal in the runtime.

Security risks

None. This is a memory-layout correctness fix inside the autodiff stack allocator. It makes allocations larger, not smaller, so it eliminates a potential out-of-bounds write rather than introducing one.

Level of scrutiny

Low. The change is mechanical and isolated: one sizeof operand corrected to match a documented runtime invariant. The PR description confirms all other adstack consumers already use sizeof(u64), so this brings the last inconsistent site into alignment. No logic, algorithm, or API surface changes.

Other factors

No existing reviewer comments, no outstanding concerns, and the bug hunting system found no issues. The added comment clearly documents the rationale so future readers understand the constraint.

@duburcqa duburcqa force-pushed the duburcqa/fix_ad_correctness branch from 930d6d9 to a2a23ff Compare April 21, 2026 06:59
@duburcqa duburcqa force-pushed the duburcqa/split_llvm_adstack_header_size branch from 0627f69 to 1caca8f Compare April 21, 2026 06:59
@duburcqa duburcqa force-pushed the duburcqa/fix_ad_correctness branch from a2a23ff to 462383b Compare April 21, 2026 07:19
@duburcqa duburcqa force-pushed the duburcqa/split_llvm_adstack_header_size branch from 1caca8f to 4c5ee88 Compare April 21, 2026 07:19
@duburcqa duburcqa force-pushed the duburcqa/fix_ad_correctness branch from 462383b to 53a99bd Compare April 21, 2026 08:18
@duburcqa duburcqa force-pushed the duburcqa/split_llvm_adstack_header_size branch from 4c5ee88 to 11d2006 Compare April 21, 2026 08:18
@duburcqa duburcqa force-pushed the duburcqa/fix_ad_correctness branch from 53a99bd to 8a4c10b Compare April 21, 2026 09:50
@duburcqa duburcqa force-pushed the duburcqa/split_llvm_adstack_header_size branch from 11d2006 to 903e8e6 Compare April 21, 2026 09:50
@duburcqa duburcqa force-pushed the duburcqa/fix_ad_correctness branch from 8a4c10b to f6c9fe2 Compare April 21, 2026 12:02
@duburcqa duburcqa force-pushed the duburcqa/split_llvm_adstack_header_size branch from 903e8e6 to a222483 Compare April 21, 2026 12:03
@duburcqa duburcqa force-pushed the duburcqa/fix_ad_correctness branch from f6c9fe2 to b502ac8 Compare April 21, 2026 14:42
@duburcqa duburcqa force-pushed the duburcqa/split_llvm_adstack_header_size branch from a222483 to c4665c5 Compare April 21, 2026 14:42
@hughperkins
Copy link
Copy Markdown
Collaborator

Checklist:

  • just a bug fix => doesn't change existing API or usage
    • no need for doc changes
  • doesn't add significantly more code around diverse files

=> ok to merge

@duburcqa duburcqa force-pushed the duburcqa/fix_ad_correctness branch from b502ac8 to ed07103 Compare April 21, 2026 19:05
@duburcqa duburcqa force-pushed the duburcqa/split_llvm_adstack_header_size branch from c4665c5 to 8dc6a50 Compare April 21, 2026 19:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants