Skip to content

fix(gguf): bound tensor size against file before allocating#3588

Open
pjdurden wants to merge 2 commits into
huggingface:mainfrom
pjdurden:fix/gguf-tensorinfo-alloc-overflow
Open

fix(gguf): bound tensor size against file before allocating#3588
pjdurden wants to merge 2 commits into
huggingface:mainfrom
pjdurden:fix/gguf-tensorinfo-alloc-overflow

Conversation

@pjdurden

@pjdurden pjdurden commented Jun 7, 2026

Copy link
Copy Markdown

What

Harden the GGUF loader's lazy tensor-data path (TensorInfo::read) against
integer overflow and unbounded allocation driven by a crafted file.

This is the follow-up to #3533 that #3556 didn't reach. #3556 capped the
allocations in Content::read (string/array lengths, tensor/metadata counts,
nesting depth) and added the remaining_bytes helper, but the tensor-data
allocation in TensorInfo::read was left unguarded — exactly the gap called
out in the #3533 discussion.

The bug

Tensor dimensions are read straight from the file and bounded only in count
(GGUF_MAX_TENSOR_DIMS = 4), never in value. TensorInfo::read then did:

let tensor_elems = self.shape.elem_count();          // wrapping product of dims
let size_in_bytes = tensor_elems / block_size * self.ggml_dtype.type_size();
let mut raw_data = vec![0u8; size_in_bytes];          // unguarded allocation

Two problems on untrusted input:

  1. Overflow. elem_count() multiplies the dims with wrapping arithmetic,
    and * type_size() can overflow too. A crafted file (e.g. two 1<<33 dims,
    or a single 1<<62 dim) wraps size_in_bytes to a bogus value in release
    builds and panics in debug builds.
  2. Unbounded allocation. size_in_bytes was never gated against the bytes
    physically left in the file, so a large declared size drives a huge
    allocation before read_exact ever runs — the same DoS class Cap allocations in the GGUF loader #3556 fixed
    for strings/arrays.

The fix

  • Compute the element and byte counts with checked arithmetic, returning a
    clean error instead of wrapping/panicking.
  • Reject the declared size when it exceeds remaining_bytes, mirroring the
    checks already in read_string / Value::read, so the allocation can't
    exceed what's in the file.

The public signature of TensorInfo::read is unchanged.

Tests

Adds two regression tests to the existing GGUF hardening suite
(candle-core/tests/gguf_tests.rs):

  • rejects_tensor_with_elem_count_overflow — dims whose product overflows
    (panics on main in debug; clean Err after the fix).
  • rejects_tensor_size_above_remaining_file_bytes — a declared size larger
    than the file is rejected before allocating.

Verified RED before the fix, GREEN after. Full gguf_tests (11) and
quantized_tests (37) pass; cargo fmt --check and cargo clippy --tests
are clean.

Copilot AI review requested due to automatic review settings June 7, 2026 00:22

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds stronger validation when reading GGUF tensor payloads to prevent overflows and oversized allocations from crafted/untrusted files.

Changes:

  • Compute GGUF tensor element/byte sizes with checked arithmetic and return clean errors on overflow.
  • Reject tensors whose declared payload size exceeds remaining file bytes before allocating/reading.
  • Add regression tests covering element-count overflow and oversized tensor payload declarations.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
candle-core/src/quantized/gguf_file.rs Adds checked size computation and remaining-bytes gating before allocation/reads.
candle-core/tests/gguf_tests.rs Adds regression tests and helpers to exercise new overflow/size checks.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread candle-core/src/quantized/gguf_file.rs Outdated
Comment thread candle-core/src/quantized/gguf_file.rs Outdated
Comment thread candle-core/src/quantized/gguf_file.rs
pjdurden added a commit to pjdurden/candle that referenced this pull request Jun 7, 2026
Address review feedback on huggingface#3588:
- `data_start`: `checked_add` instead of `saturating_add` so a malformed
  tensor offset reports an explicit "offset overflows" error rather than
  clamping to u64::MAX and surfacing a misleading "exceeds remaining file
  bytes" message.
- `size_in_bytes`: convert u64 -> usize with `usize::try_from` and a clean
  error, instead of `as` which would truncate on 32-bit targets (declared
  size > usize::MAX) and under-allocate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@pjdurden

Copy link
Copy Markdown
Author

@ivarflakstad gentle nudge on this one when you have bandwidth. Small untrusted input hardening fix with two regression tests, Copilot comments already addressed.

pjdurden and others added 2 commits June 12, 2026 05:59
The GGUF loader's allocation caps from huggingface#3556 covered `Content::read`
(string/array lengths, tensor/metadata counts, nesting depth) but not the
lazy tensor-data path in `TensorInfo::read`, as noted on huggingface#3533.

Tensor dimensions are read from the file and bounded only in *count*
(`GGUF_MAX_TENSOR_DIMS`), not in *value*. `TensorInfo::read` then did:

    let tensor_elems = self.shape.elem_count();          // wrapping product of dims
    let size_in_bytes = tensor_elems / block_size * type_size;
    let mut raw_data = vec![0u8; size_in_bytes];          // unguarded allocation

A crafted file can make `elem_count()` (or the `* type_size`) overflow
`usize` — wrapping to a bogus size in release, panicking in debug — and the
allocation was never gated against the remaining file size, so a large
declared size drives a huge allocation before `read_exact` ever runs.

Compute the element and byte counts with checked arithmetic and reject the
declared size when it exceeds the bytes physically left in the file, mirroring
the `remaining_bytes` checks already used by `read_string`/`Value::read`. The
public signature of `TensorInfo::read` is unchanged.

Adds regression tests for both the overflow and oversized-vs-file cases to the
existing GGUF hardening suite.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Address review feedback on huggingface#3588:
- `data_start`: `checked_add` instead of `saturating_add` so a malformed
  tensor offset reports an explicit "offset overflows" error rather than
  clamping to u64::MAX and surfacing a misleading "exceeds remaining file
  bytes" message.
- `size_in_bytes`: convert u64 -> usize with `usize::try_from` and a clean
  error, instead of `as` which would truncate on 32-bit targets (declared
  size > usize::MAX) and under-allocate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@pjdurden pjdurden force-pushed the fix/gguf-tensorinfo-alloc-overflow branch from e8ce1c9 to f084e16 Compare June 12, 2026 00:36
@pjdurden

pjdurden commented Jun 12, 2026

Copy link
Copy Markdown
Author

Rebased. this now only adds the checked element-count/byte-size/offset arithmetic that the merged size check still does with wrapping usize math. tests are now working. cc @ivarflakstad @EricLBuehler

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants