fix(transformers): adds support for non-default head_dim in llama by suniastar · Pull Request #3593 · huggingface/candle

suniastar · 2026-06-09T11:18:50Z

Adds support for non-standard/non-fallback head_dim values.

head_dim (int, optional) — The attention head dimension. If None, it will default to hidden_size // num_attention_heads
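The fallback rule quoted above can be sketched as follows. This is a minimal illustration, not the code from this PR: `LlamaConfigSketch` and `resolved_head_dim` are hypothetical names, and the struct is a simplified stand-in rather than candle's actual config type.

```rust
/// Hypothetical, simplified stand-in for a Llama config (not candle's struct).
pub struct LlamaConfigSketch {
    pub hidden_size: usize,
    pub num_attention_heads: usize,
    /// Optional explicit head dimension, mirroring the `head_dim` config field.
    pub head_dim: Option<usize>,
}

impl LlamaConfigSketch {
    /// Returns the explicit `head_dim` if one is set, otherwise the
    /// documented default `hidden_size / num_attention_heads`.
    pub fn resolved_head_dim(&self) -> usize {
        self.head_dim
            .unwrap_or_else(|| self.hidden_size / self.num_attention_heads)
    }
}
```

With illustrative values `hidden_size = 5120` and `num_attention_heads = 32`, the fallback yields 160, while an explicit `head_dim: Some(128)` overrides it. A model whose real head dimension is 128 would therefore get wrongly shaped attention projections if only the fallback were used.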

This PR adds support for this optional value as well as its GGUF metadata counterparts

llama.attention.key_length
llama.attention.value_length

for the quantized llama model.
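For the quantized path, the lookup might look roughly like the sketch below. It is an assumption-laden illustration, not the PR's implementation: the metadata is modeled as a plain `HashMap<String, u32>` instead of candle's GGUF types, `attention_head_dims` is a hypothetical helper, and the fallback keys `llama.embedding_length` and `llama.attention.head_count` are standard GGUF keys that this PR description does not mention.

```rust
use std::collections::HashMap;

/// Hypothetical helper: returns `(key_length, value_length)` for the attention
/// heads. Each value is taken from its GGUF key when present and otherwise
/// falls back to `embedding_length / head_count`.
pub fn attention_head_dims(metadata: &HashMap<String, u32>) -> Option<(usize, usize)> {
    // Assumed fallback inputs; both must be present for the lookup to succeed.
    let embedding_length = *metadata.get("llama.embedding_length")? as usize;
    let head_count = *metadata.get("llama.attention.head_count")? as usize;
    if head_count == 0 {
        return None; // avoid dividing by zero on malformed metadata
    }
    let fallback = embedding_length / head_count;

    // Optional overrides named in the PR description.
    let lookup = |key: &str| metadata.get(key).map(|v| *v as usize).unwrap_or(fallback);
    let key_length = lookup("llama.attention.key_length");
    let value_length = lookup("llama.attention.value_length");
    Some((key_length, value_length))
}
```

Keeping the key and value lengths separate mirrors the two metadata keys; a model with equal key and value dimensions simply gets the same number twice.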

As my goal was to get MN Violet Lotus running with candle, I was only able to test the quantized implementation due to a lack of (V)RAM.

adds support for llamas head_dim config

3d371fc

suniastar force-pushed the feat/llama-non-standard-head-dim branch from f42d676 to 3d371fc Compare June 11, 2026 09:40

suniastar changed the title ~~fix(llama): adds support for non-default head_dim~~ fix(transformers): adds support for non-default head_dim in llama Jun 11, 2026

suniastar closed this Jun 11, 2026

suniastar deleted the feat/llama-non-standard-head-dim branch June 11, 2026 10:43

suniastar mentioned this pull request Jun 11, 2026

fix(transformers): adds support for llamas head_dim config #3602

Open

suniastar wants to merge 1 commit into huggingface:main from suniastar:feat/llama-non-standard-head-dim
