fix(transformers): adds support for non-default head_dim in llama #3593

Closed
suniastar wants to merge 1 commit into huggingface:main from suniastar:feat/llama-non-standard-head-dim
Conversation

@suniastar

Adds support for head_dim values that differ from the default fallback.

On Hugging Face, head_dim is defined as:

head_dim (int, optional) — The attention head dimension. If None, it will default to hidden_size // num_attention_heads

This PR adds support for an explicitly set head_dim, as well as its GGUF metadata counterparts

llama.attention.key_length
llama.attention.value_length

for the quantized llama model.
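
The fallback rule and the metadata lookup described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `resolve_head_dim` and `gguf_head_dims` are hypothetical helpers, the GGUF metadata is modelled as a plain `HashMap` rather than candle's own GGUF types, and the numbers in `main` are illustrative only.

```rust
use std::collections::HashMap;

/// Hypothetical helper: use the explicit `head_dim` when the config provides one,
/// otherwise fall back to `hidden_size / num_attention_heads`.
fn resolve_head_dim(
    head_dim: Option<usize>,
    hidden_size: usize,
    num_attention_heads: usize,
) -> usize {
    head_dim.unwrap_or(hidden_size / num_attention_heads)
}

/// Hypothetical helper: read the key and value head sizes from GGUF-style metadata.
/// The metadata is modelled here as a simple map (an assumption for this sketch).
/// Missing keys fall back to `embedding_length / head_count`.
fn gguf_head_dims(
    metadata: &HashMap<String, usize>,
    embedding_length: usize,
    head_count: usize,
) -> (usize, usize) {
    let fallback = embedding_length / head_count;
    let key_dim = metadata
        .get("llama.attention.key_length")
        .copied()
        .unwrap_or(fallback);
    let value_dim = metadata
        .get("llama.attention.value_length")
        .copied()
        .unwrap_or(fallback);
    (key_dim, value_dim)
}

fn main() {
    // Illustrative values: the fallback gives 5120 / 32 = 160,
    // while an explicit head_dim of 128 overrides it.
    println!("fallback: {}", resolve_head_dim(None, 5120, 32));
    println!("explicit: {}", resolve_head_dim(Some(128), 5120, 32));

    let mut metadata = HashMap::new();
    metadata.insert("llama.attention.key_length".to_string(), 128);
    metadata.insert("llama.attention.value_length".to_string(), 128);
    println!("gguf: {:?}", gguf_head_dims(&metadata, 5120, 32));
}
```

The point of the sketch is that a model whose head_dim is not equal to hidden_size divided by the number of heads would get wrongly shaped attention projections if only the fallback were used.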

My goal was to get MN Violet Lotus running with candle. Due to a lack of (V)RAM, I was only able to test the quantized implementation.

@suniastar force-pushed the feat/llama-non-standard-head-dim branch from f42d676 to 3d371fc on June 11, 2026 09:40
@suniastar changed the title from "fix(llama): adds support for non-default head_dim" to "fix(transformers): adds support for non-default head_dim in llama" on Jun 11, 2026
@suniastar closed this on Jun 11, 2026
@suniastar deleted the feat/llama-non-standard-head-dim branch on June 11, 2026 10:43