fix(gemma4): cast RoPE offset to int before mx.arange() by eauchs · Pull Request #4901 · unslothai/unsloth

eauchs · 2026-04-07T17:08:57Z

Problem

mx.arange() receives an mlx.core.array for offset instead of a native Python int, causing a TypeError at inference time with Gemma 4 models.

Fix

Cast offset to int before passing to mx.arange().

Tested on

M3 Max 128GB — unsloth/gemma-4-31b-it-UD-MLX-4bit

gemini-code-assist

Code Review

This pull request modifies the position index generation in the Gemma4 text model by casting the offset to an integer. A review comment identifies that using int(offset) in MLX is inefficient because it triggers a CPU-GPU synchronization point and breaks compatibility with mx.compile. A suggestion was provided to use a zero-based range added to the offset to maintain performance and compilation support.

gemini-code-assist · 2026-04-07T17:10:43Z

unsloth/models/gemma4_text.py

        # x shape: (B, n_heads, L, head_dim)
        seq_len = x.shape[-2]
-        positions = mx.arange(offset, offset + seq_len, dtype = mx.float32)
+        positions = mx.arange(int(offset), int(offset) + seq_len, dtype = mx.float32)


Using int(offset) is discouraged in MLX because it forces a synchronization point between the GPU and CPU to retrieve the value, which can significantly degrade performance during inference. Furthermore, if this code is executed within an mx.compile block, int(offset) will fail if offset is a tracer array.

A more efficient and compilation-friendly approach is to generate a zero-based range and then add the offset. This avoids the TypeError with mx.arange while supporting both integer and array-based offsets without performance penalties.

Suggested change

-        positions = mx.arange(int(offset), int(offset) + seq_len, dtype = mx.float32)
+        positions = mx.arange(seq_len, dtype = mx.float32) + offset
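Editorial sketch: the two variants discussed in this review can be compared side by side. The function names positions_cast and positions_shifted are hypothetical and do not appear in the PR. The NumPy fallback is an assumption, added only so the snippet runs on machines without MLX. It shows that both variants produce the same position values; it does not measure the synchronization cost or exercise mx.compile.

```python
# Prefer MLX, the library the PR targets. The NumPy fallback is an assumption
# made only so this sketch runs where MLX is not installed.
try:
    import mlx.core as xp
except ImportError:
    import numpy as xp


def positions_cast(seq_len, offset):
    """First commit: convert the offset to a Python int, then build the range."""
    # On MLX, int() on an array has to evaluate it and copy the value to the host.
    start = int(offset)
    return xp.arange(start, start + seq_len, dtype=xp.float32)


def positions_shifted(seq_len, offset):
    """Second commit: build a zero-based range, then add the offset."""
    # The offset may be a Python int or a scalar array; no host-side cast is needed.
    return xp.arange(seq_len, dtype=xp.float32) + offset


if __name__ == "__main__":
    # Try both a plain int and a scalar array as the offset.
    for offset in (5, xp.array(5)):
        print(positions_cast(4, offset).tolist(),
              positions_shifted(4, offset).tolist())
```

Both functions accept either kind of offset here; the difference the reviewer points to is that positions_shifted never converts the offset to a Python value, so it stays usable when the offset is only known as an array.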

Good catch — updated the fix to use mx.arange(seq_len) + offset to avoid the CPU-GPU sync point and maintain mx.compile compatibility

eauchs · 2026-04-07T17:15:36Z

@danielhanchen this fixes a TypeError crashing Gemma 4 inference for all users on the current fix/ui-fix branch — would appreciate a quick review 🙏

fix(gemma4): cast RoPE offset to int before mx.arange()

2e4e46f

gemini-code-assist bot reviewed Apr 7, 2026

fix(gemma4): use zero-based arange + offset to avoid CPU-GPU sync

76fa1bf

fix(gemma4): cast RoPE offset to int before mx.arange() #4901
eauchs wants to merge 2 commits into unslothai:fix/ui-fix from eauchs:fix/gemma4-rope-offset-int-cast
