OOM when loading z-lab/gemma-4-26B-A4B-it-PARO

I described it in detail at https://github.com/jundot/omlx/issues/1757
I'm not sure where the bug(s) lie, but I think there are several, and when they combine they are quite deadly :)

1) memory explosion upon load without release on failure
2) VLM->LLM fallback causing even more memory use
3) subsequent OOM when it doesn't fit
4) successful release when it does fit
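
To make points 1 and 2 concrete, here is a minimal sketch of the release-before-fallback pattern that would avoid stacking the two loads. It is not omlx's actual code: `load_vlm`, `load_llm` and `release` are hypothetical placeholders for whatever the real loaders and cache-clearing hook are.

```python
import gc
from typing import Any, Callable, Optional


def load_with_fallback(
    load_vlm: Callable[[], Any],
    load_llm: Callable[[], Any],
    release: Optional[Callable[[], None]] = None,
) -> Any:
    """Try the VLM loader first; free its leftovers before trying the LLM loader.

    All three callables are hypothetical stand-ins, not real omlx functions.
    """
    try:
        return load_vlm()
    except Exception:
        # Leave the except block before cleaning up: while the exception is
        # alive, its traceback can still reference partially loaded weights.
        pass

    # Collect whatever the failed attempt left behind.
    gc.collect()
    if release is not None:
        # Assumed hook for returning cached or device memory to the system.
        release()

    # Only now start the second, fallback load.
    return load_llm()
```

The point of the ordering is that the fallback load starts only after the failed attempt has been released, so peak memory is closer to one model load rather than two.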

I don't know whether vision is supposed to work in this quant, but it does work with gemma-4-26B-A4B-it-oQ8.



OOM when loading z-lab/gemma-4-26B-A4B-it-PARO #51
