OOM when loading z-lab/gemma-4-26B-A4B-it-PARO #51

Description

@zviratko

I tried describing it in detail at jundot/omlx#1757.
I'm not sure where the bug(s) lie, but I think there are several, and when they combine they are quite deadly :)

  1. memory use explodes during load and is not released when the load fails
  2. the VLM->LLM fallback then uses even more memory
  3. a subsequent OOM when the model doesn't fit
  4. a successful release when it does fit
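
To make the sequence above concrete, here is a minimal sketch of the behaviour I would expect instead: release whatever a failed load attempt allocated before the fallback loader runs. Everything in it is hypothetical. `load_with_fallback`, the loader callables, and the `release` hook are illustrative names and not the actual omlx or mlx API, and the default `gc.collect` only stands in for whatever cache-clearing step the framework really needs.

```python
import gc
from typing import Callable, Optional, Sequence


def load_with_fallback(
    loaders: Sequence[Callable[[], object]],
    release: Callable[[], None] = gc.collect,
) -> object:
    """Try each loader in order (e.g. VLM first, then LLM).

    Hypothetical helper: after a failed attempt, call `release` before the
    next loader runs, so two partially loaded copies never coexist.
    """
    last_error: Optional[str] = None
    for loader in loaders:
        try:
            return loader()
        except (MemoryError, ValueError, RuntimeError) as exc:
            # Keep only the text. Holding the exception object would keep its
            # traceback, and with it the frames that reference the weights.
            last_error = repr(exc)
        # The except block has ended, so `exc` is gone and the failed
        # attempt's allocations can be reclaimed before the next attempt.
        release()
    raise RuntimeError(f"all loaders failed; last error: {last_error}")
```

With the VLM path as the first loader and the LLM path as the second, the fallback would start from a clean slate rather than on top of the failed attempt's memory.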

I don't know whether vision is supposed to work in this quant, but it does work with gemma-4-26B-A4B-it-oQ8.
