Support glob patterns in OnnxKQuantQuantization nodes_to_exclude #2518

Merged: justinchuby merged 2 commits into main from justinchu/kquant-glob-exclude on Jun 18, 2026
@justinchuby justinchuby commented Jun 12, 2026

Motivation

OnnxKQuantQuantization.nodes_to_exclude only matched exact node names. For models that are split into multiple ONNX components (e.g. multimodal decoder / vision / audio), keeping a specific layer out of INT4 required hardcoding build-specific node names such as vision_encoder/projector/MatMul_node_38 and audio_encoder/projector/MatMul_node_3. Those numeric suffixes are assigned at graph-build time and shift across builds and architectures, so the exclusion list is brittle.

Concrete case

For an encoder-free multimodal model (Gemma 4 gemma4_unified), the vision and audio "encoders" are a single projection MatMul each — i.e. the entire image/audio pathway. Measured INT4-vs-FP16 error on those projectors is non-trivial (rel-L2 ≈ 3.7% vision, ≈ 9.2% audio), while the components are tiny (76 MB / 1.4 MB), so keeping them in higher precision costs almost nothing. Doing that cleanly needs a robust way to target */projector/*.

Change

Allow each nodes_to_exclude entry to be a Unix shell-style glob pattern (matched with fnmatch.fnmatchcase) in addition to an exact node name. A node is excluded if its name equals or matches any entry, so existing exact-name configs are unaffected.

{
  "type": "OnnxKQuantQuantization",
  "bits": 4,
  "nodes_to_exclude": ["*/projector/*"]
}
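
As a rough illustration of the matching rule described above (not the code in kquant_quantization.py), the sketch below checks each node name against every entry, first by equality and then with fnmatch.fnmatchcase. The helper name is_excluded and the decoder node name are hypothetical; the two projector names are the ones quoted in the motivation.

```python
from collections.abc import Iterable
from fnmatch import fnmatchcase


def is_excluded(node_name: str, nodes_to_exclude: Iterable[str]) -> bool:
    """Hypothetical helper: True if node_name equals an entry or matches it as a glob."""
    return any(
        node_name == entry or fnmatchcase(node_name, entry)
        for entry in nodes_to_exclude
    )


node_names = [
    "vision_encoder/projector/MatMul_node_38",
    "audio_encoder/projector/MatMul_node_3",
    "decoder/model/layers.0/attn/MatMul",  # hypothetical decoder node name
]

# One pattern covers both projectors regardless of their numeric suffixes.
excluded = [name for name in node_names if is_excluded(name, ["*/projector/*"])]
print(excluded)
```

Because fnmatch's * also matches "/", the single pattern spans any prefix and any suffix around /projector/.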

Why glob and not regex

ONNX node names are slash-separated paths that are full of regex
metacharacters (e.g. decoder/model/layers.0/..., model.norm). Glob is
the safer and backward-compatible choice:

  • Backward compatibility. Existing nodes_to_exclude entries are exact
    names that contain literal dots (.). Under glob, . is a literal, so every
    existing config matches exactly the same nodes. Treating entries as regex
    would reinterpret . as "any char", silently changing the meaning of
    existing configs and risking over-exclusion (layers.0 would also match
    layers00). Glob is a clean superset: exact names behave identically, and
    the * and ? wildcards are the only added powers.
  • No escaping burden. Regex would force users to escape . in every
    dotted path name; glob needs none for the common case.
  • Intuitive anchoring. fnmatch matches the whole string, so
    */projector/* unambiguously means "whole name matches this shape",
    avoiding the re.match/re.search/re.fullmatch ambiguity.
  • Path-like names. Node names are paths, and glob is the familiar tool
    for path matching (shell, .gitignore).

Regex is strictly more expressive (alternation, character classes, numeric
ranges), but that is rarely needed here and can be added later without
breaking glob (e.g. a re: prefix or a separate option) if a real need
appears.
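
The backward-compatibility and anchoring points above can be checked directly with the standard library. This is only a small demonstration of how fnmatch and re treat the same strings, using the layers.0 example from the list and one of the projector node names from the motivation.

```python
import re
from fnmatch import fnmatchcase

entry = "layers.0"

# Glob: "." is a literal, so an exact dotted name matches only itself.
glob_exact = fnmatchcase("layers.0", entry)
glob_other = fnmatchcase("layers00", entry)

# Regex: "." means "any character", so the same entry also matches layers00.
regex_other = re.fullmatch(entry, "layers00") is not None

# Anchoring: fnmatch always tests the whole string, while the regex result
# depends on which function is used.
name = "vision_encoder/projector/MatMul_node_38"
glob_partial = fnmatchcase(name, "projector")
regex_search = re.search("projector", name) is not None
regex_match = re.match("projector", name) is not None

print(glob_exact, glob_other, regex_other)
print(glob_partial, regex_search, regex_match)
```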

Testing

  • Added test_kquant_with_nodes_to_exclude_glob (pattern *_1 excludes one of two MatMuls); the existing exact-name and uniform tests still pass — 4 passed.
  • ruff check clean on the changed files.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The k-quant pass only matched nodes_to_exclude entries by exact node
name. For models split into multiple components (e.g. multimodal
decoder/vision/audio), excluding a layer required hardcoding
build-specific node names like 'vision_encoder/projector/MatMul_node_38',
which are brittle across builds and architectures.

Allow each nodes_to_exclude entry to be a Unix shell-style glob pattern
(matched with fnmatch.fnmatchcase) in addition to an exact name. A node
is excluded if its name equals or matches any entry, so existing
exact-name configs keep working. This makes it possible to write robust
exclusions such as '*/projector/*' to keep all projector MatMuls in
their original precision.

Adds a regression test covering glob-based exclusion.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Justin Chu <11205048+justinchuby@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 12, 2026 17:54

Copilot AI left a comment

Pull request overview

This PR extends the OnnxKQuantQuantization pass so nodes_to_exclude can match node names using Unix shell-style glob patterns (via fnmatch.fnmatchcase) in addition to exact-name matching, making exclusions more robust for graphs whose node names vary across builds.

Changes:

  • Add glob-pattern support to OnnxKQuantQuantization.nodes_to_exclude.
  • Expand nodes_to_exclude configuration help text with glob usage guidance.
  • Add a unit test verifying a glob pattern excludes only the intended node.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File | Description
olive/passes/onnx/kquant_quantization.py | Implements glob-based matching for nodes_to_exclude and updates config documentation.
test/passes/onnx/test_kquant_quantization.py | Adds coverage to ensure glob patterns in nodes_to_exclude behave as expected.

Comment thread olive/passes/onnx/kquant_quantization.py Outdated
justinchuby added a commit to microsoft/olive-recipes that referenced this pull request Jun 12, 2026
For the encoder-free gemma4_unified architecture, each of the vision and
audio 'encoders' is a single projector MatMul that forms the entire
image/audio embedding pathway. Quantizing it to INT4 injects
disproportionate error (measured rel-L2 ~3.7% vision / ~9.2% audio) while
the components are tiny (~76 MB / ~1.4 MB), so keeping them FP16 costs
almost nothing.

Exclude them via nodes_to_exclude: ['*/projector/*']. The decoder
(including lm_head) stays INT4, where the size savings live and INT4 has
negligible impact on output tokens (top-1 logit agreement ~100%, KL~0.004).

The glob form of nodes_to_exclude requires microsoft/Olive#2518; with older
Olive the pattern matches nothing and projectors are quantized as before.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Justin Chu <11205048+justinchuby@users.noreply.github.com>
Previously every nodes_to_exclude entry was matched both by exact name and
by fnmatch. An exact node name that happens to contain fnmatch
metacharacters (e.g. '[' / ']') could then unintentionally match unrelated
nodes (e.g. 'foo[1]' also matches 'foo1').

Only treat an entry as a glob pattern when it contains a '*' or '?'
wildcard; all other entries are matched by exact name as before. This keeps
exact names with bracket metacharacters safe while still supporting glob
exclusions like '*/projector/*'.

Adds a regression test ensuring a bracketed entry without a wildcard is
matched exactly (and therefore does not exclude an unrelated node).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Justin Chu <11205048+justinchuby@users.noreply.github.com>
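
The wildcard gate described in this commit might look roughly like the sketch below. It is an assumption-laden illustration rather than the merged implementation: the helper name is_excluded_gated is hypothetical, and the foo[1] / foo1 names come from the commit message.

```python
from collections.abc import Iterable
from fnmatch import fnmatchcase


def is_excluded_gated(node_name: str, nodes_to_exclude: Iterable[str]) -> bool:
    """Hypothetical helper: glob-match only entries containing '*' or '?'."""
    for entry in nodes_to_exclude:
        if "*" in entry or "?" in entry:
            # Entry has a wildcard, so treat it as a glob pattern.
            if fnmatchcase(node_name, entry):
                return True
        elif node_name == entry:
            # No wildcard: exact comparison, brackets stay literal.
            return True
    return False


# Ungated fnmatch reads "[1]" as a character class, so "foo[1]" matches "foo1".
ungated = fnmatchcase("foo1", "foo[1]")

# With the gate, the bracketed entry only matches the identically named node.
gated_unrelated = is_excluded_gated("foo1", ["foo[1]"])
gated_exact = is_excluded_gated("foo[1]", ["foo[1]"])
gated_glob = is_excluded_gated("audio_encoder/projector/MatMul_node_3", ["*/projector/*"])

print(ungated, gated_unrelated, gated_exact, gated_glob)
```

In this sketch an entry that mixes brackets with a wildcard still goes through fnmatchcase, where the brackets are read as a character class.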
@justinchuby justinchuby merged commit 2feb23d into main Jun 18, 2026
13 checks passed
@justinchuby justinchuby deleted the justinchu/kquant-glob-exclude branch June 18, 2026 23:25