Ternary/int8 QAT on a JAX-NNX transformer (Bonsai-style 1.58-bit) by kmheckel · Pull Request #47 · kmheckel/spyx

kmheckel · 2026-07-04T15:25:37Z

Shows that spyx.quant generalizes beyond spiking nets: its BitNet-ternary and int8 QAT rules apply unchanged to a decoder-only transformer. The rules match by op on dot_general, so any nnx.Linear qualifies. This is the same 1.58-bit-weight approach as PrismML's Bonsai LLMs.

research/new/ternary_llm/ contains a tiny NNX GPT and a 3-way QAT comparison (same architecture, seed, and data for every variant):

| variant | val ppl |
| --- | --- |
| fp32 | 14.24 |
| int8 weights | 14.31 |
| ternary | 13.46 |
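To make the gaps in the table easier to read, the perplexities can be converted back to per-token losses. This is a small sketch that assumes val ppl is the exponential of the mean per-token cross-entropy in nats (the usual definition, but not confirmed by this PR); the dictionary names are illustrative only.

```python
import math

# Validation perplexities copied from the table above.
val_ppl = {"fp32": 14.24, "int8 weights": 14.31, "ternary": 13.46}

# Assumption: perplexity = exp(mean per-token cross-entropy in nats),
# so the loss is recovered with a natural log.
val_loss = {name: math.log(ppl) for name, ppl in val_ppl.items()}

baseline_loss = val_loss["fp32"]
baseline_ppl = val_ppl["fp32"]
for name, loss in val_loss.items():
    delta_nats = loss - baseline_loss
    rel_ppl = 100.0 * (val_ppl[name] / baseline_ppl - 1.0)
    print(f"{name:>12}: loss {loss:.3f} nats "
          f"(delta {delta_nats:+.3f}), ppl {rel_ppl:+.1f}% vs fp32")
```

Read this way, all three variants sit within a few hundredths of a nat of each other, which is the sense in which the quantized runs are competitive on a model this small.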

Ternary stays competitive with fp32, and in this run it scores a slightly lower perplexity (13.46 vs 14.24). Quantization was verified to be genuinely active rather than a silent no-op: forward logits differ, and the quantized weights take only a few discrete codes. SMOKE=1 runs the full comparison on CPU in about a minute; no new dependencies (reuses spyx.quant).
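The two "is quantization actually active" checks can be sketched without any framework. The block below is a stand-in, not the study's own test: `fake_quant` is a hypothetical symmetric per-tensor quantizer written for this example (it is not the spyx.quant or qwix implementation), and the weights and input are random placeholders.

```python
import random

def fake_quant(weights, bits):
    """Hypothetical symmetric per-tensor fake quantization.

    Rounds each weight onto a signed integer grid, then rescales back to
    floats. Returns (dequantized_weights, integer_codes).
    """
    qmax = 2 ** (bits - 1) - 1        # 127 for 8 bits, 1 for 2 bits
    qmin = -(2 ** (bits - 1))         # -128 for 8 bits, -2 for 2 bits
    scale = max(abs(w) for w in weights) / qmax
    codes = [min(qmax, max(qmin, round(w / scale))) for w in weights]
    return [c * scale for c in codes], codes

def linear(x, flat_w, n_out):
    """y[j] = sum_i x[i] * W[i][j], with W stored row-major in flat_w."""
    return [sum(xi * flat_w[i * n_out + j] for i, xi in enumerate(x))
            for j in range(n_out)]

rng = random.Random(0)
n_in, n_out = 16, 4
weights = [rng.gauss(0.0, 0.02) for _ in range(n_in * n_out)]
x = [rng.gauss(0.0, 1.0) for _ in range(n_in)]
reference = linear(x, weights, n_out)

report = {}
for bits in (8, 2):
    dequantized, codes = fake_quant(weights, bits)
    quantized_out = linear(x, dequantized, n_out)
    max_diff = max(abs(a - b) for a, b in zip(reference, quantized_out))
    report[bits] = (len(set(codes)), max_diff)
    print(f"{bits}-bit: {report[bits][0]} distinct codes, "
          f"max output diff {max_diff:.2e}")

# A silent no-op would leave the outputs identical to the reference and the
# weights spread over as many distinct values as there are parameters.
for bits, (n_codes, max_diff) in report.items():
    assert n_codes <= 2 ** bits
    assert max_diff > 0.0
```

In a real model the same two assertions would be applied to the quantized layer's weights and to the logits of the quantized versus unquantized forward pass.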

Honest caveat: qwix has no true 1.58-bit qtype, so 'ternary' here is an int2 (4-code) approximation; this is disclosed in the study README. This PR pairs with the LiteRT export PR as the two edge-efficiency LOEs.
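The difference between a true ternary grid and the int2 approximation comes down to the number of available codes. The sketch below shows the arithmetic and an absmean ternary quantizer in the style described for BitNet b1.58; it is an illustration under that assumption, not the code path used by spyx.quant or qwix, and the sample weights are made up. How qwix maps weights onto the four int2 codes is not shown here.

```python
import math

def absmean_ternary(weights, eps=1e-8):
    """BitNet b1.58-style sketch: scale by mean |w|, round, clip to {-1, 0, +1}."""
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    codes = [max(-1, min(1, round(w / scale))) for w in weights]
    return codes, scale

ternary_levels = [-1, 0, 1]
int2_levels = list(range(-2, 2))   # signed 2-bit range: -2, -1, 0, 1

# Information per weight: log2(3) is the "1.58 bits" in the name,
# while an int2 container always spends a full 2 bits for 4 codes.
bits_ternary = math.log2(len(ternary_levels))
bits_int2 = math.log2(len(int2_levels))
print(f"ternary: {len(ternary_levels)} codes, {bits_ternary:.3f} bits/weight")
print(f"int2:    {len(int2_levels)} codes, {bits_int2:.3f} bits/weight")

# Hypothetical weights, for illustration only.
weights = [0.31, -0.02, -0.45, 0.07, 0.0, 0.12, -0.18, 0.52]
codes, scale = absmean_ternary(weights)
print("ternary codes:", codes)
```

Every ternary code is also a valid int2 code, so an int2 container can hold ternary weights; the approximation only departs from true ternary if the fourth code (-2) is actually used.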

🤖 Generated with Claude Code

…8-bit) research/new/ternary_llm/ demonstrates spyx.quant generalizes beyond spiking nets: its BitNet-ternary and int8 QAT (bitnet_ternary_rules / weights_only_rules, matched by op on dot_general) apply unchanged to a tiny decoder-only transformer built from nnx.Linear — the same 1.58-bit-weight approach as PrismML's Bonsai LLMs. 3-way QAT comparison (same arch/seed/data): fp32 ppl 14.24, int8 14.31, ternary 13.46 — ternary stays competitive with fp32. Quantization verified genuinely active (forward logits differ; ternary weights take few discrete codes, not a no-op). SMOKE=1 runs the full comparison on CPU in ~a minute. Note: qwix has no true 1.58-bit qtype, so 'ternary' is an int2 (4-code) approximation — disclosed in the study. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

chatgpt-codex-connector · 2026-07-04T15:25:41Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

kmheckel wants to merge 1 commit into main from feat/ternary-llm-qat