Feature: Support Gemma 4 thinking format in extractReasoningMiddleware #14217

@heiwen

Summary

extractReasoningMiddleware cannot handle the reasoning token format used by Gemma 4 (available on Google Vertex AI). Gemma 4 uses asymmetric, non-XML tokens for its thinking output, which the current tagName-based API cannot express.

Background

Gemma 4 thinking format (from Vertex AI Model Garden):

  • Thinking is triggered by including <|think|> at the start of the system prompt.
  • Model output structure when thinking is enabled:
    <|channel>thought
    [internal reasoning content]<channel|>[final answer]
    
  • Opening token: <|channel>thought\n (literal string, includes a newline)
  • Closing token: <channel|>
  • Empty block (thinking disabled but model still emits tags):
    <|channel>thought
    <channel|>[final answer]
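
As an illustration of the format above, here is a minimal sketch that splits one complete (non-streamed) output into reasoning and final answer. `splitGemmaOutput` is a hypothetical helper, not part of the AI SDK, and treating an unterminated block as all-reasoning is an assumption rather than documented Gemma behaviour.

```typescript
// Delimiters as described above; the opening token includes the trailing newline.
const OPENING = '<|channel>thought\n';
const CLOSING = '<channel|>';

interface SplitResult {
  reasoning: string; // empty string when the block is empty or absent
  text: string;      // the final answer with the thinking block removed
}

// Hypothetical helper: splits a single, fully received model output.
function splitGemmaOutput(output: string): SplitResult {
  const start = output.indexOf(OPENING);
  if (start === -1) {
    // No thinking block at all: everything is the final answer.
    return { reasoning: '', text: output };
  }

  const reasoningStart = start + OPENING.length;
  const end = output.indexOf(CLOSING, reasoningStart);
  if (end === -1) {
    // Assumption: an unterminated block is treated entirely as reasoning.
    return {
      reasoning: output.slice(reasoningStart),
      text: output.slice(0, start),
    };
  }

  return {
    reasoning: output.slice(reasoningStart, end),
    text: output.slice(0, start) + output.slice(end + CLOSING.length),
  };
}
```

For the empty-block case, the closing token immediately follows the opening token, so `reasoning` comes back as an empty string and `text` is just the final answer.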
    

Problem

The current API only accepts a single tagName string and constructs symmetric <tagName>/</tagName> markers:

// packages/ai/src/middleware/extract-reasoning-middleware.ts:25-26
const openingTag = `<${tagName}>`;
const closingTag = `<\/${tagName}>`;

Gemma 4's opening token (<|channel>thought\n) and closing token (<channel|>) are:

  1. Asymmetric — opening and closing have completely different shapes.
  2. Not XML-compatible: the | and > placement doesn't match <tag>/</tag>.
  3. Include a newline in the opening token, which may affect streaming buffer logic.

There is no way to configure this today without forking the middleware.
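
On point 3, one way to think about the streaming concern: a delimiter can be split across chunks at any character, including right before the newline. The sketch below is not the middleware's actual implementation; `createSplitter` and `pendingPrefixLength` are hypothetical names. It holds back any buffer suffix that could still grow into the next delimiter, so the newline is treated like any other character.

```typescript
type Part = { type: 'reasoning' | 'text'; delta: string };

// Length of the longest suffix of `buffer` that is a proper prefix of `token`.
// That many characters must be held back until the next chunk arrives.
function pendingPrefixLength(buffer: string, token: string): number {
  const max = Math.min(buffer.length, token.length - 1);
  for (let len = max; len > 0; len--) {
    if (buffer.endsWith(token.slice(0, len))) return len;
  }
  return 0;
}

// Hypothetical stateful splitter for arbitrary opening/closing delimiters.
function createSplitter(opening: string, closing: string) {
  let buffer = '';
  let inReasoning = false;

  return {
    push(chunk: string): Part[] {
      buffer += chunk;
      const parts: Part[] = [];
      while (true) {
        const token = inReasoning ? closing : opening;
        const type = inReasoning ? 'reasoning' : 'text';
        const index = buffer.indexOf(token);
        if (index === -1) {
          // No full delimiter yet: emit what is safe, keep a possible partial match.
          const hold = pendingPrefixLength(buffer, token);
          const emit = buffer.slice(0, buffer.length - hold);
          if (emit.length > 0) parts.push({ type, delta: emit });
          buffer = buffer.slice(buffer.length - hold);
          return parts;
        }
        // Full delimiter found: emit the content before it and switch mode.
        if (index > 0) parts.push({ type, delta: buffer.slice(0, index) });
        buffer = buffer.slice(index + token.length);
        inReasoning = !inReasoning;
      }
    },
    // Emits whatever is still buffered when the stream ends.
    flush(): Part[] {
      const rest = buffer;
      buffer = '';
      return rest.length > 0
        ? [{ type: inReasoning ? 'reasoning' : 'text', delta: rest }]
        : [];
    },
  };
}
```

Feeding the chunks `'<|chan'`, `'nel>thought'`, `'\nstep'`, `' 1<chan'`, `'nel|>42'` through `push` and then calling `flush` yields reasoning deltas that concatenate to `step 1` and a text delta of `42`.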

Proposed Fix

Allow tagName to accept either a plain string (existing behaviour) or an object with opening and closing properties:

// existing usage (unchanged):
extractReasoningMiddleware({ tagName: 'think' })

// new usage for Gemma 4:
extractReasoningMiddleware({
  tagName: { opening: '<|channel>thought\n', closing: '<channel|>' },
})

The internal resolution becomes:

const openingTag =
  typeof tagName === 'string' ? `<${tagName}>` : tagName.opening;
const closingTag =
  typeof tagName === 'string' ? `<\/${tagName}>` : tagName.closing;
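
The resolution above could be factored into a small typed helper. This is a sketch of the proposal only; the `TagName` alias and `resolveDelimiters` are hypothetical names and do not exist in the current codebase.

```typescript
// Proposed union type: a plain tag name, or explicit delimiters.
type TagName = string | { opening: string; closing: string };

interface Delimiters {
  openingTag: string;
  closingTag: string;
}

// Hypothetical helper: keeps the existing symmetric behaviour for strings
// and passes explicit delimiters through unchanged.
function resolveDelimiters(tagName: TagName): Delimiters {
  if (typeof tagName === 'string') {
    return { openingTag: `<${tagName}>`, closingTag: `</${tagName}>` };
  }
  return { openingTag: tagName.opening, closingTag: tagName.closing };
}

// Existing usage resolves to symmetric XML-style tags.
const xmlStyle = resolveDelimiters('think');

// Gemma 4 usage resolves to the literal tokens, newline included.
const gemmaStyle = resolveDelimiters({
  opening: '<|channel>thought\n',
  closing: '<channel|>',
});
```

Because the string branch is untouched, existing callers see the same delimiters as before.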

Special regex characters in the delimiter strings (e.g. |) must be escaped before use in wrapGenerate's new RegExp(...) call. The wrapStream path uses indexOf-based matching and requires no changes.
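
To illustrate the escaping requirement: an unescaped | is parsed as regex alternation, so a pattern built from `<channel|>` would match a lone `>`. The following is a minimal sketch, assuming a generic escape helper; `escapeRegExp` and `extractReasoning` are hypothetical names, and the exact regex the middleware builds may differ.

```typescript
// Escapes characters that have special meaning in a regular expression.
function escapeRegExp(value: string): string {
  return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical non-streaming extraction using escaped delimiters.
function extractReasoning(
  text: string,
  opening: string,
  closing: string,
): { reasoning: string[]; text: string } {
  // [\s\S]*? matches lazily across newlines without needing the `s` flag.
  const pattern = new RegExp(
    `${escapeRegExp(opening)}([\\s\\S]*?)${escapeRegExp(closing)}`,
    'g',
  );

  const reasoning: string[] = [];
  // Collect each block's content and remove the whole block from the text.
  const remaining = text.replace(pattern, (_match: string, inner: string) => {
    reasoning.push(inner);
    return '';
  });

  return { reasoning, text: remaining };
}
```

With the Gemma 4 delimiters, an input of `<|channel>thought\nstep 1<channel|>42` gives one reasoning entry, `step 1`, and a remaining text of `42`.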

Multi-turn note

Gemma's docs state that thoughts must not appear in conversation history — only the final answer should be kept between turns. This matches the existing middleware behavior (reasoning is stripped from the text part), so no additional work is needed for multi-turn handling.
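
As a rough sketch of what that means on the caller's side, only the extracted text is appended to the history and the reasoning is dropped. The `Message` and `ExtractedResult` shapes and `appendAssistantTurn` are simplified, hypothetical stand-ins, not the SDK's own types.

```typescript
// Simplified, hypothetical message shape for illustration only.
type Message = { role: 'system' | 'user' | 'assistant'; content: string };

// Simplified stand-in for a result whose reasoning was already extracted.
interface ExtractedResult {
  reasoning: string;
  text: string;
}

// Appends only the final answer; the reasoning never enters the history.
function appendAssistantTurn(
  history: Message[],
  result: ExtractedResult,
): Message[] {
  return [...history, { role: 'assistant', content: result.text }];
}
```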

Acceptance Criteria

  • tagName accepts string | { opening: string; closing: string }.
  • Passing tagName: { opening: '<|channel>thought\n', closing: '<channel|>' } correctly extracts Gemma 4 reasoning in both generateText and streamText.
  • Existing tagName: string usage is fully backward-compatible (no breaking change).
  • Unit tests cover the Gemma 4 token format (including the empty-block case when thinking is disabled but tags are still emitted).
  • Changeset added (patch).
  • An examples/ai-functions/src/stream-text/google-vertex/gemma4-reasoning.ts example is added.

Affected file

  • packages/ai/src/middleware/extract-reasoning-middleware.ts

Labels

ai/core, ai/provider, feature, provider/google-vertex, support