`vllm.tool_parsers.gemma4_utils`

Gemma4 tool call parsing utilities for offline inference.

Standalone functions that parse decoded model text to extract tool calls from Gemma4 models. These are pure-Python utilities with no heavy dependencies; they work on raw decoded strings from any inference backend (vLLM, HuggingFace, TGI, etc.).

For the OpenAI-compatible API server tool parser (streaming + non-streaming), see vllm.tool_parsers.gemma4_tool_parser. For thinking/reasoning output parsing, see vllm.reasoning.gemma4_utils.

Usage with vLLM offline inference:

from vllm import LLM, SamplingParams
from vllm.tool_parsers.gemma4_utils import (
    parse_tool_calls,
    has_tool_response_tag,
)

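llm = LLM(model="google/gemma-4-it")
tokenizer = llm.get_tokenizer()
# `prompt` is a chat-templated string that includes the tool definitions
outputs = llm.generate(prompt, SamplingParams(...))
# Keep special tokens so the <|tool_call> markers survive decoding
text = tokenizer.decode(outputs[0].outputs[0].token_ids, skip_special_tokens=False)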

# Extract tool calls
tool_calls = parse_tool_calls(text)
for tc in tool_calls:
    print(f"{tc['name']}({tc['arguments']})")

Ported from transformers.models.gemma4.utils_gemma4 so that vLLM users do not need a transformers dependency for output parsing.

Functions:

has_tool_response_tag – Check if model output properly ends with a tool response tag.
parse_tool_calls – Parse tool calls from decoded Gemma4 model output.

`_parse_tool_arguments(args_str)`

Parse tool call arguments from the Gemma4 compact format.

Delegates to the native <|"|>-aware parser from vllm.parser.gemma4, which handles internal quotes, nested objects, arrays, and all Gemma4 value types correctly.

Parameters:

args_str (str) – Raw argument string from inside call:name{...}.

Returns:

dict[str, str] – Dictionary of argument name → string value. Non-string values are converted with str().

Source code in vllm/tool_parsers/gemma4_utils.py

def _parse_tool_arguments(args_str: str) -> dict[str, str]:
    """Parse tool call arguments from the Gemma4 compact format.

    Delegates to the native ``<|"|>``-aware parser from
    ``vllm.parser.gemma4``, which handles internal quotes, nested
    objects, arrays, and all Gemma4 value types correctly.

    Args:
        args_str: Raw argument string from inside ``call:name{...}``.

    Returns:
        Dictionary of argument name → string value.
    """
    if not args_str or not args_str.strip():
        return {}

    from vllm.parser.gemma4 import _parse_gemma4_args

    parsed = _parse_gemma4_args(args_str)
    return {k: str(v) if not isinstance(v, str) else v for k, v in parsed.items()}
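
The return expression above stringifies every non-string value. A minimal sketch of that step in isolation follows; `stringify_values` and the sample `parsed` dict are hypothetical stand-ins, since the exact output of `_parse_gemma4_args` is not shown here.

```python
def stringify_values(parsed: dict) -> dict[str, str]:
    """Mirror the final dict comprehension of _parse_tool_arguments."""
    return {k: v if isinstance(v, str) else str(v) for k, v in parsed.items()}


# Hypothetical parser output; the real _parse_gemma4_args result may differ.
parsed = {"city": "Paris", "days": 3, "metric": True, "tags": ["a", "b"]}

result = stringify_values(parsed)
print(result)
# {'city': 'Paris', 'days': '3', 'metric': 'True', 'tags': "['a', 'b']"}
```

Note that str() yields Python representations ("True", "['a', 'b']") rather than JSON, so callers that need typed values would have to convert them back themselves.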

`has_tool_response_tag(text)`

Check if model output properly ends with a tool response tag.

Gemma4 models sometimes emit <eos> instead of <|tool_response> after a tool call. This helper detects whether the model used the proper termination, so callers can decide whether to inject <|tool_response> into the next prompt.

Parameters:

text (str) – Decoded model output text.

Returns:

bool – True if the output ends with <|tool_response> (proper behavior), False otherwise. Trailing whitespace is ignored.

Example:

>>> from vllm.tool_parsers.gemma4_utils import has_tool_response_tag
>>> if not has_tool_response_tag(model_output):
...     # Model used <eos> instead — inject <|tool_response> manually
...     next_prompt = "<|tool_response>" + tool_result

Source code in vllm/tool_parsers/gemma4_utils.py

def has_tool_response_tag(text: str) -> bool:
    """Check if model output properly ends with a tool response tag.

    Some Gemma4 models sometimes emit ``<eos>`` instead of
    ``<|tool_response>`` after a tool call. This helper detects
    whether the model used the proper termination, so callers can
    decide whether to inject ``<|tool_response>`` into the next prompt.

    Args:
        text: Decoded model output text.

    Returns:
        ``True`` if the output ends with ``<|tool_response>``
        (proper behavior), ``False`` otherwise.

    Example::

        >>> from vllm.tool_parsers.gemma4_utils import has_tool_response_tag
        >>> if not has_tool_response_tag(model_output):
        ...     # Model used <eos> instead — inject <|tool_response> manually
        ...     next_prompt = "<|tool_response>" + tool_result
    """
    stripped = text.rstrip()
    return stripped.endswith(_TOOL_RESPONSE_START_TAG)
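
The check above can be reproduced without vLLM installed. This is a sketch under one assumption: that `_TOOL_RESPONSE_START_TAG` equals the literal `<|tool_response>` named in the docstring (the constant's definition is not shown in this listing). The argument text inside the braces is an illustrative placeholder, not the exact Gemma4 argument syntax.

```python
# Assumed value of _TOOL_RESPONSE_START_TAG, taken from the docstring.
TOOL_RESPONSE_TAG = "<|tool_response>"


def ends_with_tool_response(text: str) -> bool:
    """Same logic as has_tool_response_tag: strip trailing whitespace, then test the suffix."""
    return text.rstrip().endswith(TOOL_RESPONSE_TAG)


# Proper termination, followed by a stray newline that rstrip() removes.
proper = "<|tool_call>call:get_weather{city:Paris}<tool_call|><|tool_response>\n"
# The model stopped with <eos> instead of opening a tool response.
early_eos = "<|tool_call>call:get_weather{city:Paris}<tool_call|><eos>"

print(ends_with_tool_response(proper))     # True
print(ends_with_tool_response(early_eos))  # False
```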

`parse_tool_calls(text, *, strict=False)`

Parse tool calls from decoded Gemma4 model output.

Uses a tiered parsing strategy to handle known output variations in Gemma4 models, which may emit non-standard tool call formats. The fallback tier runs only when the standard tier finds no calls.

Parsing tiers:

1. Standard: <|tool_call>call:name{args}<tool_call|> (special token IDs 48/49 in decoded text). A closing <turn|> is accepted in place of <tool_call|>.
2. Fallback (when strict=False): bare call:name{args} patterns, including <call>name{args} (fragmented tokens from multimodal inputs).

Parameters:

text (str) – Decoded model output text (from tokenizer.decode(..., skip_special_tokens=False)).
strict (bool, default: False) – If True, only match the standard <|tool_call> format. If False (default), also try fallback patterns for known Gemma4 output variations.

Returns:

list[dict] – A list of dicts, each with keys:

- "name": The tool function name (e.g. "get_weather").
- "arguments": A dict of argument name → string value.

Example:

>>> from vllm.tool_parsers.gemma4_utils import parse_tool_calls
>>> output = tokenizer.decode(outputs[0], skip_special_tokens=False)
>>> tool_calls = parse_tool_calls(output)
>>> for tc in tool_calls:
...     print(f"Call: {tc['name']}({tc['arguments']})")
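
To see how the two tiers and the strict flag interact, the sketch below reuses the two regular expressions from the function's source but returns the raw argument string instead of calling `_parse_tool_arguments`, so it runs without vLLM. `find_raw_calls` is a hypothetical helper, and the argument text in the samples is a placeholder rather than exact Gemma4 syntax.

```python
import re

# Patterns copied from parse_tool_calls.
STANDARD = r"<\|tool_call\>call:(\w+)\{(.*?)\}(?:<tool_call\|>|<turn\|>)"
FALLBACK = r"(?:<call>|(?:^|\s)call:)(\w+)\{(.*?)\}"


def find_raw_calls(text: str, *, strict: bool = False) -> list[tuple[str, str]]:
    """Return (name, raw_args) pairs using the same tier order as parse_tool_calls."""
    found = [(m.group(1), m.group(2)) for m in re.finditer(STANDARD, text, re.DOTALL)]
    if found or strict:
        # Tier 1 matched, or the caller refused the fallback.
        return found
    return [(m.group(1), m.group(2)) for m in re.finditer(FALLBACK, text, re.DOTALL)]


standard = "<|tool_call>call:get_weather{city:Paris}<tool_call|>"
bare = "Sure. call:get_weather{city:Paris}<eos>"
fragmented = "<call>get_weather{city:Paris}"

print(find_raw_calls(standard))           # [('get_weather', 'city:Paris')]
print(find_raw_calls(bare))               # [('get_weather', 'city:Paris')]
print(find_raw_calls(bare, strict=True))  # []
print(find_raw_calls(fragmented))         # [('get_weather', 'city:Paris')]
```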

Source code in vllm/tool_parsers/gemma4_utils.py

def parse_tool_calls(text: str, *, strict: bool = False) -> list[dict]:
    """Parse tool calls from decoded Gemma4 model output.

    Uses a tiered parsing strategy to handle known output variations in
    Gemma4 models, which may emit non-standard tool call formats.

    Parsing tiers:
        1. **Standard**: ``<|tool_call>call:name{args}<tool_call|>``
           (special token IDs 48/49 in decoded text)
        2. **Fallback** (when ``strict=False``): bare ``call:name{args}``
           patterns, including ``<call>name{args}`` (fragmented tokens from
           multimodal inputs)

    Args:
        text: Decoded model output text (from ``tokenizer.decode(...,
            skip_special_tokens=False)``).
        strict: If ``True``, only match the standard ``<|tool_call>`` format.
            If ``False`` (default), also try fallback patterns for
            known Gemma4 output variations.

    Returns:
        A list of dicts, each with keys:
            - ``"name"``: The tool function name (e.g. ``"get_weather"``).
            - ``"arguments"``: A dict of argument name → value.

    Example::

        >>> from vllm.tool_parsers.gemma4_utils import parse_tool_calls
        >>> output = tokenizer.decode(outputs[0], skip_special_tokens=False)
        >>> tool_calls = parse_tool_calls(output)
        >>> for tc in tool_calls:
        ...     print(f"Call: {tc['name']}({tc['arguments']})")
    """
    results = []

    # Tier 1: Standard format with special tokens.
    # <|tool_call>call:name{args}<tool_call|>
    # Note: Some Gemma4 models emit <turn|> instead of <tool_call|>.
    standard_pattern = r"<\|tool_call\>call:(\w+)\{(.*?)\}(?:<tool_call\|>|<turn\|>)"
    for match in re.finditer(standard_pattern, text, re.DOTALL):
        name, args_str = match.group(1), match.group(2)
        results.append(
            {
                "name": name,
                "arguments": _parse_tool_arguments(args_str),
            }
        )

    if results or strict:
        return results

    # Tier 2: Fallback for known Gemma4 output variations.
    # Matches: <call>name{args}, call:name{args}, or bare call:name{args}<eos>
    fallback_pattern = r"(?:<call>|(?:^|\s)call:)(\w+)\{(.*?)\}"
    for match in re.finditer(fallback_pattern, text, re.DOTALL):
        name, args_str = match.group(1), match.group(2)
        results.append(
            {
                "name": name,
                "arguments": _parse_tool_arguments(args_str),
            }
        )

    return results
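
One property of the two patterns worth checking when arguments contain nested braces: both capture the arguments lazily with `(.*?)`, but only the standard pattern is anchored by a closing tag. The sketch below applies the regexes from the listing to an illustrative nested payload (the payload syntax is an assumption made for the demonstration, not a confirmed Gemma4 sample).

```python
import re

# Patterns copied from parse_tool_calls.
STANDARD = r"<\|tool_call\>call:(\w+)\{(.*?)\}(?:<tool_call\|>|<turn\|>)"
FALLBACK = r"(?:<call>|(?:^|\s)call:)(\w+)\{(.*?)\}"

nested_standard = "<|tool_call>call:set_opts{a:{b:1}}<tool_call|>"
nested_bare = "call:set_opts{a:{b:1}}"

# The closing tag forces the lazy group to extend to the last brace.
std_args = re.search(STANDARD, nested_standard, re.DOTALL).group(2)
# With no closing tag, the lazy group stops at the first closing brace.
fb_args = re.search(FALLBACK, nested_bare, re.DOTALL).group(2)

print(std_args)  # a:{b:1}
print(fb_args)   # a:{b:1
```

In this sketch the fallback capture is truncated at the inner brace, so nested arguments recovered through the fallback tier may be incomplete before they ever reach the argument parser.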
