`vllm.entrypoints.openai.engine.protocol` ¶

Classes:

GenerationError –

raised when finish_reason indicates internal server error (500)
PromptTokenUsageInfo –

Functions:

structured_outputs_from_response_format –

Apply response_format overrides to structured_outputs.
validate_structural_tag_response_format –

Validate structural tags before they are sent to the engine.

`GenerationError` ¶

Bases: Exception

raised when finish_reason indicates internal server error (500)

Source code in vllm/entrypoints/openai/engine/protocol.py

class GenerationError(Exception):
    """raised when finish_reason indicates internal server error (500)"""

    def __init__(self, message: str = "Internal server error"):
        super().__init__(message)
        self.status_code = HTTPStatus.INTERNAL_SERVER_ERROR

`PromptTokenUsageInfo` ¶

Bases: OpenAIBaseModel

Attributes:

multimodal_tokens (dict[str, int] | None) –

Prompt tokens contributed by each input modality, keyed by modality name

Source code in vllm/entrypoints/openai/engine/protocol.py

class PromptTokenUsageInfo(OpenAIBaseModel):
    cached_tokens: int | None = None
    created_cache_tokens: int | None = None
    multimodal_tokens: dict[str, int] | None = None
    """Prompt tokens contributed by each input modality, keyed by modality name
    (e.g. `image`, `audio`, `video`). A breakdown of the multimodal
    placeholder tokens already counted in `prompt_tokens`; `None` when the
    request has no multimodal input."""

`multimodal_tokens = None` `class-attribute` `instance-attribute` ¶

Prompt tokens contributed by each input modality, keyed by modality name (e.g. image, audio, video). A breakdown of the multimodal placeholder tokens already counted in prompt_tokens; None when the request has no multimodal input.

`structured_outputs_from_response_format(structured_outputs, response_format)` ¶

Apply response_format overrides to structured_outputs.

Source code in vllm/entrypoints/openai/engine/protocol.py

def structured_outputs_from_response_format(
    structured_outputs: StructuredOutputsParams | None,
    response_format: AnyResponseFormat | None,
) -> StructuredOutputsParams | None:
    """Apply ``response_format`` overrides to ``structured_outputs``."""
    if response_format is None or response_format.type == "text":
        return structured_outputs

    overrides: dict[str, Any]
    if response_format.type == "json_object":
        overrides = {"json_object": True}
    elif response_format.type == "json_schema":
        json_schema = response_format.json_schema
        assert json_schema is not None
        overrides = {"json": json_schema.json_schema}
    else:
        assert isinstance(
            response_format,
            (
                LegacyStructuralTagResponseFormat,
                StructuralTagResponseFormat,
            ),
        )
        overrides = {
            "structural_tag": json.dumps(response_format.model_dump(by_alias=True))
        }

    if structured_outputs is None:
        return StructuredOutputsParams(**overrides)

    return replace(structured_outputs, **overrides)

`validate_structural_tag_response_format(response_format)` ¶

Validate structural tags before they are sent to the engine.

Engine-side validation reports malformed structural tags as generation failures. OpenAI request parsing should classify them as bad requests.

Source code in vllm/entrypoints/openai/engine/protocol.py

def validate_structural_tag_response_format(
    response_format: AnyStructuralTagResponseFormat | dict[str, Any],
) -> None:
    """Validate structural tags before they are sent to the engine.

    Engine-side validation reports malformed structural tags as generation
    failures. OpenAI request parsing should classify them as bad requests.
    """
    from pydantic import TypeAdapter, ValidationError

    if isinstance(response_format, dict):
        try:
            response_format = TypeAdapter(
                AnyStructuralTagResponseFormat
            ).validate_python(response_format)
        except ValidationError as exc:
            raise VLLMValidationError(
                "Invalid response_format structural_tag specification.",
                parameter="response_format",
            ) from exc

    try:
        payload = json.dumps(response_format.model_dump(by_alias=True))
        validate_structural_tag_payload(payload, parameter="response_format")
    except (TypeError, ValueError) as exc:
        raise VLLMValidationError(
            "Invalid response_format structural_tag specification.",
            parameter="response_format",
        ) from exc

vllm.entrypoints.openai.engine.protocol ¶

GenerationError ¶

PromptTokenUsageInfo ¶

multimodal_tokens = None class-attribute instance-attribute ¶

structured_outputs_from_response_format(structured_outputs, response_format) ¶

validate_structural_tag_response_format(response_format) ¶

`vllm.entrypoints.openai.engine.protocol` ¶

`GenerationError` ¶

`PromptTokenUsageInfo` ¶

`multimodal_tokens = None` `class-attribute` `instance-attribute` ¶

`structured_outputs_from_response_format(structured_outputs, response_format)` ¶

`validate_structural_tag_response_format(response_format)` ¶