vllm.entrypoints.openai.engine.protocol ¶
Classes:
-
GenerationError–raised when finish_reason indicates internal server error (500)
-
PromptTokenUsageInfo–
Functions:
-
validate_structural_tag_response_format–Validate structural tags before they are sent to the engine.
GenerationError ¶
Bases: Exception
raised when finish_reason indicates internal server error (500)
Source code in vllm/entrypoints/openai/engine/protocol.py
PromptTokenUsageInfo ¶
Bases: OpenAIBaseModel
Attributes:
-
multimodal_tokens(dict[str, int] | None) –Prompt tokens contributed by each input modality, keyed by modality name
Source code in vllm/entrypoints/openai/engine/protocol.py
multimodal_tokens = None class-attribute instance-attribute ¶
Prompt tokens contributed by each input modality, keyed by modality name (e.g. image, audio, video). A breakdown of the multimodal placeholder tokens already counted in prompt_tokens; None when the request has no multimodal input.
validate_structural_tag_response_format(response_format) ¶
Validate structural tags before they are sent to the engine.
Engine-side validation reports malformed structural tags as generation failures. OpenAI request parsing should classify them as bad requests.