vllm_omni.engine.mm_outputs ¶
Multimodal output data structures for vLLM-Omni.
This module defines the structured types that carry multimodal outputs: MultimodalCompletionOutput and MultimodalPayload.
MultimodalCompletionOutput dataclass ¶
Bases: CompletionOutput
CompletionOutput with multimodal support.
Inherits all CompletionOutput fields and adds multimodal_output. Because it subclasses CompletionOutput, it remains compatible with all existing vLLM consumers of CompletionOutput.
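The compatibility claim can be illustrated with a minimal sketch. The `CompletionOutput` class here is a two-field stand-in, not vLLM's real class. The type and default of `multimodal_output` are assumptions, and `describe` is a hypothetical consumer written only against the base class.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class CompletionOutput:
    """Stand-in for vLLM's CompletionOutput (illustrative fields only)."""
    index: int
    text: str


@dataclass
class MultimodalCompletionOutput(CompletionOutput):
    """Hypothetical mirror of the subclass: inherited fields plus one extra."""
    # Assumed to default to None so base-only construction still works.
    multimodal_output: Optional[Any] = None


def describe(output: CompletionOutput) -> str:
    """Hypothetical consumer that only knows about the base class."""
    return f"[{output.index}] {output.text}"


out = MultimodalCompletionOutput(
    index=0,
    text="hello",
    multimodal_output={"audio": [0.1, 0.2]},
)
summary = describe(out)  # the base-class consumer accepts the subclass unchanged
```

Code that type-checks or calls `isinstance` against the base class keeps working, and only multimodal-aware code needs to read the extra field.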
MultimodalPayload dataclass ¶
Structured multimodal output payload.
Attributes:
| Name | Type | Description |
|---|---|---|
| tensors | dict[str, Tensor] | Dictionary mapping modality/key names to their tensors. |
| metadata | dict[str, Any] | Optional dictionary for non-tensor metadata (e.g., sample rate for audio, image dimensions). |
metadata class-attribute instance-attribute ¶
primary_tensor property ¶
Return the first tensor in the payload, or None if empty.
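One plausible reading of "first tensor" is the first inserted entry, since Python dicts preserve insertion order. The sketch below assumes that reading. `PayloadSketch` is a hypothetical stand-in for MultimodalPayload, and `Tensor` is a placeholder for torch.Tensor so the example runs without PyTorch.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


class Tensor:
    """Placeholder for torch.Tensor (assumption: keeps the sketch dependency-free)."""

    def __init__(self, name: str) -> None:
        self.name = name


@dataclass
class PayloadSketch:
    """Hypothetical stand-in for MultimodalPayload."""
    tensors: dict[str, Tensor] = field(default_factory=dict)
    metadata: dict[str, Any] = field(default_factory=dict)

    @property
    def primary_tensor(self) -> Optional[Tensor]:
        # Dicts preserve insertion order, so this yields the first inserted
        # tensor, or None when the payload holds no tensors.
        return next(iter(self.tensors.values()), None)


audio = Tensor("audio")
image = Tensor("image")
payload = PayloadSketch(tensors={"audio": audio, "image": image})
empty = PayloadSketch()
```

Under this reading, callers that care about a specific modality should index `tensors` by key instead of relying on insertion order.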
tensors class-attribute instance-attribute ¶
from_dict classmethod ¶
from_dict(
data: dict[str, Any] | None,
) -> MultimodalPayload | None
Create a MultimodalPayload from a raw dictionary.
Separates torch.Tensor values into tensors and everything else into metadata.
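The split described above can be sketched as follows. This is an illustration under assumptions, not the library's implementation. `PayloadSketch` and the placeholder `Tensor` class are hypothetical, and returning None for a None input is inferred from the signature alone.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


class Tensor:
    """Placeholder for torch.Tensor (assumption: keeps the sketch dependency-free)."""


@dataclass
class PayloadSketch:
    """Hypothetical stand-in for MultimodalPayload."""
    tensors: dict[str, Tensor] = field(default_factory=dict)
    metadata: dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, data: Optional[dict[str, Any]]) -> Optional["PayloadSketch"]:
        # Assumed behaviour: no input dictionary means no payload.
        if data is None:
            return None
        tensors: dict[str, Tensor] = {}
        metadata: dict[str, Any] = {}
        for key, value in data.items():
            # Tensor values go to `tensors`; everything else is metadata.
            if isinstance(value, Tensor):
                tensors[key] = value
            else:
                metadata[key] = value
        return cls(tensors=tensors, metadata=metadata)


waveform = Tensor()
raw = {"audio": waveform, "sample_rate": 24000}
payload = PayloadSketch.from_dict(raw)
```

A single pass over the dictionary preserves the original key order within each group, which matters if the first tensor is later treated as the primary one.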