vllm_omni.engine ¶
Engine components for vLLM-Omni.
Modules:
| Name | Description |
|---|---|
| arg_utils | |
| async_omni_engine | Async Omni Engine for vLLM-Omni multi-stage runtime. |
| cfg_companion_tracker | CFG companion request tracker for the Omni orchestrator. |
| messages | |
| mm_outputs | Multimodal output data structures for vLLM-Omni. |
| omni_core_engine_proc_manager | Process manager for omni stage engine subprocesses. |
| orchestrator | Orchestrator for vLLM-Omni multi-stage runtime. |
| output_modality | Output modality types for vLLM-Omni. |
| output_processor | |
| serialization | Shared serialization helpers for omni engine request payloads. |
| stage_client | Shared stage-client typing for vLLM-Omni runtime surfaces. |
| stage_engine_core_client | Stage Engine Core Client for vLLM-Omni multi-stage runtime. |
| stage_engine_core_proc | Stage Core Process for vLLM-Omni V1 architecture. |
| stage_engine_startup | Helpers for launching and handshaking omni engine cores. |
| stage_init_utils | Stage initialization helpers for vLLM-Omni multi-stage runtime. |
| stage_pool | Unified stage-local runtime abstraction for vLLM-Omni. |
AdditionalInformationEntry ¶
Bases: Struct
One entry of additional_information.
Each entry encodes one of three supported forms:
- tensor: data/shape/dtype
- list: a Python list (msgspec-serializable)
- scalar: a Python scalar (msgspec-serializable)
Exactly one of (tensor_data, list_data, scalar_data) should be non-None.
AdditionalInformationPayload ¶
Bases: Struct
Serialized dictionary payload for additional_information.
Keys are strings; values are encoded as AdditionalInformationEntry.
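The entry and payload layout described above can be illustrated with a small standard-library sketch. This is not the library's definition: `EntrySketch`, `FlatTensor`, `encode_entry`, and `encode_payload` are hypothetical names, the `tensor_shape` and `tensor_dtype` fields are assumed from the "data/shape/dtype" description, and plain dataclasses stand in for the msgspec `Struct` base. Only `tensor_data`, `list_data`, and `scalar_data` come from the text.

```python
from __future__ import annotations

import struct
from dataclasses import dataclass
from typing import Any


@dataclass
class FlatTensor:
    """Hypothetical tensor stand-in: flat float values plus a shape."""
    values: list[float]
    shape: list[int]


@dataclass
class EntrySketch:
    """Hypothetical stand-in for AdditionalInformationEntry."""
    tensor_data: bytes | None = None
    tensor_shape: list[int] | None = None   # assumed field name
    tensor_dtype: str | None = None         # assumed field name
    list_data: list[Any] | None = None
    scalar_data: Any = None

    def kind(self) -> str:
        """Return which form is set, enforcing the exactly-one rule."""
        set_forms = [
            name
            for name, value in (
                ("tensor", self.tensor_data),
                ("list", self.list_data),
                ("scalar", self.scalar_data),
            )
            if value is not None
        ]
        if len(set_forms) != 1:
            raise ValueError(f"expected exactly one form, got {set_forms}")
        return set_forms[0]


def encode_entry(value: Any) -> EntrySketch:
    """Pick the entry form from the Python type of the value."""
    if isinstance(value, FlatTensor):
        # Assumption: little-endian float32 bytes for the tensor form.
        raw = struct.pack(f"<{len(value.values)}f", *value.values)
        return EntrySketch(
            tensor_data=raw,
            tensor_shape=list(value.shape),
            tensor_dtype="float32",
        )
    if isinstance(value, list):
        return EntrySketch(list_data=value)
    return EntrySketch(scalar_data=value)


def encode_payload(info: dict[str, Any]) -> dict[str, EntrySketch]:
    """String keys map to encoded entries, mirroring the payload description."""
    return {key: encode_entry(value) for key, value in info.items()}


payload = encode_payload(
    {
        "speaker_embedding": FlatTensor([0.5, 1.0, 1.5, 2.0], [2, 2]),
        "token_ids": [1, 2, 3],
        "temperature": 0.7,
    }
)
kinds = {key: entry.kind() for key, entry in payload.items()}
```

In this sketch an entry with no form set (or more than one) is rejected by `kind()`, which is one way to enforce the "exactly one non-None" expectation at the boundary between stages.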
OmniEngineCoreOutput ¶
OmniEngineCoreOutputs ¶
Bases: EngineCoreOutputs
OmniEngineCoreRequest ¶
Bases: EngineCoreRequest
Engine core request for omni models with embeddings support.
Extends the base EngineCoreRequest with support for additional information payloads, enabling direct transfer of pre-computed data between pipeline stages.
Note: prompt_embeds is inherited from EngineCoreRequest (torch.Tensor | None). PromptEmbedsPayload should be decoded to torch.Tensor before constructing this request.
Attributes:
| Name | Type | Description |
|---|---|---|
| additional_information | AdditionalInformationPayload \| None | Optional serialized additional-information dictionary containing tensors, lists, or scalars to pass along with the request. |
additional_information class-attribute instance-attribute ¶
additional_information: (
AdditionalInformationPayload | None
) = None
from_request classmethod ¶
from_request(
    request: EngineCoreRequest,
    *,
    prompt_embeds: Tensor | None = None,
    additional_information: AdditionalInformationPayload | None = None,
) -> OmniEngineCoreRequest
Clone an EngineCoreRequest into an OmniEngineCoreRequest with optional payload overrides.
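The cloning pattern can be sketched with plain dataclasses. The classes below are hypothetical stand-ins, not the real `EngineCoreRequest` or `OmniEngineCoreRequest`, and the base fields shown are illustrative only. The sketch also assumes that passing `prompt_embeds=None` keeps the value already on the source request; the documentation above does not state how the real method treats `None`.

```python
from __future__ import annotations

from dataclasses import dataclass, fields
from typing import Any


@dataclass
class BaseRequestSketch:
    """Hypothetical stand-in for EngineCoreRequest (fields are illustrative)."""
    request_id: str
    prompt_token_ids: list[int] | None = None
    prompt_embeds: Any = None


@dataclass
class OmniRequestSketch(BaseRequestSketch):
    """Hypothetical stand-in for OmniEngineCoreRequest."""
    additional_information: dict[str, Any] | None = None

    @classmethod
    def from_request(
        cls,
        request: BaseRequestSketch,
        *,
        prompt_embeds: Any = None,
        additional_information: dict[str, Any] | None = None,
    ) -> OmniRequestSketch:
        # Copy every base-class field from the source request.
        copied = {f.name: getattr(request, f.name) for f in fields(BaseRequestSketch)}
        # Assumption: an override replaces the copied value only when given.
        if prompt_embeds is not None:
            copied["prompt_embeds"] = prompt_embeds
        return cls(**copied, additional_information=additional_information)


base = BaseRequestSketch(request_id="req-0", prompt_token_ids=[1, 2, 3])
omni = OmniRequestSketch.from_request(
    base, additional_information={"stage": "talker"}
)
```

Copying by field name keeps the sketch independent of how many fields the base request carries, so the subclass only has to know about what it adds.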
PromptEmbedsPayload ¶
Bases: Struct
Serialized prompt embeddings payload for direct transfer.
- data: raw bytes of the tensor in row-major order
- shape: [seq_len, hidden_size]
- dtype: torch dtype name (e.g., "float16", "float32")
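To make the row-major byte layout concrete, here is a minimal standard-library round trip. `PromptEmbedsSketch`, `encode_embeds`, and `decode_embeds` are hypothetical names, nested Python lists stand in for a tensor, and the sketch assumes little-endian float32 data; the real payload may use other dtypes and would be decoded to a `torch.Tensor`.

```python
import struct
from dataclasses import dataclass
from typing import List


@dataclass
class PromptEmbedsSketch:
    """Hypothetical stand-in for PromptEmbedsPayload."""
    data: bytes        # raw bytes, row-major
    shape: List[int]   # [seq_len, hidden_size]
    dtype: str         # dtype name, e.g. "float32"


def encode_embeds(rows: List[List[float]]) -> PromptEmbedsSketch:
    """Flatten rows in row-major order and pack them as float32 bytes."""
    seq_len = len(rows)
    hidden_size = len(rows[0]) if rows else 0
    flat = [value for row in rows for value in row]
    data = struct.pack(f"<{len(flat)}f", *flat)
    return PromptEmbedsSketch(data=data, shape=[seq_len, hidden_size], dtype="float32")


def decode_embeds(payload: PromptEmbedsSketch) -> List[List[float]]:
    """Rebuild the [seq_len, hidden_size] rows from the raw bytes."""
    if payload.dtype != "float32":
        raise ValueError("this sketch only handles float32")
    seq_len, hidden_size = payload.shape
    flat = struct.unpack(f"<{seq_len * hidden_size}f", payload.data)
    return [
        list(flat[i * hidden_size:(i + 1) * hidden_size])
        for i in range(seq_len)
    ]


embeds = [[0.5, 1.0, -2.0], [0.25, 3.0, 4.5]]
encoded = encode_embeds(embeds)
decoded = decode_embeds(encoded)
```

Row-major order means all `hidden_size` values of the first token come first, followed by the second token's values, so the shape alone is enough to slice the flat buffer back into rows.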