vllm_omni.core.sched.output ¶
OmniCachedRequestData dataclass ¶
OmniChunkRecvHandle dataclass ¶
Minimal identifier carried from scheduler to runner for chunk-recv registration.
The runner's register_chunk_recv only consumes request_id and external_req_id from each pending request, so we ship just those two fields instead of the full Request object. Concrete typing keeps msgspec serialization deterministic across IPC (default, PD-disagg, multi-node executor variants) and avoids the list[Any] fallback path.
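As a rough illustration of the idea above, the following sketch reduces a full request to a two-field handle. `ChunkRecvHandleSketch`, `FakeRequest`, and `to_handles` are hypothetical stand-ins, not the actual vllm_omni definitions, and the field types are assumed.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class ChunkRecvHandleSketch:
    """Hypothetical stand-in for OmniChunkRecvHandle (field types are assumed)."""
    request_id: str
    external_req_id: Optional[str] = None


@dataclass
class FakeRequest:
    """Hypothetical stand-in for a full scheduler Request carrying extra state."""
    request_id: str
    external_req_id: Optional[str]
    prompt_token_ids: list[int]


def to_handles(pending: list[FakeRequest]) -> list[ChunkRecvHandleSketch]:
    # Keep only the two identifiers the runner reads; drop everything else.
    return [ChunkRecvHandleSketch(r.request_id, r.external_req_id) for r in pending]


pending = [
    FakeRequest("req-0", "ext-0", [1, 2, 3]),
    FakeRequest("req-1", None, [4]),
]
handles = to_handles(pending)
```

Because every field has a concrete, simple type, a list of such handles is a plausible fit for typed serialization, in contrast to a list of arbitrary objects.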
OmniNewRequestData dataclass ¶
Bases: NewRequestData
New request data for omni models with embeddings support.
Extends NewRequestData to include additional information for direct transfer between pipeline stages.
Note: prompt_embeds is inherited from NewRequestData (torch.Tensor | None).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `external_req_id` | `str \| None` | Optional external request ID for tracking | `None` |
| `additional_information` | `AdditionalInformationPayload \| None` | Optional serialized additional information dictionary containing tensors or lists | `None` |
additional_information class-attribute instance-attribute ¶
additional_information: (
AdditionalInformationPayload | None
) = None
from_request classmethod ¶
from_request(
request: Request,
block_ids: tuple[list[int], ...],
prefill_token_ids: list[int] | None = None,
) -> OmniNewRequestData
Create OmniNewRequestData from a Request object.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `request` | `Request` | Request object to convert | required |
| `block_ids` | `tuple[list[int], ...]` | Tuple of block ID lists for KV cache allocation | required |
| `prefill_token_ids` | `list[int] \| None` | Optional prefill token IDs for v2 model runner | `None` |
Returns:
| Type | Description |
|---|---|
| `OmniNewRequestData` | `OmniNewRequestData` instance with data from the request |
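A minimal sketch of the `from_request` pattern is shown here, using stand-in classes rather than the real vLLM types. `RequestSketch`, `NewRequestDataSketch`, and `OmniNewRequestDataSketch` are hypothetical, and the attribute names on the request (`request_id`, `prompt_token_ids`) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class RequestSketch:
    """Hypothetical stand-in for Request; real attribute names may differ."""
    request_id: str
    prompt_token_ids: list[int]
    external_req_id: Optional[str] = None
    additional_information: Optional[dict[str, Any]] = None


@dataclass
class NewRequestDataSketch:
    """Hypothetical stand-in for the NewRequestData base class."""
    req_id: str
    prompt_token_ids: list[int]
    block_ids: tuple[list[int], ...]
    prefill_token_ids: Optional[list[int]] = None


@dataclass
class OmniNewRequestDataSketch(NewRequestDataSketch):
    """Adds the two omni-specific fields documented above."""
    external_req_id: Optional[str] = None
    additional_information: Optional[dict[str, Any]] = None

    @classmethod
    def from_request(
        cls,
        request: RequestSketch,
        block_ids: tuple[list[int], ...],
        prefill_token_ids: Optional[list[int]] = None,
    ) -> "OmniNewRequestDataSketch":
        # Copy base fields, then attach the omni-specific extras.
        return cls(
            req_id=request.request_id,
            prompt_token_ids=request.prompt_token_ids,
            block_ids=block_ids,
            prefill_token_ids=prefill_token_ids,
            external_req_id=request.external_req_id,
            additional_information=request.additional_information,
        )


req = RequestSketch("req-7", [10, 11], external_req_id="ext-7")
data = OmniNewRequestDataSketch.from_request(req, block_ids=([0, 1],))
```

Leaving `prefill_token_ids` at its default of `None` mirrors the documented signature, where the argument is optional.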
OmniSchedulerOutput dataclass ¶
Bases: SchedulerOutput
Scheduler output with omni-specific transfer metadata.