vllm_omni.core.sched.output

OmniCachedRequestData dataclass

Bases: CachedRequestData

Cached request data for omni models with embeddings support.

Parameters:

Name              Type                  Description                                          Default
prompt_token_ids  dict[str, list[int]]  Mapping from request ID to list of prompt token IDs  required

additional_information instance-attribute

additional_information: dict[str, dict | None]

prompt_token_ids instance-attribute

prompt_token_ids: dict[str, list[int]]
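
Both attributes are mappings keyed by request ID. The sketch below illustrates that shape only. `CachedRequestDataSketch` is a hypothetical stand-in (the real class inherits further fields from CachedRequestData), and the payload contents and the reading of `None` as "no extra data attached" are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class CachedRequestDataSketch:
    """Hypothetical stand-in; base-class fields from CachedRequestData are omitted."""

    # Request ID -> prompt token IDs, as documented above.
    prompt_token_ids: dict[str, list[int]]
    # Request ID -> optional per-request payload.
    additional_information: dict[str, dict | None]


cached = CachedRequestDataSketch(
    prompt_token_ids={"req-0": [101, 2023, 102], "req-1": [101, 102]},
    # The payload keys here are made up for the example.
    additional_information={"req-0": {"note": "example"}, "req-1": None},
)

# Walk the requests and look up the matching payload by the same key.
for req_id, token_ids in cached.prompt_token_ids.items():
    extra = cached.additional_information.get(req_id)
    print(req_id, len(token_ids), extra)
```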

OmniChunkRecvHandle dataclass

Minimal identifier carried from scheduler to runner for chunk-recv registration.

The runner's register_chunk_recv only consumes request_id and external_req_id from each pending request, so we ship just those two fields instead of the full Request object. Concrete typing keeps msgspec serialization deterministic across IPC (default, PD-disagg, multi-node executor variants) and avoids the list[Any] fallback path.

external_req_id class-attribute instance-attribute

external_req_id: str | None = None

request_id instance-attribute

request_id: str
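
The idea of shipping only two identifiers instead of the whole request can be sketched as follows. This is a minimal illustration using standard-library dataclasses, not the actual scheduler code: `ChunkRecvHandleSketch`, `FakeRequest`, and `to_handles` are hypothetical names, and the real handle is serialized with msgspec rather than built this way.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ChunkRecvHandleSketch:
    """Hypothetical stand-in mirroring the two documented fields."""

    request_id: str
    external_req_id: str | None = None


@dataclass
class FakeRequest:
    """Hypothetical heavy request object carrying fields the runner does not need."""

    request_id: str
    external_req_id: str | None
    prompt_token_ids: list[int]


def to_handles(pending: list[FakeRequest]) -> list[ChunkRecvHandleSketch]:
    # Keep only the two identifiers; everything else stays on the scheduler side.
    return [ChunkRecvHandleSketch(r.request_id, r.external_req_id) for r in pending]


pending = [
    FakeRequest("req-0", "ext-a", [1, 2, 3]),
    FakeRequest("req-1", None, [4]),
]
handles = to_handles(pending)
print(handles)
```

Because every field has a concrete type, the resulting list can be annotated as a list of handles rather than `list[Any]`, which is the property the description above relies on.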

OmniNewRequestData dataclass

Bases: NewRequestData

New request data for omni models with embeddings support.

Extends NewRequestData to include additional information for direct transfer between pipeline stages.

Note: prompt_embeds is inherited from NewRequestData (torch.Tensor | None).

Parameters:

Name                    Type                                 Description                                                                         Default
external_req_id         str | None                           Optional external request ID for tracking                                           None
additional_information  AdditionalInformationPayload | None  Optional serialized additional information dictionary containing tensors or lists  None

additional_information class-attribute instance-attribute

additional_information: (
    AdditionalInformationPayload | None
) = None

external_req_id class-attribute instance-attribute

external_req_id: str | None = None

from_request classmethod

from_request(
    request: Request,
    block_ids: tuple[list[int], ...],
    prefill_token_ids: list[int] | None = None,
) -> OmniNewRequestData

Create OmniNewRequestData from a Request object.

Parameters:

Name               Type                   Description                                      Default
request            Request                Request object to convert                        required
block_ids          tuple[list[int], ...]  Tuple of block ID lists for KV cache allocation  required
prefill_token_ids  list[int] | None       Optional prefill token IDs for v2 model runner   None

Returns:

Type                Description
OmniNewRequestData  OmniNewRequestData instance with data from the request
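
The classmethod follows the usual alternate-constructor pattern. The sketch below shows that pattern with the documented signature. `RequestSketch` and `NewRequestDataSketch` are hypothetical stand-ins, and which attributes the real method copies from the request is an assumption, since this page does not list them.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class RequestSketch:
    """Hypothetical stand-in for the scheduler-side Request object."""

    request_id: str
    external_req_id: str | None = None
    additional_information: dict[str, Any] | None = None


@dataclass
class NewRequestDataSketch:
    """Hypothetical stand-in; most NewRequestData base fields are omitted."""

    req_id: str
    block_ids: tuple[list[int], ...]
    prefill_token_ids: list[int] | None = None
    external_req_id: str | None = None
    additional_information: dict[str, Any] | None = None

    @classmethod
    def from_request(
        cls,
        request: RequestSketch,
        block_ids: tuple[list[int], ...],
        prefill_token_ids: list[int] | None = None,
    ) -> NewRequestDataSketch:
        # Assumed behavior: carry the omni-specific fields over unchanged.
        return cls(
            req_id=request.request_id,
            block_ids=block_ids,
            prefill_token_ids=prefill_token_ids,
            external_req_id=request.external_req_id,
            additional_information=request.additional_information,
        )


request = RequestSketch("req-0", external_req_id="ext-a")
data = NewRequestDataSketch.from_request(request, block_ids=([0, 1], [7]))
print(data)
```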

OmniSchedulerOutput dataclass

Bases: SchedulerOutput

Scheduler output with omni-specific transfer metadata.

finished_requests_needing_kv_transfer class-attribute instance-attribute

finished_requests_needing_kv_transfer: dict[str, dict] = (
    field(default_factory=dict)
)

pending_input_registrations class-attribute instance-attribute

pending_input_registrations: list[OmniChunkRecvHandle] = (
    field(default_factory=list)
)
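
Both fields use `field(default_factory=...)`, so each instance gets its own empty container instead of sharing one mutable default. A minimal sketch of that behavior, with hypothetical stand-in classes (the SchedulerOutput base fields are omitted, and the inner dictionary contents are made up for illustration):

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class HandleSketch:
    """Hypothetical stand-in for OmniChunkRecvHandle."""

    request_id: str
    external_req_id: str | None = None


@dataclass
class SchedulerOutputSketch:
    """Hypothetical stand-in; SchedulerOutput base fields are omitted."""

    finished_requests_needing_kv_transfer: dict[str, dict] = field(default_factory=dict)
    pending_input_registrations: list[HandleSketch] = field(default_factory=list)


first = SchedulerOutputSketch()
second = SchedulerOutputSketch()

# Mutating one instance leaves the other untouched.
first.pending_input_registrations.append(HandleSketch("req-0"))
first.finished_requests_needing_kv_transfer["req-1"] = {"example_key": [3, 4]}

print(len(first.pending_input_registrations), len(second.pending_input_registrations))
```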