
vllm_omni.data_entry_keys

Structured payload types for inter-stage communication.

Categories under OmniPayload:

* hidden_states – intermediate / output hidden-state tensors
* embed – embedding tensors (prefill, decode, special tokens)
* ids – token-ID sequences
* codes – codec / audio code tensors
* meta – scalar metadata, control flags, shapes
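
As a rough illustration of how these categories nest, the sketch below builds a payload as a plain dict. The key names are taken from the field listings on this page; the values are invented, and plain Python lists stand in for tensors so the snippet runs without extra dependencies.

```python
# Illustrative payload shaped like OmniPayload.
# Assumption: plain lists stand in for Tensor values; all numbers are made up.
example_payload = {
    "request_id": "req-0",
    "hidden_states": {
        "last": [[0.1, 0.2]],  # stand-in for a Tensor
        "layers": {0: [[0.0, 0.0]], 12: [[0.3, 0.4]]},  # layer index -> Tensor
    },
    "embed": {"prefill": [[0.5, 0.6]]},
    "ids": {"prompt": [1, 2, 3], "output": [4, 5]},
    "codes": {"audio": [[7, 8, 9]]},
    "meta": {"left_context_size": 4, "codec_streaming": True},
}

# The category names are the keys whose values are sub-dicts.
categories = [k for k, v in example_payload.items() if isinstance(v, dict)]
print(categories)
```

Top-level scalars such as request_id sit next to the category sub-dicts rather than inside one of them.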

Codes

Bases: TypedDict

audio instance-attribute

audio: Tensor

ref instance-attribute

ref: Tensor

CodesStruct

Bases: _StructBase

audio class-attribute instance-attribute

audio: Tensor | None = None

ref class-attribute instance-attribute

ref: Tensor | None = None

Embeddings

Bases: TypedDict

cached_decode instance-attribute

cached_decode: Tensor

decode instance-attribute

decode: Tensor

embedding instance-attribute

embedding: Tensor

prefill instance-attribute

prefill: Tensor

speech_feat instance-attribute

speech_feat: Tensor

speech_token instance-attribute

speech_token: Tensor

thinker_reply instance-attribute

thinker_reply: Tensor

tts_bos instance-attribute

tts_bos: Tensor

tts_eos instance-attribute

tts_eos: Tensor

tts_pad instance-attribute

tts_pad: Tensor

tts_pad_projected instance-attribute

tts_pad_projected: Tensor

voice instance-attribute

voice: Tensor

EmbeddingsStruct

Bases: _StructBase

cached_decode class-attribute instance-attribute

cached_decode: Tensor | None = None

decode class-attribute instance-attribute

decode: Tensor | None = None

decode_token_end class-attribute instance-attribute

decode_token_end: int | None = None

decode_token_start class-attribute instance-attribute

decode_token_start: int | None = None

embedding class-attribute instance-attribute

embedding: Tensor | None = None

prefill class-attribute instance-attribute

prefill: Tensor | None = None

prefill_shape class-attribute instance-attribute

prefill_shape: list[int] | None = None

speech_feat class-attribute instance-attribute

speech_feat: Tensor | None = None

speech_token class-attribute instance-attribute

speech_token: Tensor | None = None

thinker_reply class-attribute instance-attribute

thinker_reply: Tensor | None = None

tts_bos class-attribute instance-attribute

tts_bos: Tensor | None = None

tts_eos class-attribute instance-attribute

tts_eos: Tensor | None = None

tts_pad class-attribute instance-attribute

tts_pad: Tensor | None = None

tts_pad_projected class-attribute instance-attribute

tts_pad_projected: Tensor | None = None

voice class-attribute instance-attribute

voice: Tensor | None = None

HiddenStates

Bases: TypedDict

last instance-attribute

last: Tensor

layers instance-attribute

layers: dict[int, Tensor]

output instance-attribute

output: Tensor

trailing_text instance-attribute

trailing_text: Tensor

HiddenStatesStruct

Bases: _StructBase

last class-attribute instance-attribute

last: Tensor | None = None

layers class-attribute instance-attribute

layers: dict[int, Tensor] | None = None

output class-attribute instance-attribute

output: Tensor | None = None

output_shape class-attribute instance-attribute

output_shape: list[int] | None = None

trailing_text class-attribute instance-attribute

trailing_text: Tensor | None = None

Ids

Bases: TypedDict

all instance-attribute

all: list[int]

output instance-attribute

output: list[int]

prior_image instance-attribute

prior_image: list[int]

prompt instance-attribute

prompt: list[int]

speech_token instance-attribute

speech_token: list[int]

IdsStruct

Bases: _StructBase

all class-attribute instance-attribute

all: list[int] | None = None

output class-attribute instance-attribute

output: list[int] | None = None

prior_image class-attribute instance-attribute

prior_image: list[int] | None = None

prompt class-attribute instance-attribute

prompt: list[int] | None = None

speech_token class-attribute instance-attribute

speech_token: list[int] | None = None

MetaStruct

Bases: _StructBase

ar_width class-attribute instance-attribute

ar_width: int | None = None

code_flat_numel class-attribute instance-attribute

code_flat_numel: int | None = None

codec_chunk_frames class-attribute instance-attribute

codec_chunk_frames: int | None = None

codec_left_context_frames class-attribute instance-attribute

codec_left_context_frames: int | None = None

codec_streaming class-attribute instance-attribute

codec_streaming: bool | None = None

decode_flag class-attribute instance-attribute

decode_flag: bool | None = None

eol_token_id class-attribute instance-attribute

eol_token_id: int | None = None

finished class-attribute instance-attribute

finished: Tensor | None = None

gen_token_mask class-attribute instance-attribute

gen_token_mask: Tensor | None = None

height class-attribute instance-attribute

height: int | None = None

is_segment_finished class-attribute instance-attribute

is_segment_finished: Tensor | None = None

left_context_size class-attribute instance-attribute

left_context_size: int | None = None

next_stage_prompt_len class-attribute instance-attribute

next_stage_prompt_len: int | None = None

num_processed_tokens class-attribute instance-attribute

num_processed_tokens: int | None = None

omni_final_stage_id class-attribute instance-attribute

omni_final_stage_id: int | None = None

omni_task class-attribute instance-attribute

omni_task: list[str] | None = None

override_keys class-attribute instance-attribute

override_keys: list[tuple[str, str]] | None = None

ref_code_len class-attribute instance-attribute

ref_code_len: int | None = None

ref_context_included class-attribute instance-attribute

ref_context_included: bool | None = None

ref_context_request_id class-attribute instance-attribute

ref_context_request_id: str | None = None

ref_context_size class-attribute instance-attribute

ref_context_size: int | None = None

req_id class-attribute instance-attribute

req_id: list[str] | None = None

stream_finished class-attribute instance-attribute

stream_finished: Tensor | None = None

talker_prefill_offset class-attribute instance-attribute

talker_prefill_offset: int | None = None

visual_token_end_id class-attribute instance-attribute

visual_token_end_id: int | None = None

visual_token_start_id class-attribute instance-attribute

visual_token_start_id: int | None = None

width class-attribute instance-attribute

width: int | None = None

OmniPayload

Bases: TypedDict

codes instance-attribute

codes: Codes

embed instance-attribute

embed: Embeddings

generated_len instance-attribute

generated_len: int

hidden_states instance-attribute

hidden_states: HiddenStates

ids instance-attribute

ids: Ids

language instance-attribute

language: Any

latent instance-attribute

latent: Tensor

meta instance-attribute

model_outputs instance-attribute

model_outputs: list[Tensor]

mtp_inputs instance-attribute

mtp_inputs: tuple[Tensor, Tensor]

request_id instance-attribute

request_id: str

speaker instance-attribute

speaker: Any

OmniPayloadMeta

Bases: TypedDict

ar_width instance-attribute

ar_width: int

codec_streaming instance-attribute

codec_streaming: bool

decode_flag instance-attribute

decode_flag: bool

eol_token_id instance-attribute

eol_token_id: int

finished instance-attribute

finished: Tensor

gen_token_mask instance-attribute

gen_token_mask: Tensor

height instance-attribute

height: int

is_segment_finished instance-attribute

is_segment_finished: Tensor

left_context_size instance-attribute

left_context_size: int

next_stage_prompt_len instance-attribute

next_stage_prompt_len: int

num_processed_tokens instance-attribute

num_processed_tokens: int

omni_task instance-attribute

omni_task: list[str]

override_keys instance-attribute

override_keys: list[tuple[str, str]]

ref_code_len instance-attribute

ref_code_len: int

ref_context_included instance-attribute

ref_context_included: bool

ref_context_request_id instance-attribute

ref_context_request_id: str

ref_context_size instance-attribute

ref_context_size: int

req_id instance-attribute

req_id: list[str]

stream_finished instance-attribute

stream_finished: Tensor

talker_prefill_offset instance-attribute

talker_prefill_offset: int

visual_token_end_id instance-attribute

visual_token_end_id: int

visual_token_start_id instance-attribute

visual_token_start_id: int

width instance-attribute

width: int

OmniPayloadStruct

Bases: _StructBase

codes class-attribute instance-attribute

codes: CodesStruct | None = None

embed class-attribute instance-attribute

embed: EmbeddingsStruct | None = None

generated_len class-attribute instance-attribute

generated_len: int | None = None

hidden class-attribute instance-attribute

hidden: Tensor | None = None

hidden_states class-attribute instance-attribute

hidden_states: HiddenStatesStruct | None = None

ids class-attribute instance-attribute

ids: IdsStruct | None = None

kv_metadata class-attribute instance-attribute

kv_metadata: dict[str, Any] | None = None

language class-attribute instance-attribute

language: list[str] | str | None = None

latent class-attribute instance-attribute

latent: Tensor | None = None

meta class-attribute instance-attribute

meta: MetaStruct | None = None

model_outputs class-attribute instance-attribute

model_outputs: list[Tensor] | None = None

mtp_inputs class-attribute instance-attribute

mtp_inputs: tuple[Tensor, Tensor] | None = None

past_key_values class-attribute instance-attribute

past_key_values: list[int] | None = None

request_id class-attribute instance-attribute

request_id: str | None = None

speaker class-attribute instance-attribute

speaker: list[str] | str | None = None

deserialize_payload

deserialize_payload(
    wire: AdditionalInformationPayload,
) -> OmniPayload

Deserialize an AdditionalInformationPayload back to OmniPayload.

Decodes entries to tensors/lists, then uses unflatten_payload to reconstruct the nested structure.

flatten_payload

flatten_payload(payload: dict[str, Any]) -> dict[str, Any]

Flatten a nested OmniPayload to dotted keys.

Nested sub-dicts under _NESTED_KEYS are expanded: {"codes": {"audio": tensor}} → {"codes.audio": tensor}. hidden_states["layers"] is expanded to hidden_states.layer_N. Top-level values are kept as-is.
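
The flattening rules can be sketched in a few lines of plain Python. This is not the library's implementation: flatten_sketch and NESTED_KEYS are hypothetical names, and the contents of the real _NESTED_KEYS are not shown on this page, so the five category names are assumed here.

```python
from typing import Any

# Assumption: stand-in for _NESTED_KEYS, using the five payload categories.
NESTED_KEYS = ("hidden_states", "embed", "ids", "codes", "meta")


def flatten_sketch(payload: dict[str, Any]) -> dict[str, Any]:
    """Flatten nested category dicts to dotted keys (illustrative only)."""
    flat: dict[str, Any] = {}
    for key, value in payload.items():
        if key in NESTED_KEYS and isinstance(value, dict):
            for sub_key, sub_value in value.items():
                if key == "hidden_states" and sub_key == "layers":
                    # Per-layer entries become hidden_states.layer_N.
                    for layer_idx, layer_value in sub_value.items():
                        flat[f"hidden_states.layer_{layer_idx}"] = layer_value
                else:
                    flat[f"{key}.{sub_key}"] = sub_value
        else:
            # Top-level values are kept as-is.
            flat[key] = value
    return flat


flat = flatten_sketch(
    {
        "request_id": "req-0",
        "codes": {"audio": [1, 2, 3]},  # lists stand in for tensors
        "hidden_states": {"last": [0.1], "layers": {0: [0.0], 12: [0.3]}},
    }
)
print(sorted(flat))
```

Flat dotted keys give every value a single string name, which is presumably what makes the per-entry conversion in serialize_payload straightforward.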

serialize_payload

serialize_payload(
    payload: OmniPayload,
) -> AdditionalInformationPayload | None

Serialize an OmniPayload for EngineCore transport.

Uses flatten_payload to produce dotted keys, then converts each value to an AdditionalInformationEntry.

to_dict

to_dict(struct: OmniPayloadStruct) -> dict[str, Any]

Convert OmniPayloadStruct to a plain dict, dropping None fields.
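
The "dropping None fields" behavior can be illustrated with standard-library dataclasses standing in for the msgspec structs. PayloadSketch, MetaSketch, and to_dict_sketch are hypothetical names, and whether the real function also prunes None values inside nested structs is an assumption of this sketch.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Optional


@dataclass
class MetaSketch:
    """Trimmed-down stand-in for MetaStruct."""
    left_context_size: Optional[int] = None
    codec_streaming: Optional[bool] = None


@dataclass
class PayloadSketch:
    """Trimmed-down stand-in for OmniPayloadStruct."""
    request_id: Optional[str] = None
    generated_len: Optional[int] = None
    meta: Optional[MetaSketch] = None


def to_dict_sketch(obj: Any) -> dict[str, Any]:
    """Convert a dataclass instance to a dict, skipping fields set to None."""
    out: dict[str, Any] = {}
    for field in fields(obj):
        value = getattr(obj, field.name)
        if value is None:
            continue  # unset fields do not appear in the output
        out[field.name] = to_dict_sketch(value) if is_dataclass(value) else value
    return out


result = to_dict_sketch(
    PayloadSketch(request_id="req-0", meta=MetaSketch(left_context_size=4))
)
print(result)
```

Since every struct field defaults to None, pruning keeps the resulting dict limited to the keys a stage actually set.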

to_struct

to_struct(payload: dict[str, Any]) -> OmniPayloadStruct

Convert a payload dict into OmniPayloadStruct, validating types.

Raises msgspec.ValidationError on:

* unknown top-level keys (typos, legacy flat keys)
* unknown sub-keys under any nested category
* type mismatches (e.g., meta.left_context_size not an int)
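
To make the three failure cases concrete, here is a dependency-free sketch of strict schema checking. It does not use msgspec: SCHEMA, SchemaError, check_sketch, and first_error are hypothetical stand-ins, and the schema covers only a handful of the fields listed on this page.

```python
from typing import Any, Optional

# Assumption: a tiny hand-written schema standing in for OmniPayloadStruct.
SCHEMA: dict[str, Any] = {
    "request_id": str,
    "generated_len": int,
    "meta": {"left_context_size": int, "codec_streaming": bool},
    "ids": {"prompt": list, "output": list},
}


class SchemaError(ValueError):
    """Stand-in for msgspec.ValidationError in this sketch."""


def check_sketch(payload: dict[str, Any], schema: dict[str, Any], path: str = "") -> None:
    """Raise SchemaError on unknown keys or mismatched value types."""
    for key, value in payload.items():
        where = f"{path}{key}"
        if key not in schema:
            raise SchemaError(f"unknown key: {where}")
        expected = schema[key]
        if isinstance(expected, dict):
            if not isinstance(value, dict):
                raise SchemaError(f"{where}: expected a dict")
            check_sketch(value, expected, path=f"{where}.")
        elif value is not None and not isinstance(value, expected):
            raise SchemaError(f"{where}: expected {expected.__name__}")


def first_error(payload: dict[str, Any]) -> Optional[str]:
    """Return the validation message, or None if the payload passes."""
    try:
        check_sketch(payload, SCHEMA)
    except SchemaError as exc:
        return str(exc)
    return None


print(first_error({"request_id": "req-0", "meta": {"left_context_size": 4}}))
print(first_error({"left_context_size": 4}))            # legacy flat key
print(first_error({"meta": {"left_ctx": 4}}))           # unknown sub-key
print(first_error({"meta": {"left_context_size": "4"}}))  # wrong type
```

The error messages here are invented for the sketch; the wording of the real msgspec.ValidationError will differ.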

unflatten_payload

unflatten_payload(flat: dict[str, Any]) -> dict[str, Any]

Unflatten dotted keys back to nested dicts.

Reverse of flatten_payload. hidden_states.layer_N keys are collected into hidden_states["layers"].
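
The reverse direction can be sketched the same way. unflatten_sketch and LAYER_PREFIX are hypothetical names; the sketch assumes that top-level keys never contain a dot and that layer indices are non-negative integers.

```python
from typing import Any

LAYER_PREFIX = "layer_"  # assumption: matches the hidden_states.layer_N naming


def unflatten_sketch(flat: dict[str, Any]) -> dict[str, Any]:
    """Rebuild nested dicts from dotted keys (illustrative only)."""
    nested: dict[str, Any] = {}
    for key, value in flat.items():
        if "." not in key:
            nested[key] = value  # top-level value, kept as-is
            continue
        category, sub_key = key.split(".", 1)
        bucket = nested.setdefault(category, {})
        suffix = sub_key[len(LAYER_PREFIX):]
        is_layer = sub_key.startswith(LAYER_PREFIX) and suffix.isdigit()
        if category == "hidden_states" and is_layer:
            # Collect per-layer keys back into an int-keyed "layers" dict.
            bucket.setdefault("layers", {})[int(suffix)] = value
        else:
            bucket[sub_key] = value
    return nested


nested = unflatten_sketch(
    {
        "request_id": "req-0",
        "codes.audio": [1, 2, 3],  # lists stand in for tensors
        "hidden_states.last": [0.1],
        "hidden_states.layer_0": [0.0],
        "hidden_states.layer_12": [0.3],
    }
)
print(nested)
```

Converting the layer suffix back to int restores the dict[int, Tensor] shape declared for layers.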

validate_payload

validate_payload(
    payload: dict[str, Any] | None,
    *,
    context: str = "payload",
) -> None

Validate a payload matches the OmniPayload schema, raising on drift.

Wraps to_struct and re-raises msgspec.ValidationError with the call-site context prepended. None is allowed (treated as "no payload to check").
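
The wrap-and-re-raise pattern described here looks roughly like the following. Everything in the sketch is a stand-in: SketchValidationError replaces msgspec.ValidationError, to_struct_stub replaces to_struct, and the "context: message" format and the "stage-1 output" label are assumptions rather than the library's actual wording.

```python
from typing import Any, Optional


class SketchValidationError(ValueError):
    """Stand-in for msgspec.ValidationError."""


def to_struct_stub(payload: dict[str, Any]) -> None:
    # Hypothetical stand-in for to_struct: checks a single field only.
    if not isinstance(payload.get("request_id", ""), str):
        raise SketchValidationError("request_id: expected str")


def validate_sketch(
    payload: Optional[dict[str, Any]], *, context: str = "payload"
) -> None:
    if payload is None:
        return  # no payload to check
    try:
        to_struct_stub(payload)
    except SketchValidationError as exc:
        # Prepend the call-site context and keep the original error chained.
        raise SketchValidationError(f"{context}: {exc}") from exc


validate_sketch(None)  # passes silently

message = ""
try:
    validate_sketch({"request_id": 7}, context="stage-1 output")
except SketchValidationError as exc:
    message = str(exc)
print(message)
```

Passing a distinct context at each call site makes it possible to tell from the error alone where a malformed payload was produced.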