vllm_omni.request
OmniRequest
Bases: Request
Request class for omni models, extending the base Request.
This class extends the base vLLM Request with support for prompt embeddings and additional information payloads, enabling direct transfer of pre-computed embeddings between stages.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| prompt_embeds | PromptEmbedsPayload \| Tensor \| None | Optional serialized prompt embeddings payload. Used for direct transfer of embeddings between stages. | None |
| additional_information | AdditionalInformationPayload \| None | Optional additional information payload containing tensors or lists to be passed along with the request. | None |
additional_information instance-attribute
additional_information: (
    AdditionalInformationPayload | None
) = additional_information
prompt_embeds_payload instance-attribute
prompt_embeds_payload: PromptEmbedsPayload | None = (
    prompt_embeds
    if isinstance(prompt_embeds, PromptEmbedsPayload)
    else None
)
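The expression above keeps the constructor argument only when it is already a PromptEmbedsPayload; a raw tensor or None leaves this attribute as None. A minimal sketch of that selection follows. The payload fields (`data`, `shape`, `dtype`), the `FakeTensor` class, and the `select_payload` helper are assumptions made for illustration and are not taken from vllm_omni.

```python
from dataclasses import dataclass


@dataclass
class PromptEmbedsPayload:
    """Stand-in for the real payload type; these fields are assumptions."""
    data: bytes
    shape: tuple[int, ...]
    dtype: str


class FakeTensor:
    """Stand-in for a raw tensor so the sketch runs without torch."""


def select_payload(prompt_embeds):
    # Mirrors the documented expression: keep the value only if it is a payload.
    return prompt_embeds if isinstance(prompt_embeds, PromptEmbedsPayload) else None


payload = PromptEmbedsPayload(data=b"\x00" * 8, shape=(1, 2), dtype="float32")
kept = select_payload(payload)            # the same payload object
from_tensor = select_payload(FakeTensor())  # None: not a payload
from_none = select_payload(None)          # None
```

Under these assumptions, only the serialized form is stored in prompt_embeds_payload; how a raw tensor is handled elsewhere is not covered by this sketch.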
from_engine_core_request classmethod
from_engine_core_request(
    request: OmniEngineCoreRequest,
    block_hasher: Callable[[Request], list[BlockHash]] | None,
) -> Request
Create an OmniRequest from an OmniEngineCoreRequest.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| request | OmniEngineCoreRequest | The OmniEngineCoreRequest to convert. | required |
| block_hasher | Callable[[Request], list[BlockHash]] \| None | Optional function to compute block hashes for prefix caching. | required |
Returns:
| Type | Description |
|---|---|
| Request | OmniRequest instance created from the engine core request. |
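The classmethod acts as an alternate constructor that carries the omni-specific fields over from the engine core request. The sketch below illustrates that pattern with stand-in classes only: `CoreRequestStub`, `RequestStub`, `OmniRequestStub`, the `request_id` field, and the rule that hashes are computed only when a hasher is supplied are all assumptions, not the library's actual implementation.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class CoreRequestStub:
    """Hypothetical stand-in for OmniEngineCoreRequest."""
    request_id: str
    prompt_embeds: Any = None
    additional_information: Any = None


class RequestStub:
    """Hypothetical stand-in for the base Request."""

    def __init__(self, request_id: str, block_hasher: Optional[Callable] = None):
        self.request_id = request_id
        # Assumption: hashes are computed only when a hasher is supplied.
        self.block_hashes = block_hasher(self) if block_hasher else []


class OmniRequestStub(RequestStub):
    def __init__(self, request_id, prompt_embeds=None,
                 additional_information=None, block_hasher=None):
        super().__init__(request_id, block_hasher)
        self.prompt_embeds = prompt_embeds
        self.additional_information = additional_information

    @classmethod
    def from_engine_core_request(cls, request, block_hasher):
        # Forward the omni-specific fields alongside the base ones.
        return cls(
            request_id=request.request_id,
            prompt_embeds=request.prompt_embeds,
            additional_information=request.additional_information,
            block_hasher=block_hasher,
        )


core = CoreRequestStub("req-0", additional_information={"speaker": [1, 2]})
req = OmniRequestStub.from_engine_core_request(core, block_hasher=None)
hashed = OmniRequestStub.from_engine_core_request(core, lambda r: ["h0"])
```

Passing `block_hasher=None` in this sketch simply skips hashing, which matches the parameter being typed as optional while still being a required argument.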
OmniStreamingUpdate dataclass
Lightweight data for streaming session continuation, overridden to add additional information.
Contains only the fields needed to update an existing streaming session with new input data.
additional_information class-attribute instance-attribute
additional_information: (
    AdditionalInformationPayload | None
) = None
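As a rough illustration of a dataclass that adds one optional field with a None default, consider the sketch below. The base class `StreamingUpdateStub` and its `new_token_ids` field are hypothetical; the source does not say which fields the real update carries.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass
class StreamingUpdateStub:
    """Hypothetical base update; the field here is an assumption."""
    new_token_ids: list[int]


@dataclass
class OmniStreamingUpdateStub(StreamingUpdateStub):
    # Extra field with a None default, as in the documented attribute.
    additional_information: Optional[Any] = None


update = OmniStreamingUpdateStub(new_token_ids=[1, 2, 3])
with_info = OmniStreamingUpdateStub([4], additional_information={"k": [0.5]})
field_names = [f.name for f in fields(OmniStreamingUpdateStub)]
```

Because the added field has a default, callers that do not supply additional information can construct the update exactly as they would the base type in this sketch.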