vllm_omni.engine.orchestrator ¶
Orchestrator for vLLM-Omni multi-stage runtime.
Runs inside a background thread with its own asyncio event loop. Owns logical request progression across stage pools and handles stage-to-stage transfer logic.
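The threading model described above can be illustrated with a minimal, generic sketch. It is not the Orchestrator's code: `BackgroundLoop` and `advance_request` are hypothetical names, and the sketch only shows the standard-library pattern of running a private asyncio event loop in a background thread and handing it work from another thread.

```python
import asyncio
import threading
from concurrent.futures import Future


class BackgroundLoop:
    """Hypothetical helper: owns an asyncio loop running in a daemon thread."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self) -> None:
        # The loop belongs to this thread only.
        asyncio.set_event_loop(self._loop)
        self._loop.run_forever()

    def submit(self, coro) -> Future:
        # Thread-safe hand-off from the caller's thread to the loop thread.
        return asyncio.run_coroutine_threadsafe(coro, self._loop)

    def stop(self) -> None:
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


async def advance_request(request_id: str, stages: list[str]) -> list[str]:
    """Hypothetical stand-in for moving one request through its stages."""
    trail = []
    for stage in stages:
        await asyncio.sleep(0)  # yield to the loop, as real stage I/O would
        trail.append(f"{request_id}:{stage}")
    return trail


bg = BackgroundLoop()
result = bg.submit(advance_request("req-1", ["stage0", "stage1"])).result(timeout=5)
bg.stop()
```

Callers on other threads never touch the loop directly; they go through `run_coroutine_threadsafe`, which keeps all request state confined to the loop thread.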
In distributed mode (coordinator_pub_address provided), it also owns the single OmniCoordClientForHub and runs a _watch_replica_list task that converts replica disappearances into unregister_remote_replica control messages. It also handles the register_remote_replica / unregister_remote_replica flow that attaches or detaches head-side stage clients for headless replicas.
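As a rough illustration of turning replica disappearances into control messages, the sketch below diffs successive snapshots of a replica list. The function name `disappeared_replicas`, the message dictionary shape, and the replica IDs are all assumptions made for this example; the actual message format used by _watch_replica_list is not documented here.

```python
def disappeared_replicas(previous: set[str], current: set[str]) -> list[dict[str, str]]:
    """Return one hypothetical control message per replica that vanished."""
    return [
        {"type": "unregister_remote_replica", "replica_id": replica_id}
        for replica_id in sorted(previous - current)
    ]


# Simulated sequence of replica-list snapshots (illustrative data only).
snapshots = [{"r0", "r1"}, {"r0"}, {"r0", "r2"}, set()]

known: set[str] = set()
emitted: list[dict[str, str]] = []
for snapshot in snapshots:
    emitted.extend(disappeared_replicas(known, snapshot))
    known = snapshot
```

Only removals produce messages in this sketch; newly appearing replicas are assumed to go through the separate register_remote_replica flow.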
Orchestrator ¶
Runs inside a background thread's asyncio event loop.
OrchestratorRequestState dataclass ¶
Per-request bookkeeping inside the Orchestrator.
final_output_stage_ids class-attribute instance-attribute ¶
finished_final_output_stage_ids class-attribute instance-attribute ¶
pd_prefill_multimodal_output class-attribute instance-attribute ¶
pipeline_timings class-attribute instance-attribute ¶
sampling_params_list class-attribute instance-attribute ¶
stage_submit_ts class-attribute instance-attribute ¶
streaming class-attribute instance-attribute ¶
streaming: StreamingInputState = field(
    default_factory=lambda: StreamingInputState()
)
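The `default_factory` in the declaration above means each request state gets its own streaming-state object instead of sharing one. A minimal sketch with hypothetical stand-in classes (`StreamState` and `RequestState`, whose fields are invented for illustration) shows the effect:

```python
from dataclasses import dataclass, field


@dataclass
class StreamState:
    """Hypothetical stand-in for StreamingInputState; the field is invented."""
    chunks: list[str] = field(default_factory=list)


@dataclass
class RequestState:
    """Hypothetical stand-in for OrchestratorRequestState."""
    request_id: str
    # The factory runs once per instance, so state is never shared.
    streaming: StreamState = field(default_factory=lambda: StreamState())


a = RequestState("req-a")
b = RequestState("req-b")
a.streaming.chunks.append("chunk-0")
```

After this runs, `b.streaming.chunks` is still empty, because `a` and `b` hold distinct `StreamState` instances.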
StreamingInputState dataclass ¶
build_engine_core_request_from_tokens ¶
build_engine_core_request_from_tokens(
    request_id: str,
    prompt: dict[str, Any],
    params: SamplingParams | PoolingParams,
    arrival_time: float | None = None,
    model_config: ModelConfig | None = None,
    resumable: bool = False,
    mm_features: list | None = None,
) -> OmniEngineCoreRequest
Build an OmniEngineCoreRequest directly from an OmniTokensPrompt.