vllm_omni.engine.arg_utils ¶
SHARED_FIELDS module-attribute ¶
SHARED_FIELDS: frozenset[str] = frozenset(
{
"model",
"stage_id",
"log_stats",
"stage_configs_path",
"async_chunk",
"tokenizer",
}
)
OmniAsyncEngineArgs dataclass ¶
Bases: AsyncEngineArgs, OmniEngineArgs
output_modality property ¶
output_modality: OutputModality
Parse engine_output_type into a type-safe OutputModality flag.
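The parsing step described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the `OutputModality` members shown (`TEXT`, `AUDIO`, `IMAGE`), the comma-separated input format, and the helper name `parse_output_modality` are all assumptions made for the example.

```python
from enum import Flag, auto
from typing import Optional


class OutputModality(Flag):
    """Hypothetical stand-in; the real enum's members are not shown on this page."""
    NONE = 0
    TEXT = auto()
    AUDIO = auto()
    IMAGE = auto()


def parse_output_modality(engine_output_type: Optional[str]) -> OutputModality:
    """Map a string such as "text,audio" to a combined flag (assumed format)."""
    if not engine_output_type:
        return OutputModality.NONE
    result = OutputModality.NONE
    for name in engine_output_type.split(","):
        key = name.strip().upper()
        if not key:
            continue  # tolerate stray commas
        try:
            result |= OutputModality[key]  # look up the member by name
        except KeyError:
            raise ValueError(f"unknown output modality: {name!r}") from None
    return result
```

A flag enum lets callers test membership with `OutputModality.AUDIO in value` instead of comparing raw strings, which is presumably what "type-safe" refers to here.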
OmniEngineArgs dataclass ¶
Bases: EngineArgs
Engine arguments for omni models, extending base EngineArgs. Adds omni-specific configuration fields for multi-stage pipeline processing and output type specification.
Args:
    stage_id: Identifier for the stage in a multi-stage pipeline. Defaults to 0 for per-stage engine construction. The CLI-level single-stage selector remains optional on the parsed argparse namespace and should not be forwarded as a nullable per-stage engine argument.
    model_stage: Stage type identifier, e.g., "thinker" or "talker" (default: "thinker").
    model_arch: Model architecture name (default: "Qwen2_5OmniForConditionalGeneration").
    engine_output_type: Optional output type specification for the engine. Used to route outputs to the appropriate processors (e.g., "image", "audio", "latents"). If None, the output type is inferred.
    hf_config_name: Optional key of the HF config subkey to extract for this stage, e.g., talker_config. If None, the default HF config is used.
    custom_process_next_stage_input_func: Optional path to a custom function for processing inputs from previous stages. If None, default processing is used.
    stage_connector_spec: Extra configuration for the stage connector.
    async_chunk: If True, perform async chunking.
    worker_type: Model type, e.g., "ar" or "generation".
    task_type: Default task type for TTS models (CustomVoice, VoiceDesign, or Base). If not specified, it is inferred from the model path.
    omni_master_address: TCP address that the OmniMasterServer (running inside AsyncOmniEngine) listens on for engine core registrations. Required when single-stage mode is active.
    omni_master_port: TCP port for the OmniMasterServer registration socket. Required when single-stage mode is active.
    stage_configs_path: Optional path to a JSON/YAML file containing stage configurations for the multi-stage pipeline. If None, stage configs are resolved from the model's default configuration.
    output_modalities: Optional list of output modality names to enable (e.g., ["text", "audio"]). If None, all modalities supported by the model are used.
    log_stats: Whether to log engine statistics. Defaults to False.
    custom_pipeline_args: Dictionary of arguments for custom pipeline initialization (e.g., {"pipeline_class": "my.Module"}). Passed through to the diffusion stage engine.
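To make the documented defaults concrete, here is a small stand-in sketch. It does not import vllm_omni: `BaseArgsSketch` and `OmniArgsSketch` are hypothetical classes that mirror only a few of the fields listed above, the `None` defaults are assumed for fields the text describes as optional, and the model name is a placeholder.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BaseArgsSketch:
    """Hypothetical stand-in for the base EngineArgs; only `model` is modeled."""
    model: str = ""


@dataclass
class OmniArgsSketch(BaseArgsSketch):
    """Stand-in carrying a subset of the omni-specific fields described above."""
    stage_id: int = 0                      # documented default
    model_stage: str = "thinker"           # documented default
    model_arch: str = "Qwen2_5OmniForConditionalGeneration"  # documented default
    engine_output_type: Optional[str] = None   # assumed default: inferred when None
    hf_config_name: Optional[str] = None       # assumed default: default HF config
    log_stats: bool = False                # documented default


# Per-stage construction: a second stage that produces audio (placeholder values).
talker = OmniArgsSketch(
    model="my-org/my-omni-model",
    stage_id=1,
    model_stage="talker",
    hf_config_name="talker_config",
    engine_output_type="audio",
)
```

Each stage gets its own argument object, so per-stage values such as `hf_config_name` can differ while `model` stays the same across stages.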
custom_pipeline_args class-attribute instance-attribute ¶
custom_process_next_stage_input_func class-attribute instance-attribute ¶
custom_process_next_stage_input_func: str | None = None
stage_connector_spec class-attribute instance-attribute ¶
subtalker_sampling_params class-attribute instance-attribute ¶
create_model_config ¶
create_model_config() -> OmniModelConfig
Create an OmniModelConfig from these engine arguments.
Returns: OmniModelConfig instance with all configuration fields set.
OrchestratorArgs dataclass ¶
CLI flags consumed by the orchestrator.
Every field here is either:
(a) orchestrator-only (never needed by a stage engine), or
(b) orchestrator-read-then-redistributed (e.g. async_chunk is read from the CLI, written to DeployConfig, then propagated to every stage via merge_pipeline_deploy, not via direct kwargs forwarding).
Fields that both the orchestrator and the engine genuinely need (e.g. model, log_stats) should be listed in SHARED_FIELDS.
auxiliary_text_encoder class-attribute instance-attribute ¶
auxiliary_text_encoder: str | None = None
default_sampling_params class-attribute instance-attribute ¶
default_sampling_params: str | None = None
diffusion_attention_backend class-attribute instance-attribute ¶
diffusion_attention_backend: str | None = None
diffusion_attention_config class-attribute instance-attribute ¶
diffusion_attention_config: str | None = None
diffusion_kv_cache_dtype class-attribute instance-attribute ¶
diffusion_kv_cache_dtype: str | None = None
diffusion_kv_cache_skip_layers class-attribute instance-attribute ¶
diffusion_kv_cache_skip_layers: str | None = None
diffusion_kv_cache_skip_steps class-attribute instance-attribute ¶
diffusion_kv_cache_skip_steps: str | None = None
diffusion_quantization_config class-attribute instance-attribute ¶
diffusion_quantization_config: str | None = None
enable_cache_dit_summary class-attribute instance-attribute ¶
enable_cache_dit_summary: bool = False
enable_diffusion_pipeline_profiler class-attribute instance-attribute ¶
enable_diffusion_pipeline_profiler: bool = False
enable_layerwise_offload class-attribute instance-attribute ¶
enable_layerwise_offload: bool = False
enable_multithread_weight_load class-attribute instance-attribute ¶
enable_multithread_weight_load: bool = True
max_generated_image_size class-attribute instance-attribute ¶
max_generated_image_size: int | None = None
internal_blacklist_keys ¶
Return the set of CLI keys that must never be forwarded as per-stage engine overrides.
Derived from the OrchestratorArgs fields minus SHARED_FIELDS, so adding a new orchestrator-owned flag is a one-line change to the dataclass; this function picks it up automatically.
orchestrator_field_names ¶
Return the names of every field on OrchestratorArgs.
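The derivation described for internal_blacklist_keys can be sketched with `dataclasses.fields`. This is an illustration under stated assumptions, not the module's source: `OrchestratorArgsSketch` is a hypothetical, trimmed-down dataclass (the presence of `model` and `log_stats` on it is assumed), the `frozenset` return types are assumed, and `filter_stage_overrides` is a helper invented for the example.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Optional

SHARED_FIELDS = frozenset(
    {"model", "stage_id", "log_stats", "stage_configs_path", "async_chunk", "tokenizer"}
)


@dataclass
class OrchestratorArgsSketch:
    """Hypothetical subset of OrchestratorArgs, for illustration only."""
    model: str = ""                               # assumed shared field
    log_stats: bool = False                       # assumed shared field
    enable_layerwise_offload: bool = False        # orchestrator-owned
    max_generated_image_size: Optional[int] = None  # orchestrator-owned


def orchestrator_field_names() -> frozenset:
    """Names of every field on the (sketch) orchestrator dataclass."""
    return frozenset(f.name for f in fields(OrchestratorArgsSketch))


def internal_blacklist_keys() -> frozenset:
    """Orchestrator fields minus shared ones: keys never forwarded to a stage."""
    return orchestrator_field_names() - SHARED_FIELDS


def filter_stage_overrides(cli_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: drop blacklisted keys before per-stage forwarding."""
    blocked = internal_blacklist_keys()
    return {k: v for k, v in cli_kwargs.items() if k not in blocked}
```

Because the blacklist is computed from the dataclass fields, a new orchestrator-owned flag added to the dataclass is excluded from per-stage overrides without touching the filtering code.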