
Configuration Options

This section lists the most common options for running vLLM-Omni.

For options within a vLLM engine, please refer to vLLM Configuration.

Currently, the main options are maintained in the stage configs for each model.

For a specific example, see the Qwen2.5-Omni deploy config. The matching frozen pipeline topology lives at vllm_omni/model_executor/models/qwen2_5_omni/pipeline.py.

For an introduction to stage configs, see Introduction for stage config.

Memory Configuration

Multi-Stage Recipes

  • Prefill-Decode Disaggregation - How to derive a PD-aware Qwen3-Omni stage config from the default config without introducing another bundled YAML
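The idea of deriving a variant from the default config, rather than shipping a second bundled YAML, can be sketched as a recursive overlay. This is only an illustration of the general pattern, not vLLM-Omni's actual mechanism: the `deep_merge` helper, the stage names, and the `role` key are all hypothetical, and the real stage-config schema is described in the linked recipe.

```python
import copy


def deep_merge(base, override):
    """Return a new dict with `override` applied recursively on top of `base`."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, and new keys replace the base value outright.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical default config (key and stage names are assumptions).
default_config = {
    "stages": {
        "stage_a": {"role": "both", "engine_args": {"gpu_memory_utilization": 0.8}},
        "stage_b": {"engine_args": {"gpu_memory_utilization": 0.1}},
    }
}

# Only the fields that differ for the derived variant are written down.
pd_overrides = {"stages": {"stage_a": {"role": "prefill"}}}

pd_config = deep_merge(default_config, pd_overrides)
```

Keeping the overrides small means the derived config picks up future changes to the default automatically, and the default dict itself is left unmodified.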

Optimization Features