vllm_omni.platforms ¶
Modules:
| Name | Description |
|---|---|
| cuda | |
| interface | |
| musa | |
| npu | |
| rocm | |
| xpu | |
builtin_omni_platform_plugins module-attribute ¶
builtin_omni_platform_plugins = {
    "cuda": cuda_omni_platform_plugin,
    "rocm": rocm_omni_platform_plugin,
    "npu": npu_omni_platform_plugin,
    "xpu": xpu_omni_platform_plugin,
    "musa": musa_omni_platform_plugin,
}
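The mapping pairs each platform name with a plugin function that, per the signatures documented later on this page, returns a class path string or None. The following is a minimal sketch of how a registry of this shape could be consumed. The actual selection logic is not shown in this reference, so `first_active_platform`, the stand-in plugin functions, and the class path are all hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

# Stand-in plugins (hypothetical): each returns a class path string when its
# platform should be activated, or None otherwise.
def fake_cuda_plugin() -> Optional[str]:
    return None  # pretend CUDA is not available on this machine

def fake_npu_plugin() -> Optional[str]:
    return "example_pkg.platforms.npu.NPUOmniPlatform"  # invented class path

plugins: Dict[str, Callable[[], Optional[str]]] = {
    "cuda": fake_cuda_plugin,
    "npu": fake_npu_plugin,
}

def first_active_platform(
    registry: Dict[str, Callable[[], Optional[str]]],
) -> Optional[Tuple[str, str]]:
    """Return (name, qualname) for the first plugin that reports active."""
    for name, plugin in registry.items():
        qualname = plugin()
        if qualname is not None:
            return name, qualname
    return None  # no plugin reported an active platform

result = first_active_platform(plugins)
print(result)
```

Returning None rather than raising leaves the decision about a fallback to the caller; a real resolver may instead treat "no platform" or "several platforms" as an error.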
OmniPlatform ¶
Bases: Platform
Abstract base class for vllm-omni platforms.
Inherits from vLLM's Platform, so OmniPlatform has all vLLM Platform capabilities plus the Omni-specific methods listed below.
create_autocast_context classmethod ¶
get_diffusion_attn_backend_cls classmethod ¶
Get the diffusion attention backend class path for this platform.
This method selects the appropriate attention backend for diffusion models based on platform capabilities and user preferences.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| selected_backend | str \| None | User-selected backend name (e.g., "FLASH_ATTN", "TORCH_SDPA", "SAGE_ATTN"). If None, uses platform default. | required |
| head_size | int | Attention head size. | required |
Returns:
| Type | Description |
|---|---|
| str | Fully qualified class path of the selected backend. |
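To illustrate the contract described above (an optional user choice, a platform default, and a class path string as the result), here is a rough sketch of one way a platform could implement the selection. The backend names come from the parameter description; the class paths, the default, the head-size limit, and the function name `pick_backend` are assumptions made for the example and do not reflect vllm-omni's actual rules.

```python
from typing import Optional

# Hypothetical backend table: class paths are invented for illustration.
BACKENDS = {
    "FLASH_ATTN": "example_pkg.attention.FlashAttentionBackend",
    "TORCH_SDPA": "example_pkg.attention.SDPABackend",
    "SAGE_ATTN": "example_pkg.attention.SageAttentionBackend",
}
DEFAULT_BACKEND = "TORCH_SDPA"  # assumed platform default
MAX_FLASH_HEAD_SIZE = 256       # assumed capability limit

def pick_backend(selected_backend: Optional[str], head_size: int) -> str:
    """Return the class path for the requested backend, or a fallback."""
    # None means "no user preference": use the platform default.
    name = selected_backend or DEFAULT_BACKEND
    if name not in BACKENDS:
        raise ValueError(f"Unknown diffusion attention backend: {name!r}")
    # Fall back when the requested backend cannot handle this head size.
    if name == "FLASH_ATTN" and head_size > MAX_FLASH_HEAD_SIZE:
        name = DEFAULT_BACKEND
    return BACKENDS[name]

print(pick_backend(None, 64))
print(pick_backend("FLASH_ATTN", 128))
```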
get_diffusion_model_impl_qualname classmethod ¶
get_diffusion_model_runner_cls classmethod ¶
get_diffusion_model_runner_cls() -> str
Get the diffusion model runner class path for this platform.
Returns a fully qualified class path string. The class must be compatible with the DiffusionModelRunner interface.
get_diffusion_packed_modules_mapping classmethod ¶
get_diffusion_worker_cls classmethod ¶
get_diffusion_worker_cls() -> str
Get the diffusion worker class path for this platform.
Returns a fully qualified class path string that will be resolved and instantiated by WorkerWrapperBase. The class must be compatible with the DiffusionWorker interface.
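Several methods on this page return a fully qualified class path string instead of a class object. As a generic illustration of how such a string can be turned into a class, the sketch below uses only the standard library; `resolve_class` is a hypothetical helper, and vLLM's own resolution inside WorkerWrapperBase may differ.

```python
import importlib

def resolve_class(qualname: str) -> type:
    """Import the module part of 'pkg.module.ClassName' and return the class."""
    module_name, _, class_name = qualname.rpartition(".")
    if not module_name:
        raise ValueError(f"Expected 'module.ClassName', got {qualname!r}")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)

# A standard-library class stands in for a worker class here.
cls = resolve_class("collections.OrderedDict")
instance = cls(a=1)
print(cls.__name__, instance["a"])
```

Returning a string keeps the platform module free of heavy device-specific imports until the class is actually needed.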
get_graph_wrapper_cls classmethod ¶
get_graph_wrapper_cls() -> type
Return the platform's full-graph wrapper class.
Defaults to vLLM's CUDAGraphWrapper; NPU overrides with ACLGraphWrapper.
has_flash_attn_package classmethod ¶
has_flash_attn_package() -> bool
Check if a Flash Attention package is available and usable on this platform.
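A check of this kind usually starts with asking whether a package can be imported at all. The sketch below shows only that first step using the standard library. The helper name `package_available` is hypothetical, and the package a given platform actually probes is not stated in this reference. Note that importability covers "available" but not "usable", which may also depend on the device and the installed build.

```python
import importlib.util

def package_available(name: str) -> bool:
    """Return True if a top-level package can be found without importing it."""
    # find_spec returns None when a top-level module is not installed.
    return importlib.util.find_spec(name) is not None

print(package_available("json"))
print(package_available("surely_not_an_installed_package_xyz"))
```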
prepare_diffusion_op_runtime classmethod ¶
set_device_control_env_var classmethod ¶
set_forward_context classmethod ¶
set_forward_context(
    attn_metadata: Any,
    vllm_config: VllmConfig,
    *,
    cudagraph_runtime_mode: CUDAGraphMode,
    batch_descriptor: BatchDescriptor,
)
Platform-neutral wrapper around the device's set_forward_context.
Defaults to vLLM's set_forward_context; NPU overrides to dispatch to set_ascend_forward_context (renaming cudagraph_runtime_mode to aclgraph_runtime_mode).
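The override described above is a small adapter: callers always pass `cudagraph_runtime_mode`, and the NPU platform forwards it under a different keyword. The sketch below shows that pattern with stand-in functions and classes. Every name prefixed with `fake_`, as well as `BasePlatformSketch` and `NPUPlatformSketch`, is hypothetical, and the stand-ins simply return the keyword arguments they receive so the renaming is visible.

```python
from typing import Any, Dict

# Stand-ins for the device-level functions (hypothetical).
def fake_set_forward_context(attn_metadata: Any, config: Any, **kwargs: Any) -> Dict[str, Any]:
    return dict(kwargs)

def fake_set_ascend_forward_context(attn_metadata: Any, config: Any, **kwargs: Any) -> Dict[str, Any]:
    return dict(kwargs)

class BasePlatformSketch:
    @classmethod
    def set_forward_context(cls, attn_metadata, vllm_config, *,
                            cudagraph_runtime_mode, batch_descriptor):
        # Default: pass arguments through unchanged.
        return fake_set_forward_context(
            attn_metadata, vllm_config,
            cudagraph_runtime_mode=cudagraph_runtime_mode,
            batch_descriptor=batch_descriptor,
        )

class NPUPlatformSketch(BasePlatformSketch):
    @classmethod
    def set_forward_context(cls, attn_metadata, vllm_config, *,
                            cudagraph_runtime_mode, batch_descriptor):
        # Override: same public signature, but the mode is forwarded
        # under the keyword the device function expects.
        return fake_set_ascend_forward_context(
            attn_metadata, vllm_config,
            aclgraph_runtime_mode=cudagraph_runtime_mode,
            batch_descriptor=batch_descriptor,
        )

base_kwargs = BasePlatformSketch.set_forward_context(
    "meta", "cfg", cudagraph_runtime_mode="FULL", batch_descriptor="bd")
npu_kwargs = NPUPlatformSketch.set_forward_context(
    "meta", "cfg", cudagraph_runtime_mode="FULL", batch_descriptor="bd")
print(sorted(base_kwargs), sorted(npu_kwargs))
```

Keeping the public signature identical across platforms means calling code does not need to branch on the device type.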
cuda_omni_platform_plugin ¶
cuda_omni_platform_plugin() -> str | None
Check if CUDA OmniPlatform should be activated.
musa_omni_platform_plugin ¶
musa_omni_platform_plugin() -> str | None
Check if MUSA OmniPlatform should be activated.
npu_omni_platform_plugin ¶
npu_omni_platform_plugin() -> str | None
Check if NPU OmniPlatform should be activated.
resolve_current_omni_platform_cls_qualname ¶
resolve_current_omni_platform_cls_qualname() -> str
Resolve the current OmniPlatform class qualified name.