vllm_omni.platforms ¶

Modules:

Name	Description
`cuda`
`interface`
`musa`
`npu`
`rocm`
`xpu`

builtin_omni_platform_plugins `module-attribute` ¶

builtin_omni_platform_plugins = {
    "cuda": cuda_omni_platform_plugin,
    "rocm": rocm_omni_platform_plugin,
    "npu": npu_omni_platform_plugin,
    "xpu": xpu_omni_platform_plugin,
    "musa": musa_omni_platform_plugin,
}
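
The following is a minimal sketch of how a registry of this shape could be consumed. It relies only on the documented contract that each plugin function returns `str | None`. The plugin functions, the class path, and the `first_active_plugin` helper are hypothetical stand-ins, and the real resolution logic in vllm-omni may differ (for example, in how it handles several plugins activating at once).

```python
from typing import Callable, Optional

# Hypothetical stand-ins for the real plugin functions. Each returns a
# class path string when its platform is detected, or None otherwise.
def fake_cuda_plugin() -> Optional[str]:
    return None  # pretend CUDA is not available

def fake_npu_plugin() -> Optional[str]:
    return "my_pkg.platforms.npu.NPUOmniPlatform"  # hypothetical path

plugins: dict[str, Callable[[], Optional[str]]] = {
    "cuda": fake_cuda_plugin,
    "npu": fake_npu_plugin,
}

def first_active_plugin(registry):
    """Return (name, class_path) for the first plugin that activates."""
    for name, plugin in registry.items():
        qualname = plugin()
        if qualname is not None:
            return name, qualname
    return None  # no plugin reported an available platform

active = first_active_plugin(plugins)
print(active)
```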

current_omni_platform `module-attribute` ¶

current_omni_platform: OmniPlatform

logger `module-attribute` ¶

logger = logging.getLogger(__name__)

OmniPlatform ¶

Bases: Platform

Abstract base class for vllm-omni Platform.

Inherits from vLLM's Platform and adds Omni-specific interfaces, so OmniPlatform exposes every vLLM Platform capability alongside the Omni-specific methods.

create_autocast_context `classmethod` ¶

create_autocast_context(
    *, device_type: str, dtype: dtype, enabled: bool = True
)

get_default_stage_config_path `classmethod` ¶

get_default_stage_config_path() -> str

get_device_count `classmethod` ¶

get_device_count() -> int

get_device_memory `classmethod` ¶

get_device_memory(
    device: device | None = None,
) -> tuple[int, int]

get_device_version `classmethod` ¶

get_device_version() -> str | None

get_diffusion_attn_backend_cls `classmethod` ¶

get_diffusion_attn_backend_cls(
    selected_backend: str | None, head_size: int
) -> str

Get the diffusion attention backend class path for this platform.

This method selects the appropriate attention backend for diffusion models based on platform capabilities and user preferences.

Parameters:

Name	Type	Description	Default
`selected_backend`	`str \| None`	User-selected backend name (e.g., "FLASH_ATTN", "TORCH_SDPA", "SAGE_ATTN"). If None, uses platform default.	required
`head_size`	`int`	Attention head size.	required

Returns:

Type	Description
`str`	Fully qualified class path of the selected backend.
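
As an illustration of the selection contract described above (explicit choice if given, otherwise a platform default), here is a minimal sketch. The registry, the class paths, the default, and the error behavior are all assumptions made for the example. The reference does not document how `head_size` influences the choice, so the sketch accepts it without using it.

```python
from typing import Optional

# Hypothetical registry: these class paths are placeholders, not the
# real vllm-omni backend paths.
HYPOTHETICAL_BACKENDS = {
    "FLASH_ATTN": "example_pkg.attn.FlashAttentionBackend",
    "TORCH_SDPA": "example_pkg.attn.SDPABackend",
}
HYPOTHETICAL_DEFAULT = "TORCH_SDPA"  # assumed platform default

def pick_backend(selected_backend: Optional[str], head_size: int) -> str:
    """Return a class path for the requested backend, or the default."""
    # head_size mirrors the documented signature; this sketch ignores it.
    name = HYPOTHETICAL_DEFAULT if selected_backend is None else selected_backend
    try:
        return HYPOTHETICAL_BACKENDS[name]
    except KeyError:
        raise ValueError(f"Unsupported backend: {name!r}") from None

print(pick_backend(None, 64))
print(pick_backend("FLASH_ATTN", 128))
```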

get_diffusion_model_impl_qualname `classmethod` ¶

get_diffusion_model_impl_qualname(op_name: str) -> str

get_diffusion_model_runner_cls `classmethod` ¶

get_diffusion_model_runner_cls() -> str

Get the diffusion model runner class path for this platform.

Returns a fully qualified class path string. The class must be compatible with the DiffusionModelRunner interface.

get_diffusion_packed_modules_mapping `classmethod` ¶

get_diffusion_packed_modules_mapping(
    model_class: type[Module],
) -> dict[str, list[str]] | None

get_diffusion_worker_cls `classmethod` ¶

get_diffusion_worker_cls() -> str

Get the diffusion worker class path for this platform.

Returns a fully qualified class path string that will be resolved and instantiated by WorkerWrapperBase. The class must be compatible with the DiffusionWorker interface.
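
Several methods on this class return a fully qualified class path rather than a class object. The sketch below shows one generic way to turn such a string into a class using only the standard library. `resolve_class` is a hypothetical helper written for this example, not the resolver that WorkerWrapperBase uses, and `collections.OrderedDict` stands in for a real worker class path.

```python
import importlib

def resolve_class(qualname: str) -> type:
    """Import the module part of a dotted path and return the named class."""
    module_name, _, class_name = qualname.rpartition(".")
    if not module_name:
        raise ValueError(f"Expected 'package.module.ClassName', got {qualname!r}")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)

# A standard-library class stands in for a platform-specific worker class.
cls = resolve_class("collections.OrderedDict")
print(cls.__name__)
```

Returning a string instead of a class defers the import until the class is actually needed, which avoids loading device-specific modules on platforms where they are unavailable.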

get_free_memory `classmethod` ¶

get_free_memory(device: device | None = None) -> int

get_graph_wrapper_cls `classmethod` ¶

get_graph_wrapper_cls() -> type

Return the platform's full-graph wrapper class.

Defaults to vLLM's CUDAGraphWrapper; NPU overrides with ACLGraphWrapper.

get_omni_ar_worker_cls `classmethod` ¶

get_omni_ar_worker_cls() -> str

get_omni_generation_worker_cls `classmethod` ¶

get_omni_generation_worker_cls() -> str

get_profiler_cls `classmethod` ¶

get_profiler_cls() -> str

Get the profiler class for this platform.

Returns:

Type	Description
`str`	Fully qualified class path of the profiler. The default is the base OmniTorchProfilerWrapper.

get_torch_device `classmethod` ¶

get_torch_device(local_rank: int | None = None) -> device

has_flash_attn_package `classmethod` ¶

has_flash_attn_package() -> bool

Check if a Flash Attention package is available and usable on this platform.

init_diffusion_worker_vllm_config `classmethod` ¶

init_diffusion_worker_vllm_config(vllm_config: Any) -> None

Initialize platform-specific state for diffusion worker VllmConfig.

is_cuda ¶

is_cuda() -> bool

is_musa ¶

is_musa() -> bool

is_npu ¶

is_npu() -> bool

is_out_of_tree ¶

is_out_of_tree() -> bool

is_rocm ¶

is_rocm() -> bool

is_xpu ¶

is_xpu() -> bool

prepare_diffusion_op_runtime `classmethod` ¶

prepare_diffusion_op_runtime(
    op_name: str, **kwargs: Any
) -> None

set_device_control_env_var `classmethod` ¶

set_device_control_env_var(
    devices: str | int | None,
) -> None

set_forward_context `classmethod` ¶

set_forward_context(
    attn_metadata: Any,
    vllm_config: VllmConfig,
    *,
    cudagraph_runtime_mode: CUDAGraphMode,
    batch_descriptor: BatchDescriptor,
)

Platform-neutral wrapper around the device's set_forward_context.

Defaults to vLLM's set_forward_context; NPU overrides to dispatch to set_ascend_forward_context (renaming cudagraph_runtime_mode to aclgraph_runtime_mode).
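
The keyword renaming described above can be sketched as a thin forwarding wrapper. The example assumes the wrapped function behaves as a context manager. Both functions here are hypothetical stand-ins that only record calls; they are not the vLLM or Ascend implementations.

```python
from contextlib import contextmanager

events = []  # records what the stand-in context manager observes

@contextmanager
def device_forward_context(attn_metadata, config, *,
                           aclgraph_runtime_mode, batch_descriptor):
    # Hypothetical device-specific context manager with its own keyword name.
    events.append(("enter", aclgraph_runtime_mode))
    try:
        yield
    finally:
        events.append(("exit", aclgraph_runtime_mode))

def neutral_forward_context(attn_metadata, config, *,
                            cudagraph_runtime_mode, batch_descriptor):
    # Platform-neutral signature: forward everything, renaming one keyword.
    return device_forward_context(
        attn_metadata,
        config,
        aclgraph_runtime_mode=cudagraph_runtime_mode,
        batch_descriptor=batch_descriptor,
    )

with neutral_forward_context(None, None,
                             cudagraph_runtime_mode="FULL",
                             batch_descriptor=None):
    events.append(("body", None))

print(events)
```

Callers use a single keyword name regardless of device, and the per-platform override absorbs the naming difference.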

supports_cpu_offload `classmethod` ¶

supports_cpu_offload() -> bool

supports_float64 `classmethod` ¶

supports_float64() -> bool

supports_torch_inductor `classmethod` ¶

supports_torch_inductor() -> bool

Check if the platform supports torch.compile with inductor backend.

synchronize `classmethod` ¶

synchronize() -> None

unset_device_control_env_var `classmethod` ¶

unset_device_control_env_var() -> None

OmniPlatformEnum ¶

Bases: Enum

Enum for supported Omni platforms.

CUDA `class-attribute` `instance-attribute` ¶

CUDA = 'cuda'

MUSA `class-attribute` `instance-attribute` ¶

MUSA = 'musa'

NPU `class-attribute` `instance-attribute` ¶

NPU = 'npu'

OOT `class-attribute` `instance-attribute` ¶

OOT = 'oot'

ROCM `class-attribute` `instance-attribute` ¶

ROCM = 'rocm'

UNSPECIFIED `class-attribute` `instance-attribute` ¶

UNSPECIFIED = 'unspecified'

XPU `class-attribute` `instance-attribute` ¶

XPU = 'xpu'
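
Because every member has a string value, a platform name can be mapped to a member with a standard `Enum` value lookup. The sketch below uses a local copy of the documented values so that it runs without vllm-omni installed. `PlatformKind` and `parse_platform` are hypothetical names, and falling back to `UNSPECIFIED` for unknown names is an assumption of this example, not documented behavior.

```python
from enum import Enum

class PlatformKind(Enum):
    """Local copy of the values documented for OmniPlatformEnum."""
    CUDA = "cuda"
    MUSA = "musa"
    NPU = "npu"
    OOT = "oot"
    ROCM = "rocm"
    UNSPECIFIED = "unspecified"
    XPU = "xpu"

def parse_platform(name: str) -> PlatformKind:
    """Look up a member by value, case-insensitively."""
    try:
        return PlatformKind(name.lower())
    except ValueError:
        # Assumed fallback for names that match no member.
        return PlatformKind.UNSPECIFIED

print(parse_platform("ROCM"))
print(parse_platform("tpu"))
```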

cuda_omni_platform_plugin ¶

cuda_omni_platform_plugin() -> str | None

Check if CUDA OmniPlatform should be activated.

musa_omni_platform_plugin ¶

musa_omni_platform_plugin() -> str | None

Check if MUSA OmniPlatform should be activated.

npu_omni_platform_plugin ¶

npu_omni_platform_plugin() -> str | None

Check if NPU OmniPlatform should be activated.

resolve_current_omni_platform_cls_qualname ¶

resolve_current_omni_platform_cls_qualname() -> str

Resolve the current OmniPlatform class qualified name.

rocm_omni_platform_plugin ¶

rocm_omni_platform_plugin() -> str | None

Check if ROCm OmniPlatform should be activated.

xpu_omni_platform_plugin ¶

xpu_omni_platform_plugin() -> str | None

Check if XPU OmniPlatform should be activated.
