vllm_omni.profiler.omni_torch_profiler ¶
TorchProfilerActivity module-attribute ¶
TorchProfilerActivity = Literal[
"CPU", "CUDA", "XPU", "NPU", "MUSA"
]
TorchProfilerActivityMap module-attribute ¶
OmniTorchProfilerWrapper ¶
Bases: WorkerProfiler
Base torch profiler wrapper with platform-agnostic functionality.
Provides common profiler features:
- Custom trace file naming with stage/rank info
- Background gzip compression via subprocess
- Returns trace file paths from get_results() for orchestrator collection
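As an illustration of the background compression feature, here is a minimal sketch of compressing a trace file in a child process so the worker is not blocked. The helper name `compress_in_background` and the sample file name are hypothetical, and the sketch uses Python's own `gzip` command-line entry point for portability; the actual wrapper may invoke a different command.

```python
import gzip
import os
import subprocess
import sys
import tempfile


def compress_in_background(trace_path: str) -> subprocess.Popen:
    """Start gzip compression of trace_path in a child process (hypothetical helper)."""
    # Popen returns immediately, so the caller can keep working while
    # the child process writes <trace_path>.gz next to the original file.
    return subprocess.Popen([sys.executable, "-m", "gzip", trace_path])


# Demo: write a tiny placeholder trace, compress it, then read it back.
tmp_dir = tempfile.mkdtemp()
trace_path = os.path.join(tmp_dir, "stage_0_llm_1234567890.json")
with open(trace_path, "w") as f:
    f.write('{"traceEvents": []}')

proc = compress_in_background(trace_path)
proc.wait()  # only the demo waits; a real caller could defer or skip this

compressed_path = trace_path + ".gz"
with gzip.open(compressed_path, "rt") as f:
    restored = f.read()
```

A caller that wants the compressed path right away can return `trace_path + ".gz"` without waiting, at the cost of the file possibly being incomplete when it is first read.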
Subclasses can override hook methods for platform-specific behavior:
- _get_default_activities(): Return default activities for the platform
- _create_profiler(): Create platform-specific profiler instance
- _on_stop_hook(): Handle platform-specific post-stop logic
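The hook methods listed above follow a template-method pattern. The sketch below uses stand-in classes (`BaseProfilerSketch` and `GpuProfilerSketch` are hypothetical and do not import vllm_omni) to show how a subclass might override only the hooks while inheriting the start/stop flow. The constructor, `start()`, `stop()`, and the `events` list are assumptions made for illustration, not the real class layout.

```python
class BaseProfilerSketch:
    """Hypothetical stand-in for the base wrapper; not the real class."""

    def __init__(self, activities=None):
        # Fall back to the platform default when no activities are given.
        self.activities = activities or self._get_default_activities()
        self.profiler = None
        self.events = []

    # Hook methods that subclasses may override
    def _get_default_activities(self):
        return ["CPU"]

    def _create_profiler(self):
        # A real implementation would build a profiler object here;
        # the sketch returns a plain dict so it runs without torch.
        return {"activities": list(self.activities)}

    def _on_stop_hook(self):
        pass

    # Shared, platform-agnostic flow
    def start(self):
        self.profiler = self._create_profiler()
        self.events.append("start")

    def stop(self):
        self.events.append("stop")
        self._on_stop_hook()


class GpuProfilerSketch(BaseProfilerSketch):
    """Hypothetical platform subclass that overrides two hooks."""

    def _get_default_activities(self):
        return ["CPU", "CUDA"]

    def _on_stop_hook(self):
        self.events.append("gpu_post_stop")


gpu_profiler = GpuProfilerSketch()
gpu_profiler.start()
gpu_profiler.stop()
```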
dump_cpu_time_total instance-attribute ¶
dump_cpu_time_total = (
"CPU" in activities and len(activities) == 1
)
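The expression above is true only when CPU is the sole requested activity. A small sketch makes the cases explicit; the function name `should_dump_cpu_time_total` is hypothetical and simply wraps the same expression.

```python
def should_dump_cpu_time_total(activities: list[str]) -> bool:
    # Same condition as the attribute: CPU must be present and be the only activity.
    return "CPU" in activities and len(activities) == 1


cpu_only = should_dump_cpu_time_total(["CPU"])
cpu_and_cuda = should_dump_cpu_time_total(["CPU", "CUDA"])
cuda_only = should_dump_cpu_time_total(["CUDA"])
```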
set_trace_filename ¶
set_trace_filename(filename: str) -> None
Set the trace filename before starting profiling.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| filename | str | Base filename without extension or rank suffix, e.g. "stage_0_llm_1234567890". Can also be a full path (e.g. from the diffusion engine). | required |
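To illustrate the two accepted forms of `filename` (a bare base name or a full path), here is a minimal sketch of how a final trace path could be derived. The helper `resolve_trace_path`, the `_rank{N}` suffix, the `.json` extension, and the `default_dir` parameter are all assumptions for illustration; the actual naming scheme used by the wrapper is not documented in this section.

```python
import os


def resolve_trace_path(filename: str, local_rank: int, default_dir: str = ".") -> str:
    """Hypothetical helper: append a rank suffix and extension to a base filename."""
    directory, base = os.path.split(filename)
    if not directory:
        # Bare base name: place the trace in the default directory.
        directory = default_dir
    # Assumed naming scheme; the real suffix and extension may differ.
    return os.path.join(directory, f"{base}_rank{local_rank}.json")


from_base_name = resolve_trace_path("stage_0_llm_1234567890", 0, "/tmp/traces")
from_full_path = resolve_trace_path("/data/prof/stage_1_diffusion", 2)
```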
create_omni_profiler ¶
create_omni_profiler(
profiler_config: ProfilerConfig,
worker_name: str,
local_rank: int,
activities: list[TorchProfilerActivity] | None = None,
) -> OmniTorchProfilerWrapper
Factory function that creates a platform-specific profiler.
Uses the current platform's get_profiler_cls() to determine which profiler class to instantiate.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| profiler_config | ProfilerConfig | Profiler configuration. | required |
| worker_name | str | Name of the worker. | required |
| local_rank | int | Local rank of the worker. | required |
| activities | list[TorchProfilerActivity] \| None | Optional list of profiler activities. | None |
Returns:
| Type | Description |
|---|---|
| OmniTorchProfilerWrapper | Platform-specific profiler instance. |
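The dispatch described above (asking the current platform for its profiler class, then instantiating it) can be sketched with stand-in types. Everything in the block is hypothetical: `ProfilerConfigSketch` and its `trace_dir` field, the two profiler classes, `CudaPlatformSketch`, and `create_profiler_sketch` only mirror the documented signature and do not import vllm_omni. The explicit `platform` argument is also an assumption; the real factory looks up the current platform itself.

```python
from dataclasses import dataclass


@dataclass
class ProfilerConfigSketch:
    trace_dir: str = "./traces"  # hypothetical field, for illustration only


class FactoryDemoProfiler:
    """Hypothetical base profiler used only by this sketch."""

    default_activities = ["CPU"]

    def __init__(self, profiler_config, worker_name, local_rank, activities=None):
        self.profiler_config = profiler_config
        self.worker_name = worker_name
        self.local_rank = local_rank
        # Use the class-level default when the caller passes None.
        self.activities = activities or list(self.default_activities)


class FactoryDemoCudaProfiler(FactoryDemoProfiler):
    default_activities = ["CPU", "CUDA"]


class CudaPlatformSketch:
    """Hypothetical platform object exposing get_profiler_cls()."""

    @staticmethod
    def get_profiler_cls():
        return FactoryDemoCudaProfiler


def create_profiler_sketch(platform, profiler_config, worker_name, local_rank,
                           activities=None):
    # The platform decides which class to build; the factory only forwards arguments.
    profiler_cls = platform.get_profiler_cls()
    return profiler_cls(profiler_config, worker_name, local_rank, activities)


demo_profiler = create_profiler_sketch(
    CudaPlatformSketch, ProfilerConfigSketch(), worker_name="worker_0", local_rank=0
)
```

Keeping the class lookup on the platform object means adding a new backend only requires a new platform class, not a change to the factory.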