vllm.model_executor.warmup
Modules:
- deep_gemm_warmup – Warm up deep_gemm kernels.
- deepseek_v4_mhc_warmup – Warm up DeepSeek V4 mHC TileLang kernels before serving requests.
- flashinfer_autotune_cache – FlashInfer autotune cache helpers.
- flashinfer_sparse_mla_warmup – Warmup and autotune helpers for FlashInfer sparse MLA backends.
- kernel_warmup – Warm up kernels used during model execution.
- qwen_triton_warmup – Warm up Qwen Triton kernels from the loaded model's compile keys.
- sparse_mla_triton_warmup – Warm up sparse-MLA Triton metadata kernels.
- v1_block_table_warmup – Warm up v1 block-table Triton kernels.