vllm.models.deepseek_v4.xpu
Modules:
- model
- mtp – MTP draft model for DeepSeek V4 (internal codename: DeepseekV4).
- xpu_qnorm_rope_kv_fp8_insert – XPU Triton replacement for fused_deepseek_v4_qnorm_rope_kv_rope_quant_insert.
- xpu_sparse – XPU DeepSeek-V4 attention subclass.
- xpu_sparse_decode_fp8 – XPU sparse decode for DeepSeek V4 with FP8 KV cache.