Model Support

Contents

Model Support#

Get the newest info here: https://github.com/vllm-project/vllm-ascend/issues/1608

Text-only Language Models#

Generative Models#

Model	Supported	Note
DeepSeek v3	✅
DeepSeek R1	✅
DeepSeek Distill (Qwen/LLama)	✅
Qwen3	✅
Qwen3-based	✅
Qwen3-Coder	✅
Qwen3-Moe	✅
Qwen2.5	✅
Qwen2	✅
Qwen2-based	✅
QwQ-32B	✅
LLama2/3/3.1	✅
Internlm	✅	#1962
Baichuan	✅
Baichuan2	✅
Phi-4-mini	✅
MiniCPM	✅
MiniCPM3	✅
Ernie4.5	✅
Ernie4.5-Moe	✅
Gemma-2	✅
Gemma-3	✅
Phi-3/4	✅
Mistral/Mistral-Instruct	✅
GLM-4.5	✅
GLM-4	❌	#2255
GLM-4-0414	❌	#2258
ChatGLM	❌	#554
DeepSeek v2.5	🟡	Need test
Mllama	🟡	Need test
MiniMax-Text	🟡	Need test

Pooling Models#

Model	Supported	Note
Qwen3-Embedding	✅
Molmo	✅	1942
XLM-RoBERTa-based	❌	1960

Multimodal Language Models#

Generative Models#

Model	Supported	Note
Qwen2-VL	✅
Qwen2.5-VL	✅
Qwen2.5-Omni	✅	1760
QVQ	✅
LLaVA 1.5/1.6	✅	1962
InternVL2	✅
InternVL2.5	✅
Qwen2-Audio	✅
Aria	✅
LLaVA-Next	✅
LLaVA-Next-Video	✅
MiniCPM-V	✅
Mistral3	✅
Phi-3-Vison/Phi-3.5-Vison	✅
Gemma3	✅
LLama4	❌	1972
LLama3.2	❌	1972
Keye-VL-8B-Preview	❌	1963
Florence-2	❌	2259
GLM-4V	❌	2260
InternVL2.0/2.5/3.0 InternVideo2.5/Mono-InternVL	❌	2064
Whisper	❌	2262
Ultravox	🟡 Need test