Skip to main content
Back to top
Ctrl
+
K
Getting Started
Quickstart
Installation
Tutorials
Single NPU (Qwen3 8B)
Single NPU (Qwen2.5-VL 7B)
Single NPU (Qwen2-Audio 7B)
Single NPU (Qwen3-Embedding-8B)
Single-NPU (Qwen3 8B W4A8)
Multi-NPU (QwQ 32B)
Multi-NPU (Pangu Pro MoE)
Multi-NPU (Qwen3-30B-A3B)
Multi-NPU (QwQ 32B W8A8)
Single Node (Atlas 300I series)
Multi-Node-DP (DeepSeek)
Multi-Node-DP (Kimi-K2)
FAQs
User Guide
Features and models
Model Support
Feature Support
Configuration Guide
Environment Variables
Additional Configuration
Feature Guide
Graph Mode Guide
Quantization Guide
Sleep Mode Guide
Structured Output Guide
LoRA Adapters Guide
Release note
Developer Guide
Contributing
Testing
Feature Guide
Patch in vLLM Ascend
Accuracy
Using EvalScope
Using lm-eval
Using OpenCompass
Accuracy Report
Performance
Performance Benchmark
Profile Execute Duration
Optimization and Tuning
Modeling
Adding a New Model
Adding a New Multi-Modal Model
Community
Governance
Maintainers and contributors
Versioning policy
User Stories
LLaMA-Factory
Index