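Testing#
This document explains how to write unit tests, end-to-end (E2E) tests, and nightly tests to verify the features you implement.
Setting up a test environment#
The fastest way to set up a test environment is to use the container image of the main branch:
To run unit tests on CPU, follow these steps: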
cd ~/vllm-project/
# ls
# vllm vllm-ascend
# Use mirror to speed up download
# docker pull m.daocloud.io/quay.io/ascend/cann:9.0.0-910b-ubuntu22.04-py3.12
export IMAGE=quay.io/ascend/cann:9.0.0-910b-ubuntu22.04-py3.12
docker run --rm --name vllm-ascend-ut \
-v $(pwd):/vllm-project \
-v ~/.cache:/root/.cache \
-ti $IMAGE bash
# (Optional) Configure mirror to speed up download
sed -i 's|ports.ubuntu.com|mirrors.huaweicloud.com|g' /etc/apt/sources.list
pip config set global.index-url https://mirrors.huaweicloud.com/repository/pypi/simple/
# For torch-npu dev version or x86 machine
export PIP_EXTRA_INDEX_URL="https://download.pytorch.org/whl/cpu/ https://mirrors.huaweicloud.com/ascend/repos/pypi"
# src path
export SRC_WORKSPACE=/vllm-workspace
mkdir -p $SRC_WORKSPACE
cd $SRC_WORKSPACE
apt-get update -y
apt-get install -y python3-pip git vim wget net-tools gcc g++ cmake libnuma-dev curl gnupg2
git clone -b v0.20.2rc1 --depth 1 https://github.com/vllm-project/vllm-ascend.git
git clone --depth 1 https://github.com/vllm-project/vllm.git
# vllm
cd $SRC_WORKSPACE/vllm
VLLM_TARGET_DEVICE=empty python3 -m pip install .
python3 -m pip uninstall -y triton
# vllm-ascend
cd $SRC_WORKSPACE/vllm-ascend
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/Ascend/ascend-toolkit/latest/$(uname -m)-linux/devlib
# For cpu environment, set SOC_VERSION for different chips.
# See https://github.com/vllm-project/vllm-ascend/blob/3cb0af0bcf3299089ca7e72159fa36e825a470f8/setup.py#L132 for detail.
export SOC_VERSION="ascend910b1"
python3 -m pip install .
python3 -m pip install -r requirements-dev.txt
For single-card tests, start the container with one NPU device:
# Update DEVICE according to your device (/dev/davinci[0-7])
export DEVICE=/dev/davinci0
# Update the vllm-ascend image
export IMAGE=quay.io/ascend/vllm-ascend:main
docker run --rm \
--name vllm-ascend \
--shm-size=1g \
--device $DEVICE \
--device /dev/davinci_manager \
--device /dev/devmm_svm \
--device /dev/hisi_hdc \
-v /usr/local/dcmi:/usr/local/dcmi \
-v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
-v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
-v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
-v /etc/ascend_install.info:/etc/ascend_install.info \
-v /root/.cache:/root/.cache \
-p 8000:8000 \
-it $IMAGE bash
After starting the container, install the required packages:
# Prepare
pip config set global.index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
# Switch to the /vllm-workspace/vllm-ascend directory
cd /vllm-workspace/vllm-ascend/
# Install required packages
pip install -r requirements-dev.txt
For multi-card tests, start the container with several NPU devices:
# Update the vllm-ascend image
export IMAGE=quay.io/ascend/vllm-ascend:main
docker run --rm \
--name vllm-ascend \
--shm-size=1g \
--device /dev/davinci0 \
--device /dev/davinci1 \
--device /dev/davinci2 \
--device /dev/davinci3 \
--device /dev/davinci_manager \
--device /dev/devmm_svm \
--device /dev/hisi_hdc \
-v /usr/local/dcmi:/usr/local/dcmi \
-v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
-v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
-v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
-v /etc/ascend_install.info:/etc/ascend_install.info \
-v /root/.cache:/root/.cache \
-p 8000:8000 \
-it $IMAGE bash
After starting the container, install the required packages:
cd /vllm-workspace/vllm-ascend/
# Prepare
pip config set global.index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
# Install required packages
pip install -r requirements-dev.txt
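Running tests#
Unit tests#
Follow these principles when writing unit tests:
- The test file path should mirror the source file path, with the file name prefixed by test_, for example: vllm_ascend/worker/worker.py → tests/ut/worker/test_worker.py
- vLLM Ascend tests use the unittest framework. See the Python unittest documentation to learn how to write unit tests.
- All unit tests can run on CPU, so device-related functions must be mocked on the host side.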
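As a minimal sketch of the last principle, the test below mocks a device query on the host so that it passes on a machine without an NPU. The names `DeviceInfo`, `device_count`, and `pick_tensor_parallel_size` are hypothetical and exist only for this illustration; they are not part of vllm-ascend.

```python
import unittest
from unittest import mock


class DeviceInfo:
    """Hypothetical stand-in for code that queries the NPU."""

    @staticmethod
    def device_count() -> int:
        # On a CPU-only host the real query would fail.
        raise RuntimeError("no NPU available on this host")


def pick_tensor_parallel_size(requested: int) -> int:
    """Clamp the requested size to the number of visible devices."""
    return min(requested, DeviceInfo.device_count())


class TestPickTensorParallelSize(unittest.TestCase):

    @mock.patch.object(DeviceInfo, "device_count", return_value=2)
    def test_clamps_to_visible_devices(self, mock_count):
        # The device query is replaced by a mock that reports two devices.
        self.assertEqual(pick_tensor_parallel_size(4), 2)
        mock_count.assert_called_once()

    @mock.patch.object(DeviceInfo, "device_count", return_value=8)
    def test_keeps_smaller_request(self, mock_count):
        self.assertEqual(pick_tensor_parallel_size(4), 4)
```

Because pytest also collects `unittest.TestCase` classes, a file like this can be run with the same `pytest -sv` commands used for the rest of the unit tests.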
You can run the unit tests with pytest:
# Run unit tests
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/Ascend/ascend-toolkit/latest/$(uname -m)-linux/devlib
TORCH_DEVICE_BACKEND_AUTOLOAD=0 pytest -sv tests/ut
cd /vllm-workspace/vllm-ascend/
# Run all single-card tests
pytest -sv tests/ut
# Run single test
pytest -sv tests/ut/test_ascend_config.py
cd /vllm-workspace/vllm-ascend/
# Run all multi-card tests
pytest -sv tests/ut
# Run single test
pytest -sv tests/ut/test_ascend_config.py
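End-to-end (E2E) tests#
Although the vllm-ascend CI already runs E2E tests on Ascend CI (for example, schedule_nightly_test_a2.yaml, schedule_nightly_test_a3.yaml, and pr_test_full.yaml), you can also run them locally.
PR-triggered E2E tests#
You can run these tests with pytest. Typical examples follow.
Note: E2E tests cannot run on CPU.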
cd /vllm-workspace/vllm-ascend/
# Run all single-card tests
VLLM_USE_MODELSCOPE=true pytest -sv tests/e2e/pull_request/full/one_card/
# Run a certain test script
VLLM_USE_MODELSCOPE=true pytest -sv tests/e2e/pull_request/full/one_card/test_camem.py
# Run a certain case in test script
VLLM_USE_MODELSCOPE=true pytest -sv tests/e2e/pull_request/full/one_card/test_camem.py::test_end_to_end
cd /vllm-workspace/vllm-ascend/
# Run all multi-card tests
VLLM_USE_MODELSCOPE=true pytest -sv tests/e2e/pull_request/full/two_cards/
# Run a certain test script
VLLM_USE_MODELSCOPE=true pytest -sv tests/e2e/pull_request/full/two_cards/test_qwen3_moe.py
# Run a certain case in test script
VLLM_USE_MODELSCOPE=true pytest -sv tests/e2e/pull_request/full/two_cards/test_qwen3_moe.py::test_qwen3_moe_distributed_mp_tp2_ep
This reproduces the behavior of the E2E tests.
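The commands above select a directory, a script, or a single case by passing a pytest node ID of the form `<file>::<function>`. The following sketch shows one way to assemble the same invocation from Python. The helper names `build_pytest_invocation` and `run_e2e` are hypothetical and are not provided by vllm-ascend.

```python
import os
import subprocess
import sys
from typing import Optional


def build_pytest_invocation(script: str, case: Optional[str] = None):
    """Return (argv, env) for running one E2E path or a single case in it."""
    # pytest addresses a single test function as "<file>::<function>".
    node_id = f"{script}::{case}" if case else script
    argv = [sys.executable, "-m", "pytest", "-sv", node_id]
    # Copy the current environment and add the variable used in the commands above.
    env = dict(os.environ, VLLM_USE_MODELSCOPE="true")
    return argv, env


def run_e2e(script: str, case: Optional[str] = None) -> int:
    """Run pytest in a subprocess and return its exit code."""
    argv, env = build_pytest_invocation(script, case)
    return subprocess.run(argv, env=env, check=False).returncode
```

For example, `run_e2e("tests/e2e/pull_request/full/one_card/test_camem.py", "test_end_to_end")` would mirror the last single-card command, assuming it is called from the repository root on a machine with an NPU.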
Nightly-triggered E2E tests#
You can run these tests with pytest. Typical examples follow.
Note: E2E tests cannot run on CPU.
cd /vllm-workspace/vllm-ascend/
# run all single-card op tests
VLLM_USE_MODELSCOPE=true pytest -sv tests/e2e/nightly/single_node/ops/singlecard_ops/
cd /vllm-workspace/vllm-ascend/
# run all multi-card op tests on A2
VLLM_USE_MODELSCOPE=true pytest -sv tests/e2e/nightly/single_node/ops/multicard_ops_a2/
# run all multi-card op tests on A3
VLLM_USE_MODELSCOPE=true pytest -sv tests/e2e/nightly/single_node/ops/multicard_ops_a3/
To run the nightly single-node model test cases locally, refer to the following example.
export CONFIG_YAML_PATH=Qwen3-32B.yaml
VLLM_USE_MODELSCOPE=true pytest -sv tests/e2e/nightly/single_node/models/scripts/test_single_node.py
To run the nightly multi-node model test cases locally, see the "Running locally" section of the multi-node test documentation.
E2E test examples#
- Offline test example: tests/e2e/pull_request/full/one_card/test_camem.py
- Online test example: tests/e2e/pull_request/full/two_cards/test_single_request_aclgraph.py
- Correctness test example: tests/e2e/pull_request/full/one_card/test_aclgraph_accuracy.py
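Because CI resources are limited, you may need to reduce the number of layers in the model. The following example shows how to generate a model with fewer layers:
1. Fork the original model repository on ModelScope. Keep every file in the repository except the weight files.
2. Set num_hidden_layers in the configuration to the desired number of layers, for example: {"num_hidden_layers": 2}
3. Save the following Python script as generate_random_weight.py, and set MODEL_LOCAL_PATH, DIST_DTYPE, and DIST_MODEL_PATH as needed:
import os
import torch
from transformers import AutoConfig
# modeling_deepseek.py is assumed to come from the model repository and must be importable
from modeling_deepseek import DeepseekV3ForCausalLM

# Local copy of the forked repository (without weight files); "~" is expanded explicitly
MODEL_LOCAL_PATH = os.path.expanduser("~/.cache/modelscope/models/vllm-ascend/DeepSeek-V3-Pruning")
# Data type of the generated weights
DIST_DTYPE = torch.bfloat16
# Output directory for the randomly initialized model
DIST_MODEL_PATH = "./random_deepseek_v3_with_2_hidden_layer"

# Build the model from the edited config; its weights are randomly initialized
config = AutoConfig.from_pretrained(MODEL_LOCAL_PATH, trust_remote_code=True)
model = DeepseekV3ForCausalLM(config)
model = model.to(DIST_DTYPE)
model.save_pretrained(DIST_MODEL_PATH)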
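The configuration edit in the second step can also be scripted. The sketch below assumes the configuration is a JSON file (such as a config.json in the forked repository); the helper name `set_num_hidden_layers` is hypothetical and is not part of vllm-ascend.

```python
import json
from pathlib import Path


def set_num_hidden_layers(config_path: str, num_layers: int) -> dict:
    """Rewrite num_hidden_layers in a JSON model config and return the new config."""
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    path = Path(config_path).expanduser()
    config = json.loads(path.read_text(encoding="utf-8"))
    # Only this key is changed; all other settings are written back untouched.
    config["num_hidden_layers"] = num_layers
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling `set_num_hidden_layers("path/to/config.json", 2)` before running the generation script would produce a two-layer configuration, assuming the path points at the forked model's config file.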
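Running doctests#
vllm-ascend provides the vllm-ascend/tests/e2e/run_doctests.sh script, which runs the doctests in all documentation files. Doctests are an effective way to keep the documentation up to date and its example code runnable. You can run them locally as follows: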
# Run doctest
/vllm-workspace/vllm-ascend/tests/e2e/run_doctests.sh
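This reproduces the same test environment as CI. See labeled_doctest.yaml.
Running the documentation link check#
You can verify the external links in the Sphinx documentation locally as follows: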
make -C docs linkcheck SPHINXOPTS="-W --keep-going"
To check the links in a specific Markdown file, pass that file to sphinx-build. For example, to check only docs/source/user_guide/release_notes.md:
cd docs
sphinx-build -b linkcheck -W --keep-going \
source _build/linkcheck source/user_guide/release_notes.md
The detailed reports are written to:
- docs/_build/linkcheck/output.txt
- docs/_build/linkcheck/output.json
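To pull only the failures out of the JSON report, a small filter such as the following may help. It is a sketch that assumes output.json contains one JSON object per line with `status`, `uri`, `filename`, and `lineno` fields; verify this against the report produced by your Sphinx version. The function name `broken_links` is hypothetical.

```python
import json
from typing import Dict, Iterable, List


def broken_links(lines: Iterable[str]) -> List[Dict]:
    """Return the report entries whose status is "broken"."""
    results = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        if entry.get("status") == "broken":
            results.append(entry)
    return results
```

A typical use would be to open docs/_build/linkcheck/output.json, pass the file object to `broken_links`, and print the `filename`, `lineno`, and `uri` of each returned entry.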