Testing

This section explains how to write end-to-end (e2e) tests and unit tests to verify your feature implementation.

Setting up the test environment

The fastest way to set up a test environment is to use the container image built from the main branch:

On a machine without an NPU, you can run the unit tests on CPU with the following steps:

cd ~/vllm-project/
# ls
# vllm  vllm-ascend

# Use a mirror to speed up the download
# docker pull quay.nju.edu.cn/ascend/cann:8.2.rc1-910b-ubuntu22.04-py3.11
export IMAGE=quay.io/ascend/cann:8.2.rc1-910b-ubuntu22.04-py3.11
docker run --rm --name vllm-ascend-ut \
    -v $(pwd):/vllm-project \
    -v ~/.cache:/root/.cache \
    -ti $IMAGE bash

# (Optional) Configure mirrors to speed up downloads
sed -i 's|ports.ubuntu.com|mirrors.huaweicloud.com|g' /etc/apt/sources.list
pip config set global.index-url https://mirrors.huaweicloud.com/repository/pypi/simple/

# For torch-npu dev version or x86 machine
export PIP_EXTRA_INDEX_URL="https://download.pytorch.org/whl/cpu/ https://mirrors.huaweicloud.com/ascend/repos/pypi"

apt-get update -y
apt-get install -y python3-pip git vim wget net-tools gcc g++ cmake libnuma-dev curl gnupg2

# Install vllm
cd /vllm-project/vllm
VLLM_TARGET_DEVICE=empty python3 -m pip -v install .

# Install vllm-ascend
cd /vllm-project/vllm-ascend
# [IMPORTANT] Import LD_LIBRARY_PATH to enumerate the CANN environment under CPU
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/Ascend/ascend-toolkit/latest/$(uname -m)-linux/devlib
python3 -m pip install -r requirements-dev.txt
python3 -m pip install -v .

To run tests on a single NPU card, start the container as follows:

# Update DEVICE according to your device (/dev/davinci[0-7])
export DEVICE=/dev/davinci0
# Update the vllm-ascend image
export IMAGE=quay.io/ascend/vllm-ascend:main
docker run --rm \
    --name vllm-ascend \
    --device $DEVICE \
    --device /dev/davinci_manager \
    --device /dev/devmm_svm \
    --device /dev/hisi_hdc \
    -v /usr/local/dcmi:/usr/local/dcmi \
    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
    -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
    -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
    -v /etc/ascend_install.info:/etc/ascend_install.info \
    -v /root/.cache:/root/.cache \
    -p 8000:8000 \
    -it $IMAGE bash

After starting the container, install the required packages:

# Prepare
pip config set global.index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple

# Install required packages
pip install -r requirements-dev.txt

To run tests on multiple NPU cards, start the container as follows:

# Update the vllm-ascend image
export IMAGE=quay.io/ascend/vllm-ascend:main
docker run --rm \
    --name vllm-ascend \
    --device /dev/davinci0 \
    --device /dev/davinci1 \
    --device /dev/davinci2 \
    --device /dev/davinci3 \
    --device /dev/davinci_manager \
    --device /dev/devmm_svm \
    --device /dev/hisi_hdc \
    -v /usr/local/dcmi:/usr/local/dcmi \
    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
    -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
    -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
    -v /etc/ascend_install.info:/etc/ascend_install.info \
    -v /root/.cache:/root/.cache \
    -p 8000:8000 \
    -it $IMAGE bash

After starting the container, install the required packages:

cd /vllm-workspace/vllm-ascend/

# Prepare
pip config set global.index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple

# Install required packages
pip install -r requirements-dev.txt
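
The docker commands above pass each NPU to the container with a separate `--device` flag. As a minimal sketch of how those flags could be derived from the host, the helper below lists the `/dev/davinciN` nodes and builds the matching arguments. The function names `find_davinci_devices` and `docker_device_flags` are hypothetical and not part of vllm-ascend; the only assumption taken from the text is the `/dev/davinci[0-7]` naming.

```python
import glob
import os
import re


def find_davinci_devices(dev_dir="/dev"):
    """Return davinci device nodes (e.g. /dev/davinci0) sorted by index."""
    pattern = re.compile(r"davinci(\d+)$")
    found = []
    for path in glob.glob(os.path.join(dev_dir, "davinci*")):
        match = pattern.search(os.path.basename(path))
        if match:  # non-numbered nodes such as davinci_manager are skipped
            found.append((int(match.group(1)), path))
    return [path for _, path in sorted(found)]


def docker_device_flags(devices):
    """Build the repeated `--device` arguments for `docker run`."""
    flags = []
    for dev in devices:
        flags += ["--device", dev]
    return flags


if __name__ == "__main__":
    # Print the flags for every NPU found on this host (hypothetical usage)
    print(" ".join(docker_device_flags(find_davinci_devices())))
```

The management nodes (`/dev/davinci_manager`, `/dev/devmm_svm`, `/dev/hisi_hdc`) still have to be passed explicitly, as in the commands above.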

Running tests

Unit tests

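Follow these principles when writing unit tests:

  • The test file path should mirror the source file path, and the file name should start with the test_ prefix, for example: vllm_ascend/worker/worker_v1.py --> tests/ut/worker/test_worker_v1.py

  • vLLM Ascend tests use the unittest framework; see the Python unittest documentation to learn how to write unit tests.

  • All unit tests can run on CPU, so you must mock device-related functions so that they run on the host.

  • Example: tests/ut/test_ascend_config.py

  • You can run the unit tests with pytest: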

# Run unit tests on CPU
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/Ascend/ascend-toolkit/latest/$(uname -m)-linux/devlib
TORCH_DEVICE_BACKEND_AUTOLOAD=0 pytest -sv tests/ut

# On an NPU machine (single card or multiple cards)
cd /vllm-workspace/vllm-ascend/
# Run all unit tests
pytest -sv tests/ut

# Run a single test file
pytest -sv tests/ut/test_ascend_config.py
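
The principles above ask for unittest-based tests that mock anything device-related so they can run on a CPU host. The sketch below illustrates that pattern only. `FakeDeviceApi` and `pick_tensor_parallel_size` are hypothetical stand-ins invented for this example; they are not vllm-ascend or torch-npu APIs. In a real test you would patch the actual device call used by the module under test.

```python
import unittest
from unittest import mock


class FakeDeviceApi:
    """Hypothetical stand-in for a device-bound module (not a real API)."""

    @staticmethod
    def device_count():
        # On a CPU-only host the real call would fail, so tests must mock it
        raise RuntimeError("no NPU available on this host")


def pick_tensor_parallel_size(requested):
    """Hypothetical function under test: clamp the request to available devices."""
    return min(requested, FakeDeviceApi.device_count())


class TestPickTensorParallelSize(unittest.TestCase):
    @mock.patch.object(FakeDeviceApi, "device_count", return_value=2)
    def test_clamps_to_available_devices(self, mock_count):
        self.assertEqual(pick_tensor_parallel_size(4), 2)
        mock_count.assert_called_once()

    @mock.patch.object(FakeDeviceApi, "device_count", return_value=8)
    def test_keeps_request_when_enough_devices(self, _mock_count):
        self.assertEqual(pick_tensor_parallel_size(4), 4)
```

Saved under tests/ut/ with a test_ prefix, a file like this is collected by the `pytest -sv tests/ut` command shown above, because pytest also runs unittest.TestCase classes.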

End-to-end tests

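Although the e2e tests already run in vllm-ascend CI on Ascend CI, you can also run them locally.

E2E tests cannot run on CPU. The commands below cover single-card runs first, then multi-card runs.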

cd /vllm-workspace/vllm-ascend/
# Run all single-card tests
VLLM_USE_MODELSCOPE=true pytest -sv tests/e2e/singlecard/

# Run a certain test script
VLLM_USE_MODELSCOPE=true pytest -sv tests/e2e/singlecard/test_offline_inference.py

# Run a certain case in test script
VLLM_USE_MODELSCOPE=true pytest -sv tests/e2e/singlecard/test_offline_inference.py::test_models

cd /vllm-workspace/vllm-ascend/
# Run all multi-card tests
VLLM_USE_MODELSCOPE=true pytest -sv tests/e2e/multicard/

# Run a certain test script
VLLM_USE_MODELSCOPE=true pytest -sv tests/e2e/multicard/test_dynamic_npugraph_batchsize.py

# Run a certain case in test script
VLLM_USE_MODELSCOPE=true pytest -sv tests/e2e/multicard/test_offline_inference.py::test_models

This reproduces the e2e tests run by CI; see vllm_ascend_test.yaml.
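
The e2e commands above all share one shape: the VLLM_USE_MODELSCOPE=true environment variable, `pytest -sv`, and a target that narrows from a directory to a script to a single case using the pytest `file::function` node ID syntax. As a rough sketch, a small wrapper could build these invocations. The function name `e2e_pytest_command` is hypothetical and not part of the repository.

```python
import os


def e2e_pytest_command(card_mode, script=None, case=None):
    """Return (env, argv) for one of the e2e invocations shown above.

    card_mode: "singlecard" or "multicard" (directories under tests/e2e/).
    script:    optional test file name inside that directory.
    case:      optional test function name; requires `script`.
    """
    if card_mode not in ("singlecard", "multicard"):
        raise ValueError(f"unknown card mode: {card_mode!r}")
    if case and not script:
        raise ValueError("a test case can only be selected inside a script")

    target = f"tests/e2e/{card_mode}/"
    if script:
        target += script
    if case:
        target += f"::{case}"  # pytest node ID: file::function

    # Copy the current environment and add the ModelScope switch
    env = dict(os.environ, VLLM_USE_MODELSCOPE="true")
    return env, ["pytest", "-sv", target]
```

The result can be passed to `subprocess.run(argv, env=env, cwd="/vllm-workspace/vllm-ascend/", check=True)` inside the container.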

E2E test examples

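  • Offline test example: tests/e2e/singlecard/test_offline_inference.py

  • Online test example: tests/e2e/singlecard/test_prompt_embedding.py

  • Correctness test example: tests/e2e/singlecard/test_aclgraph.py

  • Reduced-layer model test example: test_torchair_graph_mode.py - DeepSeek-V3-Pruning

    CI resources are limited, so you may need to reduce the number of layers in the model. The steps below show how to generate a reduced-layer model:

    1. Fork the original model repository on ModelScope. All files in the repository are needed except the weight files.

    2. Set num_hidden_layers to the desired number of layers, for example {"num_hidden_layers": 2,}.

    3. Copy the following Python script to generate_random_weight.py and set the parameters MODEL_LOCAL_PATH, DIST_DTYPE, and DIST_MODEL_PATH as needed: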

      import os

      import torch
      from transformers import AutoConfig
      # Assumes modeling_deepseek.py from the model repository is importable
      from modeling_deepseek import DeepseekV3ForCausalLM

      # Expand "~" so the path is resolved as a local directory
      MODEL_LOCAL_PATH = os.path.expanduser("~/.cache/modelscope/models/vllm-ascend/DeepSeek-V3-Pruning")
      DIST_DTYPE = torch.bfloat16
      DIST_MODEL_PATH = "./random_deepseek_v3_with_2_hidden_layer"

      # Build the model from the config only, so its weights are randomly initialized
      config = AutoConfig.from_pretrained(MODEL_LOCAL_PATH, trust_remote_code=True)
      model = DeepseekV3ForCausalLM(config)
      model = model.to(DIST_DTYPE)
      model.save_pretrained(DIST_MODEL_PATH)
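
Step 2 above can also be scripted rather than edited by hand. The sketch below assumes that num_hidden_layers lives in the model's config.json, as is usual for Hugging Face style model repositories; the text does not name the file, so treat that as an assumption. The helper name `set_num_hidden_layers` is hypothetical.

```python
import json
from pathlib import Path


def set_num_hidden_layers(config_path, num_layers):
    """Rewrite `num_hidden_layers` in a config.json and return the old value."""
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    path = Path(config_path).expanduser()
    config = json.loads(path.read_text(encoding="utf-8"))
    previous = config.get("num_hidden_layers")
    config["num_hidden_layers"] = num_layers
    # Keep every other key untouched; only the layer count changes
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return previous
```

For example, `set_num_hidden_layers("<model dir>/config.json", 2)` would prepare the configuration that the script above loads through AutoConfig.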
      

Running doctests

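vllm-ascend provides the vllm-ascend/tests/e2e/run_doctests.sh script to run all doctests in the doc files. Doctests are a good way to make sure the docs stay up to date and the examples remain executable. You can run them locally as follows: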

# Run doctest
/vllm-workspace/vllm-ascend/tests/e2e/run_doctests.sh

This reproduces the same environment as the CI: vllm_ascend_doctest.yaml