Using OpenCompass
This document guides you through running accuracy tests with OpenCompass.
1. Online Serving
You can run a docker container to start the vLLM server on a single NPU:
# Update DEVICE according to your device (/dev/davinci[0-7])
export DEVICE=/dev/davinci7
# Update the vllm-ascend image
export IMAGE=quay.io/ascend/vllm-ascend:v0.9.1
docker run --rm \
  --name vllm-ascend \
  --device $DEVICE \
  --device /dev/davinci_manager \
  --device /dev/devmm_svm \
  --device /dev/hisi_hdc \
  -v /usr/local/dcmi:/usr/local/dcmi \
  -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
  -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
  -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
  -v /etc/ascend_install.info:/etc/ascend_install.info \
  -v /root/.cache:/root/.cache \
  -p 8000:8000 \
  -e VLLM_USE_MODELSCOPE=True \
  -e PYTORCH_NPU_ALLOC_CONF=max_split_size_mb:256 \
  -it $IMAGE \
  vllm serve Qwen/Qwen2.5-7B-Instruct --max_model_len 26240
If the service starts successfully, you will see output like the following:
INFO: Started server process [6873]
INFO: Waiting for application startup.
INFO: Application startup complete.
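Rather than watching the log by hand, readiness can also be checked from a script. The following is a minimal sketch, not part of vLLM or OpenCompass: the helper names `http_probe` and `wait_until_ready` are hypothetical, and it assumes the server answers `GET /v1/models` with HTTP 200 once startup is complete, on the port mapped by the `docker run` command above.

```python
import time
import urllib.error
import urllib.request


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET on `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: not ready yet.
        return False


def wait_until_ready(url, probe=http_probe, attempts=60, interval=5.0,
                     sleep=time.sleep):
    """Poll `url` until `probe` succeeds or `attempts` probes have failed."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(interval)
    return False


# Example call (assumed endpoint, requires the server to be running):
# wait_until_ready("http://localhost:8000/v1/models")
```

The `probe` and `sleep` parameters are injectable so that the polling logic can be exercised without a live server.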
Once the server is up, you can query the model with a prompt from a new terminal:
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-7B-Instruct",
    "prompt": "The future of AI is",
    "max_tokens": 7,
    "temperature": 0
  }'
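The same request can be sent from Python when `curl` is not convenient. This is a minimal sketch using only the standard library; the function names `build_completion_request` and `complete` are hypothetical, and the response handling assumes the usual OpenAI-style completions body with the generated text under `choices[0].text`.

```python
import json
import urllib.request


def build_completion_request(base_url, model, prompt, max_tokens=7, temperature=0):
    """Build a POST request mirroring the curl command above."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(request, timeout=60.0):
    """Send the request and return the generated text (needs a live server)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    # Assumed response shape: {"choices": [{"text": "..."}], ...}
    return body["choices"][0]["text"]


# Example call (requires the server started in step 1):
# req = build_completion_request(
#     "http://localhost:8000/v1", "Qwen/Qwen2.5-7B-Instruct", "The future of AI is")
# print(complete(req))
```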
2. Run ceval accuracy test using OpenCompass
Install OpenCompass in the container and configure the environment variables.
# Pin Python 3.10 due to:
# https://github.com/open-compass/opencompass/issues/1976
conda create -n opencompass python=3.10
conda activate opencompass
pip install opencompass modelscope[framework]
export DATASET_SOURCE=ModelScope
git clone https://github.com/open-compass/opencompass.git
Add opencompass/configs/eval_vllm_ascend_demo.py with the following content:
from mmengine.config import read_base
from opencompass.models import OpenAISDK

with read_base():
    from opencompass.configs.datasets.ceval.ceval_gen import ceval_datasets

# Only test ceval-computer_network dataset in this demo
datasets = ceval_datasets[:1]

api_meta_template = dict(
    round=[
        dict(role='HUMAN', api_role='HUMAN'),
        dict(role='BOT', api_role='BOT', generate=True),
    ],
    reserved_roles=[dict(role='SYSTEM', api_role='SYSTEM')],
)

models = [
    dict(
        abbr='Qwen2.5-7B-Instruct-vLLM-API',
        type=OpenAISDK,
        key='EMPTY',  # API key
        openai_api_base='http://127.0.0.1:8000/v1',
        path='Qwen/Qwen2.5-7B-Instruct',
        tokenizer_path='Qwen/Qwen2.5-7B-Instruct',
        rpm_verbose=True,
        meta_template=api_meta_template,
        query_per_second=1,
        max_out_len=1024,
        max_seq_len=4096,
        temperature=0.01,
        batch_size=8,
        retry=3,
    )
]
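The slice `ceval_datasets[:1]` picks a subset by position, which depends on the order of the imported list. Selecting by name is an alternative. The sketch below is a standalone illustration on stand-in data, not OpenCompass code: `select_by_abbr` is a hypothetical helper, and it assumes each dataset entry is a dict carrying an `abbr` key that matches the dataset name shown in the results table (for example `ceval-computer_network`).

```python
def select_by_abbr(datasets, wanted):
    """Return the entries of `datasets` whose 'abbr' is in `wanted`.

    Raises KeyError if any requested abbr is absent, so a typo does not
    silently produce an empty evaluation.
    """
    wanted = set(wanted)
    selected = [d for d in datasets if d.get("abbr") in wanted]
    missing = wanted - {d["abbr"] for d in selected}
    if missing:
        raise KeyError(f"unknown dataset abbr(s): {sorted(missing)}")
    return selected


# Stand-in entries for illustration only; real entries carry more fields.
demo_datasets = [
    {"abbr": "ceval-computer_network"},
    {"abbr": "ceval-operating_system"},
]
subset = select_by_abbr(demo_datasets, ["ceval-computer_network"])
```

Whether such a helper can live directly inside the config file has not been verified here; a plain list comprehension over `ceval_datasets` expresses the same filter without defining a function.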
Run the following command from the root of the cloned opencompass repository, where run.py is located:
python3 run.py opencompass/configs/eval_vllm_ascend_demo.py --debug
After 1-2 minutes, the output looks like the following:
The markdown format results is as below:
| dataset | version | metric | mode | Qwen2.5-7B-Instruct-vLLM-API |
|----- | ----- | ----- | ----- | -----|
| ceval-computer_network | db9ce2 | accuracy | gen | 68.42 |
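To use the score in a script, for example to compare it against a threshold, the markdown table can be parsed directly. This is a minimal sketch that assumes the output keeps the pipe-delimited layout shown above; `parse_markdown_table` is a hypothetical helper, not an OpenCompass API.

```python
def parse_markdown_table(text):
    """Parse a pipe-delimited markdown table into a list of row dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip log lines around the table
        cells = [c.strip() for c in line.strip("|").split("|")]
        if all(c and set(c) <= set("-:") for c in cells):
            continue  # skip the |-----|-----| separator row
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows


output = """\
The markdown format results is as below:
| dataset | version | metric | mode | Qwen2.5-7B-Instruct-vLLM-API |
|----- | ----- | ----- | ----- | -----|
| ceval-computer_network | db9ce2 | accuracy | gen | 68.42 |
"""
rows = parse_markdown_table(output)
accuracy = float(rows[0]["Qwen2.5-7B-Instruct-vLLM-API"])
```

The score column is keyed by the model's `abbr` from the config, so the lookup key changes if a different `abbr` is used.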
You can find more usage information in the OpenCompass documentation.