Qwen/Qwen3-8B-Base#
vLLM Version: vLLM: 0.10.1.1 (1da94e6), vLLM Ascend Version: v0.10.1rc1 (7e16b4a)
Software Environment: CANN: 8.2.RC1, PyTorch: 2.7.1, torch-npu: 2.7.1.dev20250724
Hardware Environment: Atlas A2 Series
Parallel mode: TP1
Execution mode: ACLGraph
Command:
export MODEL_ARGS='pretrained=Qwen/Qwen3-8B-Base,tensor_parallel_size=1,dtype=auto,trust_remote_code=False,max_model_len=4096'
lm_eval --model vllm --model_args $MODEL_ARGS --tasks gsm8k,ceval-valid \
--apply_chat_template True --fewshot_as_multiturn True --num_fewshot 5 --batch_size auto
Task |
Metric |
Value |
Stderr |
|---|---|---|---|
gsm8k |
exact_match,strict-match |
✅0.8271 |
± 0.0104 |
gsm8k |
exact_match,flexible-extract |
✅0.8294 |
± 0.0104 |
ceval-valid |
acc,none |
✅0.815 |
± 0.0103 |