Nightly CI Test#
This document explains how to trigger the nightly CI tests against your own PR code on Ascend NPU hardware (A2/A3) without waiting for the scheduled nightly run.
Background#
By default, nightly CI tests run on a fixed schedule using pre-built nightly images. Contributors can also trigger these tests themselves against their PR changes by combining a PR comment command with a GitHub label.
How to Trigger#
1. Post a comment#
First, post one of the following comments in the PR to specify which tests to run:
| Comment | Effect |
|---|---|
| `/nightly` | Run all nightly tests |
| `/nightly all` | Run all nightly tests (same as above) |
| `/nightly <test-name> [<test-name> ...]` | Run only the named tests |
2. Add the label#
After posting the comment, add the nightly-test label to your PR.
Adding the label is what actually triggers the workflow — at that point the workflow
reads the existing comments to find the /nightly command.
Note
Only repository Contributors (Triage role) and Maintainers (Write role) can add labels. If you do not have this permission, ask a maintainer to add the label for you. You can find the list of maintainers and contributors in the project’s Governance page or by checking the CODEOWNERS file.
Important
The comment must be posted before the label is added. If you add the label first,
the workflow will find no /nightly comment and skip all tests.
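The ordering requirement above can be sketched with the GitHub REST API, which exposes an endpoint for creating an issue comment and one for adding labels (PRs share the issues endpoints). This is a minimal illustration, not part of the project's tooling: the function name `build_trigger_requests` is hypothetical, and the owner, repository, PR number, and token are placeholders you would supply. The token needs permission to add labels, as described in the note above.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def build_trigger_requests(owner, repo, pr_number, token, tests=()):
    """Build the two POST requests in the order they must be sent.

    Hypothetical helper: the comment request comes first, the label
    request second, because adding the label is what triggers the run.
    """
    # "/nightly" alone runs everything; test names narrow the run.
    command = "/nightly" + ("" if not tests else " " + " ".join(tests))
    headers = {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
        "Content-Type": "application/json",
    }
    base = f"{API_ROOT}/repos/{owner}/{repo}/issues/{pr_number}"

    def post(path, payload):
        return urllib.request.Request(
            base + path,
            data=json.dumps(payload).encode("utf-8"),
            headers=headers,
            method="POST",
        )

    return [
        post("/comments", {"body": command}),           # step 1: the comment
        post("/labels", {"labels": ["nightly-test"]}),  # step 2: the label
    ]
```

Sending each request in list order with `urllib.request.urlopen(request)` would post the comment and then add the label. Building the requests separately from sending them makes the order easy to inspect before anything touches the PR.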
3. Wait for results#
GitHub Actions will trigger the Nightly-A2 or Nightly-A3 workflow. Only tests
matching the filter will be dispatched, which saves hardware resources.
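To make the filtering concrete, here is a rough sketch of how a `/nightly` comment could be turned into a test filter and applied to a list of matrix names. The workflow's real parsing code is not shown in this document, so the function names and details below (only the first line of the comment is read, names are separated by whitespace) are assumptions for illustration.

```python
def parse_nightly_command(comment_body):
    """Return None for non-commands, "all", or a list of test names.

    Assumed behavior: the comment must start exactly with "/nightly",
    and only its first line is considered.
    """
    if not comment_body.startswith("/nightly"):
        return None  # leading spaces or other text before the slash
    tokens = comment_body.splitlines()[0].split()
    if tokens[0] != "/nightly":
        return None  # e.g. "/nightlyfoo" is not the command
    names = tokens[1:]
    if not names or names == ["all"]:
        return "all"
    return names


def select_tests(matrix_names, test_filter):
    """Keep only the matrix entries selected by the filter (exact match)."""
    if test_filter == "all":
        return list(matrix_names)
    wanted = set(test_filter)
    return [name for name in matrix_names if name in wanted]
```

For example, `parse_nightly_command("/nightly test_custom_op qwen3-32b")` yields the two names, and passing that result to `select_tests` keeps only matrix entries with exactly those names.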
Differences Between PR and Scheduled Runs#
| | Scheduled / Manual Dispatch | PR-triggered |
|---|---|---|
| Trigger | Cron (daily) or `workflow_dispatch` | Label |
| Code tested | Pre-built nightly image | Your PR’s HEAD commit (source installed fresh) |
| Test scope | All tests | Configurable via the `/nightly` comment |
| vLLM + vllm-ascend | From image | Checked out and installed from source |
When a PR run is detected (is_pr_test: true), the workflow additionally:
- Uninstalls any existing vllm packages in the container.
- Checks out the specific vllm version and your PR’s vllm-ascend commit from source.
- Installs all dependencies from source.
- Installs the aisbench benchmark suite.
Available Test Names#
The test names you can pass to /nightly correspond to the name fields in the
workflow matrix.
A2 workflow (.github/workflows/schedule_nightly_test_a2.yaml)#
Single-node tests:
| Test name | Description |
|---|---|
| | Custom operator tests (single card) |
| | Custom operator tests (multi card) |
| | Qwen3-32B model test |
| | Qwen3-Next-80B-A3B-Instruct model test |
| | Qwen3-32B INT8 quantization test |
| | Accuracy tests: Qwen3-VL-8B, Qwen3-8B, Qwen2-Audio-7B, etc. |
| | Accuracy tests: ERNIE-4.5, InternVL3_5-8B, Molmo-7B, Llama-3.2-3B, etc. |
| | Accuracy tests: Qwen3-30B-A3B, Qwen3-VL-30B-A3B, etc. |
| | Accuracy tests: Qwen3-Next-80B-A3B, Qwen3-Omni-30B-A3B, etc. |
Multi-node tests:
| Test name | Description |
|---|---|
| | DeepSeek-R1-W8A8, 2-node DP |
| | Qwen3-235B-A22B, 2-node DP |
Note
The doc-test job in the A2 workflow only runs on schedule or workflow_dispatch
events — it will not run on PR-triggered runs even with /nightly all.
A3 workflow (.github/workflows/schedule_nightly_test_a3.yaml)#
Multi-node tests (run first, single-node tests wait for these to complete):
| Test name | Description |
|---|---|
| | DeepSeek-V3, 2-node PD disaggregation |
| | Qwen3-235B-A22B, 2-node DP |
| | Qwen3-235B-W8A8, 2-node |
| | Qwen3-235B-W8A8 with EPLB, 2-node |
| | DeepSeek-V3.2-W8A8, 2-node |
| | Qwen3-235B-A22B with Mooncake layerwise, 2-node |
| | DeepSeek-R1-W8A8 long sequence, 2-node |
| | Qwen3-235B-W8A8 long sequence, 2-node |
| | DeepSeek-V3.2-W8A8 context parallel, 2-node |
| | Qwen3-235B disaggregated PD, 2-node |
| | Qwen3-VL-235B disaggregated PD, 2-node |
| | Kimi-K2-Instruct-W8A8, 2-node |
| | DeepSeek-V3.1-BF16, 2-node |
| | DeepSeek-V3.2-W8A8 with EP, 4-node |
Single-node tests (run after multi-node tests complete):
| Test name | Description |
|---|---|
| | Qwen3-30B accuracy test |
| | DeepSeek-R1-0528-W8A8 |
| | DeepSeek-R1-W8A8 HBM |
| | DeepSeek-V3.2-W8A8 |
| | GLM-5-W4A8 |
| | GLM-4.7-W8A8 |
| | Kimi-K2-Thinking |
| | Kimi-K2.5 |
| | MiniMax-M2.5 |
| | MTP-X + DeepSeek-R1-0528-W8A8 |
| | Qwen3-235B-A22B-W8A8 |
| | Qwen3-30B-A3B-W8A8 |
| | Qwen3-Next-80B-A3B-Instruct-W8A8 |
| | Qwen3-32B-Int8 |
| | Qwen3-32B-Int8 prefix cache |
| | DeepSeek-R1-0528-W8A8 prefix cache |
| | Custom multi-card operator tests |
Warning
The A3 resource pool has a maximum concurrency of 5×16 NPUs. Multi-node tests
run with max-parallel: 2 to avoid resource exhaustion. Running /nightly all on
A3 will queue a large number of jobs — prefer targeting specific test names when
possible.
Examples#
Run all available nightly tests against your PR:
/nightly
Run only the custom operator single-card test:
/nightly test_custom_op
Run two specific tests at once:
/nightly test_custom_op qwen3-32b
Re-trigger after fixing an issue: just push a new commit. The synchronize event
re-runs the workflow and picks up the existing /nightly comment automatically — no
need to post a new comment.
Troubleshooting#
The workflow didn’t start after I added the label.
- Make sure the `/nightly` comment was posted before the label was added. If the label was added first, remove it and re-add it after posting the comment.
- Check that the comment starts exactly with `/nightly`, with no leading spaces or extra characters before the slash.
- To re-trigger after fixing an issue, simply push a new commit; the workflow will reuse the existing `/nightly` comment automatically.
Only some tests ran, not the ones I expected.
- Test names are case-sensitive and must match the `name` field in the workflow matrix exactly (see the tables above).
- Check the `parse-trigger` job output in GitHub Actions for the resolved `test_filter` value.
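Because matching is exact and case-sensitive, a quick local check of the names you plan to pass can catch typos before any hardware is queued. The helper below is a hypothetical sketch, not part of the workflow: `check_test_names` and its report strings are made up for illustration, and the list of matrix names has to be copied by hand from the workflow file.

```python
def check_test_names(requested, matrix_names):
    """Classify each requested name as ok, a case mismatch, or unknown."""
    known = set(matrix_names)
    # Lowercased lookup, used only to suggest the correctly cased name.
    by_lower = {name.lower(): name for name in matrix_names}
    report = {}
    for name in requested:
        if name in known:
            report[name] = "ok"
        elif name.lower() in by_lower:
            suggestion = by_lower[name.lower()]
            report[name] = f"case mismatch, did you mean {suggestion!r}?"
        else:
            report[name] = "unknown"
    return report
```

Any entry that is not reported as "ok" would be silently skipped by an exact-match filter, which is the usual reason fewer tests run than expected.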
The workflow ran with the scheduled image, not my PR code.
- Confirm the workflow was triggered by a `pull_request` event (label or push), not `workflow_dispatch`.
- The `parse-trigger` job logs show `is_pr_event`; check its value.
How to obtain more detailed logs to pinpoint problems for multi-node tests
For most issues, the stdout log shown in the GitHub Actions job view is sufficient. This log always comes from the first node.
If the first node's log does not provide enough information, open the summary of your job and download the log archive for the corresponding test. The archive contains the framework-side logs and plog information for each node, structured as follows:
.
├── node0
│   ├── root
│   │   └── ascend
│   │       └── log
│   └── var
│       └── log
│           └── vllm-deepseek-v3-0f233d-0_logs.txt
└── node1
    ├── root
    │   └── ascend
    │       └── log
    └── var
        └── log
            └── vllm-deepseek-v3-0f233d-0-1_logs.txt
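Once the archive is extracted, a small script can list what is available per node. The sketch below assumes the layout shown above; the function name `collect_node_logs` is hypothetical. It also assumes that `root/ascend/log` holds the plog output and that the `*_logs.txt` files under `var/log` are the framework-side logs, which this document does not state explicitly.

```python
from pathlib import Path


def collect_node_logs(archive_root):
    """Map each nodeN directory to its plog directory and framework logs."""
    root = Path(archive_root)
    result = {}
    node_dirs = sorted(
        p for p in root.iterdir() if p.is_dir() and p.name.startswith("node")
    )
    for node_dir in node_dirs:
        plog_dir = node_dir / "root" / "ascend" / "log"  # assumed plog location
        framework_dir = node_dir / "var" / "log"
        framework_logs = (
            sorted(framework_dir.glob("*_logs.txt"))
            if framework_dir.is_dir()
            else []
        )
        result[node_dir.name] = {
            "plog_dir": plog_dir if plog_dir.is_dir() else None,
            "framework_logs": framework_logs,
        }
    return result
```

Calling `collect_node_logs` on the extracted directory returns one entry per node, so a missing `plog_dir` or an empty `framework_logs` list shows at a glance which node failed to upload its logs. Node directories are sorted as strings, so `node10` would sort before `node2` in larger runs.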