Nightly CI Test

Nightly CI Test#

This document explains how to trigger nightly hardware CI tests against your own PR code on Ascend NPU hardware (A2/A3), without waiting for the scheduled nightly run.

Background#

By default, nightly CI tests run on a fixed schedule using pre-built nightly images. Contributors can self-service trigger these tests directly against their PR changes by combining a GitHub label with a comment command.

How to Trigger#

1. Post a comment#

First, post one of the following comments in the PR to specify which tests to run:

Comment	Effect
`/nightly`	Run all nightly tests
`/nightly all`	Run all nightly tests (same as above)
`/nightly test1 test2 ...`	Run only the named tests

2. Add the label#

After posting the comment, add the nightly-test label to your PR. Adding the label is what actually triggers the workflow — at that point the workflow reads the existing comments to find the /nightly command.

Note

Only repository Contributors (Triage role) and Maintainers (Write role) can add labels. If you do not have this permission, ask a maintainer to add the label for you. You can find the list of maintainers and contributors in the project’s Governance page or by checking the CODEOWNERS file.

Important

The comment must be posted before the label is added. If you add the label first, the workflow will find no /nightly comment and skip all tests.

3. Wait for results#

GitHub Actions will trigger the Nightly-A2 or Nightly-A3 workflow. Only tests matching the filter will be dispatched, which saves hardware resources.

Differences Between PR and Scheduled Runs#

	Scheduled / Manual Dispatch	PR-triggered
Trigger	Cron (daily) or `workflow_dispatch`	Label `nightly-test` + `/nightly` comment
Code tested	Pre-built nightly image	Your PR’s HEAD commit (source installed fresh)
Test scope	All tests	Configurable via `/nightly <names>`
vLLM + vllm-ascend	From image	Checked out and installed from source

When a PR run is detected (is_pr_test: true), the workflow additionally:

Uninstalls any existing vllm packages in the container.
Checks out the specific vllm version and your PR’s vllm-ascend commit from source.
Installs all dependencies from source.
Installs the aisbench benchmark suite.

Available Test Names#

The test names you can pass to /nightly correspond to the name fields in the workflow matrix.

A2 workflow (`.github/workflows/schedule_nightly_test_a2.yaml`)#

Single-node tests:

Test name	Description
`test_custom_op`	Custom operator tests (single card)
`test_custom_op_multi_card`	Custom operator tests (multi card)
`qwen3-32b`	Qwen3-32B model test
`qwen3-next-80b-a3b-instruct`	Qwen3-Next-80B-A3B-Instruct model test
`qwen3-32b-int8`	Qwen3-32B INT8 quantization test
`accuracy-group-1`	Accuracy tests: Qwen3-VL-8B, Qwen3-8B, Qwen2-Audio-7B, etc.
`accuracy-group-2`	Accuracy tests: ERNIE-4.5, InternVL3_5-8B, Molmo-7B, Llama-3.2-3B, etc.
`accuracy-group-3`	Accuracy tests: Qwen3-30B-A3B, Qwen3-VL-30B-A3B, etc.
`accuracy-group-4`	Accuracy tests: Qwen3-Next-80B-A3B, Qwen3-Omni-30B-A3B, etc.

Multi-node tests:

Test name	Description
`multi-node-deepseek-dp`	DeepSeek-R1-W8A8, 2-node DP
`multi-node-qwen3-235b-dp`	Qwen3-235B-A22B, 2-node DP

Note

The doc-test job in the A2 workflow only runs on schedule or workflow_dispatch events — it will not run on PR-triggered runs even with /nightly all.

A3 workflow (`.github/workflows/schedule_nightly_test_a3.yaml`)#

Multi-node tests (run first, single-node tests wait for these to complete):

Test name	Description
`multi-node-deepseek-pd`	DeepSeek-V3, 2-node PD disaggregation
`multi-node-qwen3-dp`	Qwen3-235B-A22B, 2-node DP
`multi-node-qwenw8a8-2node`	Qwen3-235B-W8A8, 2-node
`multi-node-qwenw8a8-2node-eplb`	Qwen3-235B-W8A8 with EPLB, 2-node
`multi-node-dpsk3.2-2node`	DeepSeek-V3.2-W8A8, 2-node
`multi-node-qwen3-dp-mooncake-layerwise`	Qwen3-235B-A22B with Mooncake layerwise, 2-node
`multi-node-deepseek-r1-w8a8-longseq`	DeepSeek-R1-W8A8 long sequence, 2-node
`multi-node-qwenw8a8-2node-longseq`	Qwen3-235B-W8A8 long sequence, 2-node
`multi-node-deepseek-V3_2-W8A8-cp`	DeepSeek-V3.2-W8A8 context parallel, 2-node
`multi-node-qwen-disagg-pd`	Qwen3-235B disaggregated PD, 2-node
`multi-node-qwen-vl-disagg-pd`	Qwen3-VL-235B disaggregated PD, 2-node
`multi-node-kimi-k2-instruct-w8a8`	Kimi-K2-Instruct-W8A8, 2-node
`multi-node-deepseek-v3.1`	DeepSeek-V3.1-BF16, 2-node
`multi-node-deepseek-v3.2-W8A8-EP`	DeepSeek-V3.2-W8A8 with EP, 4-node

Single-node tests (run after multi-node tests complete):

Test name	Description
`qwen3-30b-acc`	Qwen3-30B accuracy test
`deepseek-r1-0528-w8a8`	DeepSeek-R1-0528-W8A8
`deepseek-r1-w8a8-hbm`	DeepSeek-R1-W8A8 HBM
`deepseek-v3-2-w8a8`	DeepSeek-V3.2-W8A8
`glm-5-w4a8`	GLM-5-W4A8
`glm-4.7-w8a8`	GLM-4.7-W8A8
`kimi-k2-thinking`	Kimi-K2-Thinking
`kimi-k2.5`	Kimi-K2.5
`minimax-m2-5`	MiniMax-M2.5
`mtpx-deepseek-r1-0528-w8a8`	MTP-X + DeepSeek-R1-0528-W8A8
`qwen3-235b-a22b-w8a8`	Qwen3-235B-A22B-W8A8
`qwen3-30b-a3b-w8a8`	Qwen3-30B-A3B-W8A8
`qwen3-next-80b-a3b-instruct-w8a8`	Qwen3-Next-80B-A3B-Instruct-W8A8
`qwen3-32b-int8`	Qwen3-32B-Int8
`qwen3-32b-int8-prefix-cache`	Qwen3-32B-Int8 prefix cache
`deepseek-r1-0528-w8a8-prefix-cache`	DeepSeek-R1-0528-W8A8 prefix cache
`custom-multi-ops`	Custom multi-card operator tests

Warning

The A3 resource pool has a maximum concurrency of 5×16 NPUs. Multi-node tests run with max-parallel: 2 to avoid resource exhaustion. Running /nightly all on A3 will queue a large number of jobs — prefer targeting specific test names when possible.

Examples#

Run all available nightly tests against your PR:

/nightly

Run only the custom operator single-card test:

/nightly test_custom_op

Run two specific tests at once:

/nightly test_custom_op qwen3-32b

Re-trigger after fixing an issue: just push a new commit. The synchronize event re-runs the workflow and picks up the existing /nightly comment automatically — no need to post a new comment.

Troubleshooting#

The workflow didn’t start after I added the label.

Make sure the /nightly comment was posted before the label was added. If the label was added first, remove it and re-add it after posting the comment.
Check that the comment starts exactly with /nightly with no leading spaces or extra characters before the slash.
To re-trigger after fixing an issue, simply push a new commit — the workflow will reuse the existing /nightly comment automatically.

Only some tests ran, not the ones I expected.

Test names are case-sensitive and must match the name field in the workflow matrix exactly (see the table above).
Check the parse-trigger job output in GitHub Actions for the resolved test_filter value.

The workflow ran with the scheduled image, not my PR code.

Confirm the workflow was triggered by a pull_request event (label or push), not workflow_dispatch.
The parse-trigger job logs show is_pr_event — check its value.

How to obtain more detailed logs to pinpoint problems for multi-node tests

For most issues, the stdout pop-up logs from GitHub actions are sufficient (this log always represents the logs from the first node).

If the logs from a first node are no longer sufficient to provide effective logging information, see the summary of your jobs to download log archive for the corresponding test, which includes the framework-side logs and plog information for each node, structured as follows:

.
├── node0
│   ├── root
│   │   └── ascend
│   │       └── log
│   └── var
│       └── log
│           └── vllm-deepseek-v3-0f233d-0_logs.txt
└── node1
    ├── root
    │   └── ascend
    │       └── log
    └── var
        └── log
            └── vllm-deepseek-v3-0f233d-0-1_logs.txt