CI Failures¶
What should I do when a CI job fails on my PR, but I don't think my PR caused the failure?
- Check the dashboard of current CI test failures:
  👉 CI Failures Dashboard
- If your failure is already listed, it's likely unrelated to your PR. Help fixing it is always welcome!
  - Leave comments with links to additional instances of the failure.
  - React with a 👍 to signal how many are affected.
- If your failure is not listed, you should file an issue.
Filing a CI Test Failure Issue¶
- File a bug report:
  👉 New CI Failure Report
- Use this title format:
- For the environment field:
  Still failing on main as of commit abcdef123
- In the description, include failing tests:
  FAILED failing/test.py:failing_test1 - Failure description
  FAILED failing/test.py:failing_test2 - Failure description
  https://github.com/orgs/vllm-project/projects/20
  https://github.com/vllm-project/vllm/issues/new?template=400-bug-report.yml
  FAILED failing/test.py:failing_test3 - Failure description
- Attach logs (collapsible section example):
  Logs:
  ERROR 05-20 03:26:38 [dump_input.py:68] Dumping input data
  --- Logging error ---
  Traceback (most recent call last):
    File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 203, in execute_model
      return self.model_executor.execute_model(scheduler_output)
  ...
  FAILED failing/test.py:failing_test1 - Failure description
  FAILED failing/test.py:failing_test2 - Failure description
  FAILED failing/test.py:failing_test3 - Failure description
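The failing-test list and the collapsible log section can be drafted from a saved log with a few lines of shell. The following is only a sketch, not part of the vLLM tooling: `list_failed_tests` and `wrap_logs` are hypothetical helper names, and it assumes the log contains pytest-style `FAILED ...` summary lines and that the issue body accepts HTML `<details>` blocks.

```shell
#!/bin/sh
# Hypothetical helpers for drafting a CI failure issue (not part of vLLM's tooling).

# Print the unique "FAILED ..." summary lines found in the log file "$1".
list_failed_tests() {
    grep '^FAILED ' "$1" | sort -u
}

# Wrap the log file "$1" in a collapsible <details> section.
# HTML-special characters are escaped so the log displays verbatim inside <pre>.
wrap_logs() {
    printf '<details>\n<summary>Logs:</summary>\n\n<pre>\n'
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' "$1"
    printf '</pre>\n\n</details>\n'
}
```

For example, `list_failed_tests ci.log` prints the lines to paste under the failing tests, and `wrap_logs ci.log` prints a block to paste as the log attachment (`ci.log` is a placeholder file name).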
Logs Wrangling¶
Download the full log file from Buildkite locally.
Strip timestamps and colorization:
.buildkite/scripts/ci-clean-log.sh
Use a clipboard tool such as wl-clipboard for quick copy-pasting.
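To illustrate what the cleaning step involves, here is a minimal `sed` sketch. It is not the contents of `.buildkite/scripts/ci-clean-log.sh`. It assumes each line may start with a timestamp of the form `[2025-05-20T03:26:38Z] ` (an assumption about the downloaded log format) and that colorization uses standard ANSI escape sequences. `clean_log` is a hypothetical name.

```shell
#!/bin/sh
# Minimal sketch of log cleaning; NOT the actual ci-clean-log.sh.

# clean_log (hypothetical helper): read a raw log on stdin, write cleaned text to stdout.
clean_log() {
    esc=$(printf '\033')   # the literal ESC character that starts ANSI sequences
    # 1st expression: drop "[YYYY-MM-DDTHH:MM:SSZ] " timestamps (assumed format).
    # 2nd expression: drop ANSI escape sequences such as ESC[31m or ESC[0m.
    sed -e 's/\[[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}T[0-9:]\{8\}Z\] //g' \
        -e "s/${esc}\[[0-9;]*[A-Za-z]//g"
}

# Example usage (file names are placeholders):
#   clean_log < ci_build.log > ci_build.clean.log
#   tail -n 100 ci_build.clean.log | wl-copy   # wl-copy (from wl-clipboard) reads stdin
```

Writing the helper as a stdin-to-stdout filter keeps the raw download untouched, which is useful if the assumed timestamp pattern turns out not to match.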
Investigating a CI Test Failure¶
- Go to 👉 Buildkite main branch
- Bisect to find the first build that shows the issue.
- Add your findings to the GitHub issue.
- If you find a strong candidate PR, mention it in the issue and ping contributors.
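Bisecting is a binary search over the ordered builds of main. The sketch below shows the idea under stated assumptions: builds are numbered by position (oldest first), every build before the regression passes and every build from the regression onward fails, and the last build in the range is known to be bad. `first_bad_build` and the predicate passed to it are hypothetical; in practice the "is this build bad?" answer comes from inspecting the build in Buildkite. Flaky failures break the ordering assumption, so confirm the result by checking the neighbouring builds.

```shell
#!/bin/sh
# Binary-search sketch for locating the first failing build (hypothetical helper).

# first_bad_build LO HI CMD...
# Prints the smallest N in [LO, HI] for which `CMD N` succeeds (exit 0 = "build N is bad").
# Assumes builds are good up to some point and bad afterwards, and that HI is bad.
first_bad_build() {
    fb_lo=$1
    fb_hi=$2
    shift 2
    while [ "$fb_lo" -lt "$fb_hi" ]; do
        fb_mid=$(( (fb_lo + fb_hi) / 2 ))
        if "$@" "$fb_mid"; then
            fb_hi=$fb_mid              # mid is bad: the first bad build is mid or earlier
        else
            fb_lo=$(( fb_mid + 1 ))    # mid is good: the first bad build is later
        fi
    done
    echo "$fb_lo"
}
```

With about 100 builds in the range, this needs roughly seven checks instead of scanning every build.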
Reproducing a Failure¶
CI test failures may be flaky. Use a bash loop to run repeatedly:
.buildkite/scripts/rerun-test.sh
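As an illustration of such a loop (a sketch only, not the contents of `rerun-test.sh`), the hypothetical helper below runs a command up to a fixed number of times and stops at the first failure, so the failing run's output is the last thing on screen.

```shell
#!/bin/sh
# Sketch of a rerun loop for flaky tests (hypothetical helper, not rerun-test.sh).

# rerun MAX CMD...: run CMD up to MAX times; return 1 at the first failing run.
rerun() {
    rr_max=$1
    shift
    rr_i=1
    while [ "$rr_i" -le "$rr_max" ]; do
        echo "=== attempt $rr_i/$rr_max ==="
        if ! "$@"; then
            echo "failed on attempt $rr_i" >&2
            return 1
        fi
        rr_i=$(( rr_i + 1 ))
    done
    echo "passed $rr_max consecutive runs"
}

# Example usage (the test path is a placeholder):
#   rerun 20 pytest -s -v failing/test.py::failing_test1
```

A run that passes every attempt does not prove the test is stable; it only bounds how often it fails.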
Submitting a PR¶
If you submit a PR to fix a CI failure:
- Link the PR to the issue: add Closes #12345 to the PR description.
- Add the ci-failure label. This helps track it in the CI Failures GitHub Project.
Other Resources¶
Daily Triage¶
Use Buildkite analytics (2-day view) to:
- Identify recent test failures on main.
- Exclude legitimate test failures on PRs.
- (Optional) Ignore tests with 0% reliability.
Compare the results to the CI Failures Dashboard.
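The comparison itself is a set difference: failures seen recently on main minus failures already tracked. A minimal sketch, assuming you have copied the test names by hand into two plain-text files with one name per line (both files and the `untracked_failures` helper are hypothetical; neither Buildkite nor the dashboard is queried here):

```shell
#!/bin/sh
# Sketch: list recent failures that are not yet tracked (hypothetical helper).

# untracked_failures RECENT TRACKED
# Prints names that appear in file RECENT but not in file TRACKED.
untracked_failures() {
    uf_recent=$(mktemp)
    uf_tracked=$(mktemp)
    sort -u "$1" > "$uf_recent"      # comm requires sorted input
    sort -u "$2" > "$uf_tracked"
    comm -23 "$uf_recent" "$uf_tracked"   # -23: keep only lines unique to the first file
    rm -f "$uf_recent" "$uf_tracked"
}

# Example usage (file names are placeholders):
#   untracked_failures recent_main_failures.txt dashboard_failures.txt
```

Anything this prints is a candidate for a new CI failure issue.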