Documentation writing guide#
Guide to Writing Model Tutorial Documentation#
docs/source/_templates/Model-Deployment-Tutorial-Template.md is a template for writing model deployment tutorials. You can copy and modify it to create new docs.
Testable documentation code block generation (model-code)#
For documentation authors: how to insert testable command blocks into docs
For developers: how to add a new converter
Built-in supported converter_tag values:
| converter_tag | Renders | YAML source |
|---|---|---|
| | One | |
| | One host's | |
| | One external-DP node's env exports + | |
| | One | |
| | The load-balance proxy launch command | |
Local debugging and generation#
Generate only (without building the full site)#
# Generate all model-code artifacts under docs/source/tutorials/models/
python3 tools/docs_codegen/cli.py
# Generate artifacts for a single document
python3 tools/docs_codegen/cli.py --doc docs/source/tutorials/models/Kimi-K2-Thinking.md
# Generate a single block and print it (no files written)
python3 tools/docs_codegen/cli.py \
--block docs/source/tutorials/models/Kimi-K2-Thinking.md::kimi_k2_thinking_single_node \
--dry-run --stdout
By default, artifacts are written to: docs/_build/doc_codegen/<doc_stem>/<block_name>.sh.
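The path pattern above can be illustrated with a small sketch. This is not the tool's own code: the function name `artifact_path` and the constant `DEFAULT_OUTPUT_ROOT` are hypothetical, and the only facts taken from this page are the `<doc>::<block_name>` form of `--block` and the default output pattern.

```python
from pathlib import PurePosixPath

# Assumption: the default output root, copied from the path pattern quoted above.
DEFAULT_OUTPUT_ROOT = PurePosixPath("docs/_build/doc_codegen")


def artifact_path(block_spec: str) -> PurePosixPath:
    """Map '<doc_path>::<block_name>' to the default artifact path (hypothetical helper)."""
    doc_path, sep, block_name = block_spec.partition("::")
    if not sep or not doc_path or not block_name:
        raise ValueError(f"expected '<doc>::<block_name>', got {block_spec!r}")
    # <doc_stem> is the document file name without its .md suffix.
    doc_stem = PurePosixPath(doc_path).stem
    return DEFAULT_OUTPUT_ROOT / doc_stem / f"{block_name}.sh"


print(artifact_path(
    "docs/source/tutorials/models/Kimi-K2-Thinking.md::kimi_k2_thinking_single_node"
))
```

A helper like this is handy when a script needs to locate the generated file for a known block without re-running the generator.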
Note
After generating a script, verify that it actually runs. Pay particular attention to the environment variables and command-line parameters.
Concrete YAML-to-shell example#
The following model-code block reads the first test case from
tests/e2e/nightly/single_node/models/configs/Kimi-K2-Thinking.yaml:
```{model-code}
:block_name: kimi_k2_thinking_single_node
:converter_tag: single_node
:test_case_path: tests/e2e/nightly/single_node/models/configs/Kimi-K2-Thinking.yaml
```
The YAML fields read by the converter look like this:
test_cases:
- name: "Kimi-K2-Thinking-TP16-Case"
model: "moonshotai/Kimi-K2-Thinking"
envs:
HCCL_BUFFSIZE: "1024"
TASK_QUEUE_ENABLE: "1"
OMP_PROC_BIND: "false"
HCCL_OP_EXPANSION_MODE: "AIV"
PYTORCH_NPU_ALLOC_CONF: "expandable_segments:True"
SERVER_PORT: "DEFAULT_PORT"
server_cmd:
- "--tensor-parallel-size"
- "16"
- "--port"
- "$SERVER_PORT"
- "--max-model-len"
- "8192"
- "--max-num-batched-tokens"
- "8192"
- "--max-num-seqs"
- "12"
- "--gpu-memory-utilization"
- "0.9"
- "--trust-remote-code"
- "--enable-expert-parallel"
- "--no-enable-prefix-caching"
Run the block in dry-run mode to see the generated shell without writing files:
python3 tools/docs_codegen/cli.py \
--block docs/source/tutorials/models/Kimi-K2-Thinking.md::kimi_k2_thinking_single_node \
--dry-run --stdout
The first line is the artifact path. The remaining lines are the generated shell content:
# docs/_build/doc_codegen/Kimi-K2-Thinking/kimi_k2_thinking_single_node.sh
export HCCL_BUFFSIZE=1024
export TASK_QUEUE_ENABLE=1
export OMP_PROC_BIND=false
export HCCL_OP_EXPANSION_MODE=AIV
export PYTORCH_NPU_ALLOC_CONF=expandable_segments:True
export SERVER_PORT=8000
vllm serve moonshotai/Kimi-K2-Thinking \
--tensor-parallel-size 16 \
--port $SERVER_PORT \
--max-model-len 8192 \
--max-num-batched-tokens 8192 \
--max-num-seqs 12 \
--gpu-memory-utilization 0.9 \
--trust-remote-code \
--enable-expert-parallel \
--no-enable-prefix-caching
In this example, envs is rendered as export lines, model becomes
vllm serve <model>, and server_cmd is appended as formatted command-line
arguments. SERVER_PORT: "DEFAULT_PORT" is resolved to the default single-node
port 8000.
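The mapping described above can be sketched in a few lines of Python. This is a simplified illustration, not the converter shipped in tools/docs_codegen/converters.py: `render_single_node` and `DEFAULT_SINGLE_NODE_PORT` are hypothetical names, the flag grouping rule (a token starting with `--` opens a new line) is an assumption, and no shell quoting is applied.

```python
# Assumption: the default single-node port, taken from the example output above.
DEFAULT_SINGLE_NODE_PORT = "8000"


def render_single_node(case: dict) -> str:
    """Render one test case as export lines plus a `vllm serve` command (sketch)."""
    lines = []
    for key, value in case.get("envs", {}).items():
        # Resolve the DEFAULT_PORT placeholder to a concrete port.
        if value == "DEFAULT_PORT":
            value = DEFAULT_SINGLE_NODE_PORT
        lines.append(f"export {key}={value}")

    # Group each flag with the value tokens that follow it.
    groups = []
    for token in case.get("server_cmd", []):
        if token.startswith("--") or not groups:
            groups.append([token])
        else:
            groups[-1].append(token)

    command = [f"vllm serve {case['model']}"] + [" ".join(g) for g in groups]
    # Join the command parts with a trailing backslash for line continuation.
    lines.append(" \\\n".join(command))
    return "\n".join(lines)


# A trimmed version of the test case shown above.
case = {
    "model": "moonshotai/Kimi-K2-Thinking",
    "envs": {"HCCL_BUFFSIZE": "1024", "SERVER_PORT": "DEFAULT_PORT"},
    "server_cmd": ["--tensor-parallel-size", "16", "--port", "$SERVER_PORT",
                   "--trust-remote-code"],
}
print(render_single_node(case))
```

The real converter may differ in quoting, indentation, and validation; the sketch only shows how the three YAML fields line up with the three parts of the generated script.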
Build the site & preview locally#
# Install documentation build dependencies
python3 -m pip install -r docs/requirements-docs.txt
# (Optional) Clean previous builds
make -C docs clean
# Build the English site
make -C docs html
# (Optional) Build the Chinese site
make -C docs intl
# Preview locally
python3 -m http.server -d docs/_build/html 8000
# Then open in a browser:
# http://localhost:8000
For developers: add a new converter#
The goal of adding a converter is to make converter_tag: <name> render a given YAML structure into a script (GeneratedScript).
What to change#
- In tools/docs_codegen/converters.py:
  - Add a BaseConverter subclass that implements convert(loaded_yaml, *, block) -> GeneratedScript
  - Give the converter a unique name (the value used by converter_tag in docs)
  - Register it in build_default_converters()
  - Reuse the shared validation/rendering helpers in tools/docs_codegen/utils.py (require_yaml_mapping, require_mapping, require_scalar_mapping, require_indexed_mapping, parse_command_tokens, render_cli_command, ...) rather than re-validating the YAML shape inline
- If your converter needs new directive options (e.g. :foo_index:):
  - Add the option name to MODEL_CODE_OPTION_NAMES in tools/docs_codegen/scanner.py
  - Add the option name to ModelCodeDirective.option_spec in tools/docs_codegen/sphinx_extension.py
- Add a real example snippet in any model doc (recommended under docs/source/tutorials/models/) and point it to a YAML file that exists (recommended under tests/).
- Minimal validation via CLI: python3 tools/docs_codegen/cli.py --doc <your_doc> or --block <doc>::<block_name>
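The steps above might look roughly like the following self-contained sketch. Only the `convert(loaded_yaml, *, block)` signature, the `name` attribute, and `build_default_converters()` come from this guide. The `BaseConverter` and `GeneratedScript` classes here are local stand-ins, the `content` field and the dict-shaped registry are assumptions, and `EnvOnlyConverter` with its `env_only` tag is purely hypothetical. The YAML shape is validated inline only because the signatures of the shared helpers in utils.py are not documented here; a real converter should use those helpers instead.

```python
from dataclasses import dataclass


@dataclass
class GeneratedScript:
    """Stand-in for the real class; the `content` field is an assumption."""
    content: str


class BaseConverter:
    """Stand-in for the base class in tools/docs_codegen/converters.py."""
    name = ""

    def convert(self, loaded_yaml, *, block):
        raise NotImplementedError


class EnvOnlyConverter(BaseConverter):
    """Hypothetical converter: renders only the env exports of the first test case."""
    name = "env_only"  # hypothetical value for :converter_tag:

    def convert(self, loaded_yaml, *, block) -> GeneratedScript:
        # `block` is unused in this sketch; the real tool passes the directive's block info.
        if not isinstance(loaded_yaml, dict):
            raise ValueError("top-level YAML must be a mapping")
        cases = loaded_yaml.get("test_cases")
        if not isinstance(cases, list) or not cases:
            raise ValueError("'test_cases' must be a non-empty list")
        envs = cases[0].get("envs", {})
        if not isinstance(envs, dict):
            raise ValueError("'envs' must be a mapping")
        exports = [f"export {key}={value}" for key, value in envs.items()]
        return GeneratedScript(content="\n".join(exports))


def build_default_converters():
    """Stand-in registry: maps each converter's name to an instance."""
    return {conv.name: conv for conv in (EnvOnlyConverter(),)}


converter = build_default_converters()["env_only"]
script = converter.convert(
    {"test_cases": [{"envs": {"TASK_QUEUE_ENABLE": "1"}}]}, block=None
)
print(script.content)
```

Keeping validation errors explicit, as in the `ValueError` branches, makes a failing docs build point at the malformed YAML rather than at the renderer.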