Image Generation API¶
vLLM-Omni provides an OpenAI DALL-E compatible API for text-to-image generation using diffusion models.
Each server instance runs a single model (specified at startup via vllm serve <model> --omni).
Quick Start¶
Start the Server¶
For example, to serve either of the following models on port 8000:
# Qwen-Image
vllm serve Qwen/Qwen-Image --omni --port 8000
# Z-Image Turbo
vllm serve Tongyi-MAI/Z-Image-Turbo --omni --port 8000
Generate Images¶
Using curl:
curl -X POST http://localhost:8000/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "a dragon laying over the spine of the Green Mountains of Vermont",
    "size": "1024x1024",
    "seed": 42
  }' | jq -r '.data[0].b64_json' | base64 -d > dragon.png
Using Python:
import requests
import base64
from PIL import Image
import io
response = requests.post(
    "http://localhost:8000/v1/images/generations",
    json={
        "prompt": "a black and white cat wearing a princess tiara",
        "size": "1024x1024",
        "num_inference_steps": 50,
        "seed": 42,
    },
)
response.raise_for_status()  # Stop here on a 4xx/5xx response
# Decode the base64 payload and save it as a PNG
img_data = response.json()["data"][0]["b64_json"]
img_bytes = base64.b64decode(img_data)
img = Image.open(io.BytesIO(img_bytes))
img.save("cat.png")
Using OpenAI SDK:
from openai import OpenAI
client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")
response = client.images.generate(
    model="Qwen/Qwen-Image",
    prompt="a horse jumping over a fence near a babbling brook",
    n=1,
    size="1024x1024",
    response_format="b64_json",
)
# Note: images.generate() has no named arguments for the extension parameters
# (seed, num_inference_steps, guidance_scale); send those via direct HTTP requests
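As an alternative to direct HTTP requests, the OpenAI Python SDK accepts a generic `extra_body` argument on its request methods and merges those keys into the JSON body. The sketch below relies on that, and assumes the server reads the extension fields from the body the same way it does for a hand-written request. The helper name `generate_with_extensions` is hypothetical and not part of either library.

```python
from typing import Any, Optional


def generate_with_extensions(
    client: Any,
    prompt: str,
    size: str = "1024x1024",
    seed: Optional[int] = None,
    num_inference_steps: Optional[int] = None,
    guidance_scale: Optional[float] = None,
) -> Any:
    """Call images.generate() and pass extension fields through extra_body.

    `client` is expected to be an openai.OpenAI instance pointed at the
    vLLM-Omni server. This helper is illustrative only.
    """
    candidates = {
        "seed": seed,
        "num_inference_steps": num_inference_steps,
        "guidance_scale": guidance_scale,
    }
    # Leave unset fields out so the model's own defaults apply.
    extra = {key: value for key, value in candidates.items() if value is not None}
    return client.images.generate(
        prompt=prompt,
        n=1,
        size=size,
        response_format="b64_json",
        extra_body=extra,
    )
```

With the `client` from the example above, `generate_with_extensions(client, "a horse jumping over a fence", seed=42)` would send `"seed": 42` alongside the standard fields.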
API Reference¶
Endpoint¶
POST /v1/images/generations
Request Parameters¶
OpenAI Standard Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| prompt | string | required | Text description of the desired image |
| model | string | server's model | Model to use (optional, should match server if specified) |
| n | integer | 1 | Number of images to generate (1-10) |
| size | string | model defaults | Image dimensions in WxH format (e.g., "1024x1024", "512x512") |
| response_format | string | "b64_json" | Response format (only "b64_json" supported) |
| user | string | null | User identifier for tracking |
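The `size` format and the `n` range in the table above can be checked on the client before a request is sent. This is a minimal sketch of such a check; the function names are hypothetical, and the server's own validation may be stricter or looser than what is shown here.

```python
import re
from typing import Tuple

_SIZE_RE = re.compile(r"(\d+)x(\d+)")


def parse_size(size: str) -> Tuple[int, int]:
    """Split a "WxH" string such as "1024x1024" into (width, height)."""
    match = _SIZE_RE.fullmatch(size)
    if match is None:
        raise ValueError(f"size must look like '1024x1024', got {size!r}")
    width, height = int(match.group(1)), int(match.group(2))
    if width == 0 or height == 0:
        raise ValueError("width and height must be positive")
    return width, height


def check_n(n: int) -> int:
    """Enforce the documented range of 1 to 10 images per request."""
    if not 1 <= n <= 10:
        raise ValueError(f"n must be between 1 and 10, got {n}")
    return n
```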
vllm-omni Extension Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| negative_prompt | string | null | Text describing what to avoid in the image |
| num_inference_steps | integer | model defaults | Number of diffusion steps |
| guidance_scale | float | model defaults | Classifier-free guidance scale (typically 0.0-20.0) |
| true_cfg_scale | float | model defaults | True CFG scale (model-specific parameter, may be ignored if not supported) |
| seed | integer | null | Random seed for reproducibility |
Response Format¶
{
  "created": 1701234567,
  "data": [
    {
      "b64_json": "<base64-encoded PNG>",
      "url": null,
      "revised_prompt": null
    }
  ]
}
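A response in this shape can be unpacked with a few lines of standard-library Python. The sketch below decodes every entry in `data` and checks for the PNG file signature as a basic sanity test; the function name `decode_images` is hypothetical.

```python
import base64
from typing import Any, Dict, List

# The eight bytes that begin every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_images(body: Dict[str, Any]) -> List[bytes]:
    """Return the raw bytes of every image in a generations response."""
    images = []
    for index, item in enumerate(body.get("data", [])):
        encoded = item.get("b64_json")
        if encoded is None:
            raise ValueError(f"data[{index}] has no b64_json field")
        raw = base64.b64decode(encoded)
        if not raw.startswith(PNG_SIGNATURE):
            raise ValueError(f"data[{index}] does not look like a PNG")
        images.append(raw)
    return images
```

Each returned item can then be written to disk, for example with `pathlib.Path("image_0.png").write_bytes(raw)`.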
Examples¶
Multiple Images¶
curl -X POST http://localhost:8000/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "a steampunk city set in a valley of the Adirondack mountains",
    "n": 4,
    "size": "1024x1024",
    "seed": 123
  }'
This generates 4 images in a single request.
With Negative Prompt¶
import requests

response = requests.post(
    "http://localhost:8000/v1/images/generations",
    json={
        "prompt": "a portrait of a skier in deep powder snow",
        "negative_prompt": "blurry, low quality, distorted, ugly",
        "num_inference_steps": 100,
        "size": "1024x1024",
    },
)
Parameter Handling¶
The API passes parameters directly to the diffusion pipeline without model-specific transformation:
- Default values: When parameters are not specified, the underlying model uses its own defaults
- Pass-through design: User-provided values are forwarded directly to the diffusion engine
- Minimal validation: Only basic type checking and range validation at the API level
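Because unspecified parameters fall back to the model's defaults, a client can simply leave out anything the caller did not set. The following is a minimal sketch of a request builder that does this; `build_payload` is a hypothetical helper, and the field names are the ones listed in the parameter tables.

```python
from typing import Any, Dict, Optional


def build_payload(
    prompt: str,
    *,
    size: Optional[str] = None,
    n: Optional[int] = None,
    negative_prompt: Optional[str] = None,
    num_inference_steps: Optional[int] = None,
    guidance_scale: Optional[float] = None,
    true_cfg_scale: Optional[float] = None,
    seed: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a request body that contains only the fields that were set."""
    optional = {
        "size": size,
        "n": n,
        "negative_prompt": negative_prompt,
        "num_inference_steps": num_inference_steps,
        "guidance_scale": guidance_scale,
        "true_cfg_scale": true_cfg_scale,
        "seed": seed,
    }
    payload: Dict[str, Any] = {"prompt": prompt}
    # Compare against None so that legitimate zero values (for example
    # guidance_scale=0.0 or seed=0) are still sent.
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload
```

The result can be passed as the `json=` argument of `requests.post`.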
Parameter Compatibility¶
Because parameters are forwarded without model-specific validation, behavior depends on the model being served:
- Unsupported parameters may be silently ignored by the model
- Incompatible values will result in errors from the underlying pipeline
- Recommended values vary by model; consult the model's documentation
Best Practice: Start with the model's recommended parameters, then adjust based on your needs.
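One way to follow this practice is to keep a small table of per-model starting values on the client and layer per-request overrides on top. The sketch below shows the merge only. The numbers in `MODEL_PRESETS` are placeholders chosen for illustration, not recommendations from either model's documentation, and both names are hypothetical.

```python
from typing import Any, Dict

# Placeholder values for illustration only. Replace them with the settings
# recommended in each model's own documentation.
MODEL_PRESETS: Dict[str, Dict[str, Any]] = {
    "Qwen/Qwen-Image": {"num_inference_steps": 50},
    "Tongyi-MAI/Z-Image-Turbo": {"num_inference_steps": 8},
}


def with_presets(model: str, overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Start from a model's preset and let caller-supplied values win."""
    merged = dict(MODEL_PRESETS.get(model, {}))  # copy so the table is not mutated
    merged.update(overrides)
    return merged
```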
Error Responses¶
400 Bad Request¶
Invalid parameters (e.g., model mismatch):
422 Unprocessable Entity¶
Validation errors (missing required fields):
{
  "detail": [
    {
      "loc": ["body", "prompt"],
      "msg": "field required",
      "type": "value_error.missing"
    }
  ]
}
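A 422 body in the shape shown above can be flattened into one readable line per problem. This is a minimal sketch; `format_validation_errors` is a hypothetical helper, and it assumes `detail` is either a list of error objects as in the example or a single plain value.

```python
from typing import Any, Dict, List


def format_validation_errors(body: Dict[str, Any]) -> List[str]:
    """Turn an error response body into one readable line per error."""
    detail = body.get("detail", [])
    if not isinstance(detail, list):
        # Assumption: some errors may carry a single message instead of a list.
        return [str(detail)]
    lines = []
    for error in detail:
        location = ".".join(str(part) for part in error.get("loc", []))
        message = error.get("msg", "unknown error")
        lines.append(f"{location}: {message}")
    return lines
```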
503 Service Unavailable¶
Diffusion engine not initialized:
Troubleshooting¶
Server Not Running¶
# Check if server is responding
curl http://localhost:8000/v1/images/generations \
-H "Content-Type: application/json" \
-d '{"prompt": "test"}'
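The curl probe above submits a real generation request, which can take a while on a loaded server. A cheaper first check is whether anything is listening on the port at all. The sketch below does only that, using the standard library; it does not confirm that the diffusion engine has finished loading, and the name `port_open` is hypothetical.

```python
import socket


def port_open(host: str = "localhost", port: int = 8000, timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```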
Out of Memory¶
If you encounter OOM errors:
1. Reduce image size: "size": "512x512"
2. Reduce inference steps: "num_inference_steps": 25
3. Generate fewer images: "n": 1
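The three suggestions above can be applied one at a time, so a client can retry with progressively cheaper requests. The sketch below only generates the sequence of payloads. How an OOM failure surfaces in the response is not covered here, so the retry loop itself is left to the caller. The function and constant names are hypothetical.

```python
from typing import Any, Dict, Iterator

SMALL_SIZE = "512x512"
FEWER_STEPS = 25


def _pixels(size: str) -> int:
    """Pixel count of a "WxH" size string."""
    width, height = size.lower().split("x")
    return int(width) * int(height)


def lighter_payloads(payload: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield the original payload, then progressively cheaper variants."""
    current = dict(payload)
    yield dict(current)
    # 1. Shrink the image. An unset size is assumed to be larger than 512x512.
    size = current.get("size")
    if size is None or _pixels(size) > _pixels(SMALL_SIZE):
        current["size"] = SMALL_SIZE
        yield dict(current)
    # 2. Cut the step count, but only if it was set explicitly and is higher.
    steps = current.get("num_inference_steps")
    if steps is not None and steps > FEWER_STEPS:
        current["num_inference_steps"] = FEWER_STEPS
        yield dict(current)
    # 3. Fall back to a single image.
    if current.get("n", 1) > 1:
        current["n"] = 1
        yield dict(current)
```

A caller would iterate over `lighter_payloads(payload)`, send each variant in turn, and stop at the first one that succeeds.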
Testing¶
Run the test suite to verify functionality:
# All image generation tests
pytest tests/entrypoints/openai_api/test_image_server.py -v
# Specific test
pytest tests/entrypoints/openai_api/test_image_server.py::test_generate_single_image -v
Development¶
Enable debug logging to see prompts and generation details: