Dynin-Omni Online Serving Example¶
Source https://github.com/vllm-project/vllm-omni/tree/main/examples/online_serving/dynin_omni.
Installation¶
Please refer to README.md.
Launch the Server¶
First, find the transformers_modules path:
Then export it for both PYTHONPATH and HF_MODULES_CACHE:
export PYTHONPATH=<transformers_modules_path>:$PYTHONPATH
export HF_MODULES_CACHE=<transformers_modules_path>
Run from repository root. The 3-stage pipeline is resolved automatically from the model type (vllm_omni/deploy/dynin_omni.yaml), so no config path is required. Pass --deploy-config only to override the defaults:
If vllm-omni is not in PATH, run:
PYTHONPATH="$(pwd)" python -m vllm_omni.entrypoints.cli.main serve snu-aidas/Dynin-Omni \
--omni \
--port 8091
Wait until the server logs show both All stages initialized successfully and Application startup complete. before sending requests.
Send Requests via Python Client¶
Move to the example directory:
Text -> Image¶
python openai_chat_completion_client_for_multimodal_generation.py \
--query-type t2i \
--prompt "A realistic indoor living room with natural daylight."
Image -> Image¶
python openai_chat_completion_client_for_multimodal_generation.py \
--query-type i2i \
--image-path ../../offline_inference/dynin_omni/data/image/sofa_under_water.jpg \
--prompt "Transform this surreal underwater setting into a realistic indoor living room while preserving the sofa layout."
Text -> Speech¶
python openai_chat_completion_client_for_multimodal_generation.py \
--query-type t2s \
--prompt "Hello. This is Dynin-omni."
CLI Arguments¶
--query-type(t2i|t2s|i2i)--model(default:snu-aidas/Dynin-Omni)--host/--port(OpenAI-compatible vLLM endpoint)--prompt(custom text)--image-path(required fori2i)--modalities(optional output modalities override)--output-dir(default:/tmp/dynin_online_outputs)
Notes¶
- This client currently supports only
t2i,t2s, andi2i. t2tis intentionally not exposed in this online example.- This example intentionally uses the OpenAI-compatible chat completion endpoint.
- Task routing for non-text outputs relies on Dynin task trigger tokens (
<|t2i|>,<|i2i|>,<|t2s|>) injected by the client. - Outputs are saved under
/tmp/dynin_online_outputsby default. - Dynin stage-0 warmup can take a while on first startup; do not send requests before startup completes.
- Dynin itself can execute text-returning tasks such as
t2t,s2t,i2t, andv2t, but this online serving example currently runs stage-0 ingenerationmode. In that path, the generation worker does not surface the final text asoutput.text, so OpenAI chat responses for those text-output tasks may complete internally but still return empty text.
Example materials¶
openai_chat_completion_client_for_multimodal_generation.py
Large file omitted from the rendered docs. View it on GitHub: https://github.com/vllm-project/vllm-omni/blob/main/examples/online_serving/dynin_omni/openai_chat_completion_client_for_multimodal_generation.py.