vllm_omni.diffusion.utils.media_utils
Video/audio muxing utilities using PyAV (no ffmpeg binary dependency).
mux_video_audio_bytes
mux_video_audio_bytes(
    video_frames: ndarray,
    audio_waveform: ndarray | None = None,
    *,
    fps: float = 25.0,
    audio_sample_rate: int = 44100,
    video_codec: str = "h264",
    audio_codec: str = "aac",
    crf: str = "18",
    video_codec_options: dict[str, str] | None = None,
) -> bytes
Mux video frames and optional audio waveform into MP4 bytes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `video_frames` | `ndarray` | uint8 array of shape | required |
| `audio_waveform` | `ndarray \| None` | float32 array – mono | `None` |
| `fps` | `float` | Video frame rate. | `25.0` |
| `audio_sample_rate` | `int` | Audio sample rate in Hz. | `44100` |
| `video_codec` | `str` | Video codec name. | `'h264'` |
| `audio_codec` | `str` | Audio codec name. | `'aac'` |
| `crf` | `str` | Constant rate factor for the video encoder. | `'18'` |
Returns:
| Type | Description |
|---|---|
| `bytes` | Raw MP4 bytes ready to be written to disk or streamed. |
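The following is a minimal usage sketch, not taken from the library's own documentation. It builds a synthetic clip and a sine tone with NumPy and passes them to `mux_video_audio_bytes` using only the parameters from the signature above. The array layouts are assumptions, since this page does not state the expected shapes: frames are assumed to be `(T, H, W, 3)` RGB and audio is assumed to be a 1-D mono array with samples in `[-1, 1]`. The helper names `make_test_clip`, `make_test_tone`, and `write_demo_clip` are hypothetical and exist only for this illustration.

```python
import numpy as np


def make_test_clip(num_frames=50, height=64, width=64):
    """Build a synthetic uint8 clip. The (T, H, W, 3) RGB layout is an assumption."""
    ramp = np.linspace(0, 255, num_frames, dtype=np.uint8)
    frames = np.zeros((num_frames, height, width, 3), dtype=np.uint8)
    frames[..., 0] = ramp[:, None, None]  # red channel brightens over time
    return frames


def make_test_tone(duration_s, sample_rate=44100, freq_hz=440.0):
    """Build a mono float32 sine tone. The 1-D (N,) layout is an assumption."""
    t = np.arange(int(duration_s * sample_rate), dtype=np.float32) / sample_rate
    return (0.5 * np.sin(2 * np.pi * freq_hz * t)).astype(np.float32)


fps = 25.0
sample_rate = 44100
frames = make_test_clip(num_frames=50)
# Match the audio length to the video duration (50 frames at 25 fps = 2 s).
tone = make_test_tone(duration_s=len(frames) / fps, sample_rate=sample_rate)


def write_demo_clip(path):
    """Mux the synthetic inputs and write the result to `path` (hypothetical helper)."""
    # Imported here so the input-building code above runs without vllm_omni installed.
    from vllm_omni.diffusion.utils.media_utils import mux_video_audio_bytes

    mp4_bytes = mux_video_audio_bytes(
        frames,
        tone,
        fps=fps,
        audio_sample_rate=sample_rate,
        crf="18",  # passed as a string, matching the documented type
    )
    with open(path, "wb") as f:
        f.write(mp4_bytes)
```

Calling `write_demo_clip("clip.mp4")` should produce a two-second file if the assumed layouts match what the function expects. Because the return value is plain `bytes`, it could equally be sent in an HTTP response instead of being written to disk.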