vllm_omni.diffusion.models.wan2_2.pipeline_wan2_2 ¶
Wan22Pipeline ¶
Bases: Module, PipelineParallelMixin, CFGParallelMixin, ProgressBarMixin, DiffusionPipelineProfilerMixin
tokenizer instance-attribute ¶
tokenizer = from_pretrained_with_prefetch(
from_pretrained,
model,
subfolder="tokenizer",
prefetch_list=component_subfolders,
local_files_only=local_files_only,
)
vae_scale_factor_spatial instance-attribute ¶
vae_scale_factor_spatial = (
scale_factor_spatial
if getattr(self, "vae", None)
else 8
)
vae_scale_factor_temporal instance-attribute ¶
vae_scale_factor_temporal = (
scale_factor_temporal
if getattr(self, "vae", None)
else 4
)
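The two scale factors determine how pixel-space video dimensions map to latent dimensions. The fallback values 8 and 4 come from the attribute definitions above. The rounding formula below is an assumption: it is a common convention for video VAEs that encode the first frame on its own, and this page does not confirm it. `latent_shape` is a hypothetical helper, not part of the pipeline.

```python
def latent_shape(height, width, frame_num, spatial=8, temporal=4):
    """Estimate (latent_frames, latent_height, latent_width).

    Assumption: the first frame is encoded alone and every further group of
    `temporal` frames adds one latent frame. `spatial` and `temporal` default
    to the fallback values shown for vae_scale_factor_spatial and
    vae_scale_factor_temporal.
    """
    latent_frames = (frame_num - 1) // temporal + 1
    return latent_frames, height // spatial, width // spatial


# Defaults of forward(): height=480, width=832, frame_num=81
print(latent_shape(480, 832, 81))  # (21, 60, 104)
```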
check_inputs ¶
check_inputs(
prompt,
negative_prompt,
height,
width,
prompt_embeds=None,
negative_prompt_embeds=None,
guidance_scale_2=None,
boundary_ratio=None,
)
diffuse ¶
diffuse(
latents: Tensor,
timesteps: Tensor,
prompt_embeds: Tensor,
negative_prompt_embeds: Tensor | None,
guidance_low: float,
guidance_high: float,
boundary_timestep: float | None,
dtype: dtype,
attention_kwargs: dict[str, Any],
latent_condition: Tensor | None = None,
first_frame_mask: Tensor | None = None,
) -> Tensor | AsyncLatents
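The `guidance_low`, `guidance_high`, and `boundary_timestep` parameters suggest a two-stage denoising loop in which the model and guidance scale switch at a timestep boundary. The following is a minimal sketch of that selection under stated assumptions: that timesteps at or above the boundary use `transformer` with `guidance_high`, that timesteps below it use `transformer_2` with `guidance_low`, and that a `None` boundary means a single stage. `select_stage` is a hypothetical helper and may not match what `diffuse` does internally.

```python
def select_stage(t, boundary_timestep, guidance_low, guidance_high):
    """Pick a (model_name, guidance_scale) pair for timestep t.

    Assumption: high-noise steps (t >= boundary) run on "transformer" with
    guidance_high; low-noise steps run on "transformer_2" with guidance_low.
    With no boundary, only "transformer" is used.
    """
    if boundary_timestep is None or t >= boundary_timestep:
        return "transformer", guidance_high
    return "transformer_2", guidance_low


for t in (999.0, 875.0, 500.0):
    print(t, select_stage(t, 875.0, guidance_low=3.0, guidance_high=4.0))
```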
encode_prompt ¶
encode_prompt(
prompt: str | list[str],
negative_prompt: str | list[str] | None = None,
do_classifier_free_guidance: bool = True,
num_videos_per_prompt: int = 1,
max_sequence_length: int = 512,
device: device | None = None,
dtype: dtype | None = None,
)
forward ¶
forward(
req: OmniDiffusionRequest,
prompt: str | None = None,
negative_prompt: str | None = None,
height: int = 480,
width: int = 832,
num_inference_steps: int = 40,
guidance_scale: float | tuple[float, float] = 4.0,
frame_num: int = 81,
output_type: str | None = "np",
generator: Generator | list[Generator] | None = None,
prompt_embeds: Tensor | None = None,
negative_prompt_embeds: Tensor | None = None,
attention_kwargs: dict | None = None,
**kwargs,
) -> DiffusionOutput
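`guidance_scale` accepts either a single float or a pair of floats. One way to handle both forms is to normalize them into a pair before the denoising loop, as sketched below. The `(low, high)` ordering of the tuple is an assumption, and `split_guidance_scale` is a hypothetical helper rather than a function of this module.

```python
def split_guidance_scale(guidance_scale):
    """Normalize guidance_scale into a (low, high) pair of floats.

    Assumption: a tuple is ordered (low, high); a single float is applied
    to both stages.
    """
    if isinstance(guidance_scale, tuple):
        low, high = guidance_scale
        return float(low), float(high)
    return float(guidance_scale), float(guidance_scale)


print(split_guidance_scale(4.0))         # (4.0, 4.0)
print(split_guidance_scale((3.0, 4.0)))  # (3.0, 4.0)
```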
load_weights ¶
Load weights using AutoWeightsLoader for vLLM integration.
predict_noise ¶
predict_noise(
current_model: Module | None = None, **kwargs: Any
) -> Tensor | IntermediateTensors
Run a forward pass through the transformer to predict noise.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `current_model` | `Module \| None` | The transformer model to use (transformer or transformer_2) | `None` |
| `**kwargs` | `Any` | Arguments to pass to the transformer | `{}` |
Returns:
| Type | Description |
|---|---|
| `Tensor \| IntermediateTensors` | Predicted noise tensor, or IntermediateTensors on non-last pipeline-parallel (PP) stages. |
WanT2VDMD2Pipeline ¶
create_transformer_from_config ¶
create_transformer_from_config(
config: dict,
quant_config: QuantizationConfig | None = None,
prefix: str = "",
) -> WanTransformer3DModel
Create a WanTransformer3DModel from a config dict.
get_wan22_pre_process_func ¶
get_wan22_pre_process_func(od_config: OmniDiffusionConfig)
Pre-process function for Wan2.2: optionally loads and resizes the input image for I2V (image-to-video) mode.
load_transformer_config ¶
load_transformer_config(
model_path: str,
subfolder: str = "transformer",
local_files_only: bool = True,
) -> dict
Load the transformer config from a local model directory or the Hugging Face Hub.
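For the local-directory case, the lookup can be sketched as reading a JSON file under the given subfolder. This is a minimal sketch, not the function's implementation: the `config.json` filename is an assumption based on the usual Hugging Face layout, the Hub fallback is not covered, and `read_local_transformer_config` is a hypothetical name.

```python
import json
import tempfile
from pathlib import Path


def read_local_transformer_config(model_path, subfolder="transformer"):
    """Read <model_path>/<subfolder>/config.json and return it as a dict.

    Assumption: the config file is named config.json. Only local
    directories are handled here.
    """
    config_file = Path(model_path) / subfolder / "config.json"
    with config_file.open(encoding="utf-8") as f:
        return json.load(f)


# Usage with a throwaway directory that mimics a model folder.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "transformer"
    folder.mkdir()
    (folder / "config.json").write_text(json.dumps({"num_layers": 2}))
    loaded = read_local_transformer_config(tmp)

print(loaded)  # {'num_layers': 2}
```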
resolve_wan_flow_shift ¶
resolve_wan_flow_shift(
req: OmniDiffusionRequest,
od_config: OmniDiffusionConfig,
) -> float
resolve_wan_sample_solver ¶
resolve_wan_sample_solver(
req: OmniDiffusionRequest, default: str = "unipc"
) -> str