vllm_omni.diffusion.models.flux.pipeline_flux ¶
FluxDMD2Pipeline ¶
FluxPipeline ¶
Bases: Module, FluxPipelineMixin, CFGParallelMixin, DiffusionPipelineProfilerMixin
scheduler instance-attribute ¶
tokenizer instance-attribute ¶
tokenizer_2 instance-attribute ¶
tokenizer_max_length instance-attribute ¶
tokenizer_max_length = (
model_max_length
if hasattr(self, "tokenizer") and tokenizer is not None
else 77
)
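The fallback above can be sketched in isolation: use the tokenizer's `model_max_length` when a tokenizer is present, and 77 otherwise. This is only an illustration, assuming `model_max_length` is read from the pipeline's tokenizer; `DummyTokenizer`, `resolve_max_length`, `WithTokenizer`, and `WithoutTokenizer` are hypothetical names that are not part of `vllm_omni`.

```python
class DummyTokenizer:
    """Hypothetical stand-in that exposes only model_max_length."""

    def __init__(self, model_max_length):
        self.model_max_length = model_max_length


def resolve_max_length(pipeline, default=77):
    """Return the tokenizer's max length, or `default` if no tokenizer is set."""
    # getattr with a None default covers both "attribute missing"
    # and "attribute present but set to None".
    tokenizer = getattr(pipeline, "tokenizer", None)
    if tokenizer is not None:
        return tokenizer.model_max_length
    return default


class WithTokenizer:
    # Hypothetical pipeline-like object that has a tokenizer.
    tokenizer = DummyTokenizer(64)


class WithoutTokenizer:
    # Hypothetical pipeline-like object with no tokenizer attribute.
    pass
```

Calling `resolve_max_length(WithoutTokenizer())` falls back to 77, which matches the `else 77` branch shown above.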
transformer instance-attribute ¶
transformer = FluxTransformer2DModel(
**transformer_kwargs,
od_config=od_config,
quant_config=quantization_config,
)
vae_scale_factor instance-attribute ¶
weights_sources instance-attribute ¶
weights_sources = [
ComponentSource(
model_or_path=model,
subfolder="transformer",
revision=None,
prefix="transformer.",
fall_back_to_pt=True,
),
ComponentSource(
model_or_path=model,
subfolder="text_encoder_2",
revision=None,
prefix="text_encoder_2.",
fall_back_to_pt=True,
),
]
check_cfg_parallel_validity ¶
check_inputs ¶
check_inputs(
prompt,
prompt_2,
height,
width,
negative_prompt=None,
negative_prompt_2=None,
prompt_embeds=None,
negative_prompt_embeds=None,
pooled_prompt_embeds=None,
negative_pooled_prompt_embeds=None,
callback_on_step_end_tensor_inputs=None,
max_sequence_length=None,
)
diffuse ¶
diffuse(
prompt_embeds: Tensor,
pooled_prompt_embeds: Tensor,
negative_prompt_embeds: Tensor,
negative_pooled_prompt_embeds: Tensor,
latents: Tensor,
latent_image_ids: Tensor,
text_ids: Tensor,
negative_text_ids: Tensor,
timesteps: Tensor,
do_true_cfg: bool,
guidance: Tensor,
true_cfg_scale: float,
cfg_normalize: bool = False,
) -> Tensor
Diffusion loop with optional image conditioning.
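The `do_true_cfg`, `true_cfg_scale`, and `cfg_normalize` parameters suggest that each step combines a positive and a negative noise prediction. The sketch below shows one common way such a combination is written; it is an assumption for illustration, not the confirmed body of `diffuse`. The helper name `combine_true_cfg` is hypothetical, and plain Python lists stand in for tensors.

```python
import math


def combine_true_cfg(positive, negative, true_cfg_scale, cfg_normalize=False):
    """Hypothetical helper: classifier-free guidance on two predictions.

    Assumed formula: negative + scale * (positive - negative).
    Assumed normalization: rescale the result to the L2 norm of `positive`.
    """
    combined = [n + true_cfg_scale * (p - n) for p, n in zip(positive, negative)]
    if cfg_normalize:
        pos_norm = math.sqrt(sum(p * p for p in positive))
        comb_norm = math.sqrt(sum(c * c for c in combined))
        if comb_norm > 0:  # avoid division by zero for an all-zero prediction
            combined = [c * pos_norm / comb_norm for c in combined]
    return combined
```

Under this assumed formula, a `true_cfg_scale` of 1.0 (the default in `forward`) returns the positive prediction unchanged, so the negative branch only matters for larger scales.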
encode_prompt ¶
encode_prompt(
prompt: str | list[str],
prompt_2: str | list[str],
num_images_per_prompt: int = 1,
prompt_embeds: FloatTensor | None = None,
pooled_prompt_embeds: FloatTensor | None = None,
max_sequence_length: int = 512,
)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| prompt | `str` or `List[str]`, *optional* | prompt to be encoded | required |
| prompt_2 | `str` or `List[str]`, *optional* | The prompt or prompts to be sent to the | required |
| num_images_per_prompt | `int` | number of images that should be generated per prompt | 1 |
| prompt_embeds | `torch.FloatTensor`, *optional* | Pre-generated text embeddings. Can be used to easily tweak text inputs, e.g. prompt weighting. If not provided, text embeddings will be generated from | None |
| pooled_prompt_embeds | `torch.FloatTensor`, *optional* | Pre-generated pooled text embeddings. Can be used to easily tweak text inputs, e.g. prompt weighting. If not provided, pooled text embeddings will be generated from | None |
forward ¶
forward(
req: OmniDiffusionRequest,
prompt: str | list[str] | None = None,
prompt_2: str | list[str] | None = None,
negative_prompt: str | list[str] | None = None,
negative_prompt_2: str | list[str] | None = None,
true_cfg_scale: float = 1.0,
height: int | None = None,
width: int | None = None,
num_inference_steps: int = 28,
sigmas: list[float] | None = None,
guidance_scale: float = 3.5,
num_images_per_prompt: int = 1,
generator: Generator | list[Generator] | None = None,
latents: FloatTensor | None = None,
prompt_embeds: FloatTensor | None = None,
pooled_prompt_embeds: FloatTensor | None = None,
negative_prompt_embeds: FloatTensor | None = None,
negative_pooled_prompt_embeds: FloatTensor | None = None,
output_type: str | None = "pil",
return_dict: bool = True,
joint_attention_kwargs: dict[str, Any] | None = None,
callback_on_step_end_tensor_inputs: list[str] = ["latents"],
max_sequence_length: int = 512,
)
Forward pass for Flux.
prepare_latents ¶
prepare_latents(
batch_size,
num_channels_latents,
height,
width,
dtype,
device,
generator,
latents=None,
)
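The signature of `prepare_latents` takes pixel-space `height` and `width`, so the latent shape has to be derived from them. The sketch below shows how the upstream Diffusers Flux pipeline derives it; whether this class does exactly the same is an assumption. The helper name `latent_shapes` is hypothetical, and the default `vae_scale_factor` of 8 is an assumed value rather than one stated on this page.

```python
def latent_shapes(batch_size, num_channels_latents, height, width, vae_scale_factor=8):
    """Hypothetical helper: compute unpacked and packed latent shapes.

    Assumes the VAE downsamples by `vae_scale_factor` and that latents are
    packed into 2x2 patches, so each latent side must be even.
    """
    # Round each side down to a multiple of 2 in latent space.
    latent_h = 2 * (int(height) // (vae_scale_factor * 2))
    latent_w = 2 * (int(width) // (vae_scale_factor * 2))

    # Shape before packing: (batch, channels, height, width).
    unpacked = (batch_size, num_channels_latents, latent_h, latent_w)

    # Shape after packing 2x2 patches into a token sequence:
    # (batch, number of patches, channels * 4).
    packed = (
        batch_size,
        (latent_h // 2) * (latent_w // 2),
        num_channels_latents * 4,
    )
    return unpacked, packed
```

Under these assumptions, a 1024 by 1024 request with 16 latent channels gives an unpacked shape of `(1, 16, 128, 128)` and a packed sequence of 4096 tokens with 64 features each.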