vllm_omni.diffusion.models.flux2_klein.pipeline_flux2_klein ¶
Flux2ImageProcessor ¶
Bases: VaeImageProcessor
Image processor to preprocess the reference image for Flux2 klein.
Flux2KleinPipeline ¶
Bases: Module, CFGParallelMixin, SupportImageInput, DiffusionPipelineProfilerMixin
Flux2 klein pipeline for text-to-image generation.
image_processor instance-attribute ¶
image_processor = Flux2ImageProcessor(
vae_scale_factor=vae_scale_factor * 2
)
latent_channels instance-attribute ¶
latent_channels = (
latent_channels if hasattr(vae, "config") else 16
)
mask_processor instance-attribute ¶
mask_processor = VaeImageProcessor(
vae_scale_factor=vae_scale_factor * 2,
vae_latent_channels=latent_channels,
do_normalize=False,
do_binarize=True,
do_convert_grayscale=True,
)
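The `mask_processor` settings above amount to: convert the mask to a single channel, skip the [-1, 1] normalization, and threshold the values to hard 0/1. Both processors also work on dimensions that are multiples of `vae_scale_factor * 2`. The plain-Python sketch below illustrates these two ideas. It assumes the 0.5 threshold and round-down behaviour of the upstream diffusers `VaeImageProcessor`; `snap_to_multiple` and `binarize` are hypothetical helper names, and `vae_scale_factor = 8` is an assumed value, not one stated on this page.

```python
def snap_to_multiple(size: int, multiple: int) -> int:
    """Round a pixel dimension down to the nearest multiple."""
    return size - size % multiple


def binarize(mask: list[list[float]], threshold: float = 0.5) -> list[list[float]]:
    """Map grayscale mask values in [0, 1] to hard 0.0 / 1.0 values."""
    return [[1.0 if v >= threshold else 0.0 for v in row] for row in mask]


vae_scale_factor = 8  # assumed value for illustration
multiple = vae_scale_factor * 2  # matches the processor argument above

# A 1000x520 request is snapped down to dimensions divisible by 16.
snapped = (snap_to_multiple(1000, multiple), snap_to_multiple(520, multiple))

# Soft mask values become hard keep/replace decisions.
hard_mask = binarize([[0.1, 0.5], [0.49, 0.9]])
```

The real processors operate on PIL images and tensors; the lists here only stand in for a tiny mask.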
scheduler instance-attribute ¶
tokenizer instance-attribute ¶
transformer instance-attribute ¶
transformer = Flux2Transformer2DModel(
quant_config=quantization_config, **transformer_kwargs
)
vae_scale_factor instance-attribute ¶
weights_sources instance-attribute ¶
weights_sources = [
ComponentSource(
model_or_path=model,
subfolder="transformer",
revision=None,
prefix="transformer.",
fall_back_to_pt=True,
)
]
check_inputs ¶
check_inputs(
prompt,
height,
width,
prompt_embeds=None,
callback_on_step_end_tensor_inputs=None,
guidance_scale=None,
strength=None,
num_inference_steps=None,
)
encode_prompt ¶
encode_prompt(
prompt: str | list[str],
device: device | None = None,
num_images_per_prompt: int = 1,
prompt_embeds: Tensor | None = None,
max_sequence_length: int = 512,
text_encoder_out_layers: tuple[int, ...] = (9, 18, 27),
)
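This page does not say how the hidden states selected by `text_encoder_out_layers` are combined. One plausible reading, and only an assumption here, is that the per-token features of the chosen layers are concatenated along the feature dimension. The sketch below shows that idea on nested lists; `combine_layers` is a hypothetical helper, not part of the pipeline.

```python
def combine_layers(hidden_states, layers=(9, 18, 27)):
    """Concatenate per-token features from the selected layers.

    hidden_states[k][t] is the feature vector of token t at layer k.
    """
    num_tokens = len(hidden_states[layers[0]])
    return [
        [x for k in layers for x in hidden_states[k][t]]
        for t in range(num_tokens)
    ]


# Toy encoder output: 28 layers, 2 tokens, 3 features per token.
# Each feature equals its layer index so the source layer stays visible.
toy_hidden_states = [[[float(k)] * 3 for _ in range(2)] for k in range(28)]

embeds = combine_layers(toy_hidden_states)
```

Under this assumption the embedding width is the encoder hidden size times the number of selected layers.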
forward ¶
forward(
req: OmniDiffusionRequest,
image: Image | list[Image] | None = None,
reference_image: Image | list[Image] | None = None,
mask_image: Image | list[Image] | None = None,
prompt: str | list[str] | None = None,
height: int | None = None,
width: int | None = None,
num_inference_steps: int = 50,
sigmas: list[float] | None = None,
strength: float = 1.0,
guidance_scale: float | None = 4.0,
num_images_per_prompt: int = 1,
generator: Generator | list[Generator] | None = None,
latents: Tensor | None = None,
prompt_embeds: Tensor | None = None,
negative_prompt_embeds: Tensor | None = None,
output_type: str | None = "pil",
return_dict: bool = True,
attention_kwargs: dict[str, Any] | None = None,
callback_on_step_end: Callable[[int, int, dict], None]
| None = None,
callback_on_step_end_tensor_inputs: list[str] = [
"latents"
],
max_sequence_length: int = 512,
text_encoder_out_layers: tuple[int, ...] = (9, 18, 27),
padding_mask_crop: int | None = None,
) -> DiffusionOutput
Function invoked when calling the pipeline for generation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| image | `torch.Tensor`, `PIL.Image.Image`, `np.ndarray`, or list of these | | None |
| prompt | `str` or `List[str]`, *optional* | The prompt or prompts to guide the image generation. If not defined, one has to pass | None |
| guidance_scale | `float`, *optional*, defaults to 4.0 | Guidance scale as defined in Classifier-Free Diffusion Guidance. | 4.0 |
| height | `int`, *optional*, defaults to self.unet.config.sample_size * self.vae_scale_factor | The height in pixels of the generated image. This is set to 1024 by default for the best results. | None |
| width | `int`, *optional*, defaults to self.unet.config.sample_size * self.vae_scale_factor | The width in pixels of the generated image. This is set to 1024 by default for the best results. | None |
| num_inference_steps | `int`, *optional*, defaults to 50 | The number of denoising steps. More denoising steps usually lead to a higher-quality image at the expense of slower inference. | 50 |
| sigmas | `List[float]`, *optional* | Custom sigmas to use for the denoising process with schedulers which support a | None |
| num_images_per_prompt | `int`, *optional*, defaults to 1 | The number of images to generate per prompt. | 1 |
| generator | `torch.Generator` or `List[torch.Generator]`, *optional* | One or a list of torch generator(s) to make generation deterministic. | None |
| latents | `torch.Tensor`, *optional* | Pre-generated noisy latents, sampled from a Gaussian distribution, to be used as inputs for image generation. Can be used to tweak the same generation with different prompts. If not provided, a latents tensor will be generated by sampling using the supplied random | None |
| prompt_embeds | `torch.Tensor`, *optional* | Pre-generated text embeddings. Can be used to easily tweak text inputs, e.g. prompt weighting. If not provided, text embeddings will be generated from | None |
| negative_prompt_embeds | `torch.Tensor`, *optional* | Pre-generated negative text embeddings. Note that "" is used as the negative prompt in this pipeline. If not provided, they will be generated from "". | None |
| output_type | `str`, *optional*, defaults to `"pil"` | The output format of the generated image. Choose between PIL: | 'pil' |
| return_dict | `bool`, *optional*, defaults to `True` | Whether or not to return a [ | True |
| attention_kwargs | `dict`, *optional* | A kwargs dictionary that, if specified, is passed along to the | None |
| callback_on_step_end | `Callable`, *optional* | A function called at the end of each denoising step during inference. The function is called with the following arguments: | None |
| callback_on_step_end_tensor_inputs | `List`, *optional* | The list of tensor inputs for the | ['latents'] |
| max_sequence_length | `int`, defaults to 512 | Maximum sequence length to use with the | 512 |
| text_encoder_out_layers | `Tuple[int]` | Layer indices to use in the | (9, 18, 27) |
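The `guidance_scale` entry refers to classifier-free guidance. In its standard formulation, the guided prediction is the unconditional prediction pushed towards the conditional one by the scale factor. The sketch below shows that formula on plain lists. Whether this pipeline applies exactly this form is an assumption, and `apply_cfg` is a hypothetical helper name.

```python
def apply_cfg(cond: list[float], uncond: list[float], guidance_scale: float) -> list[float]:
    """Standard classifier-free guidance: uncond + scale * (cond - uncond)."""
    return [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]


# Toy predictions for the prompt (cond) and the empty negative prompt (uncond).
cond_pred = [1.0, 2.0]
uncond_pred = [0.5, 1.0]

guided = apply_cfg(cond_pred, uncond_pred, guidance_scale=4.0)

# A scale of 1.0 reduces to the conditional prediction alone.
unguided = apply_cfg(cond_pred, uncond_pred, guidance_scale=1.0)
```

Larger scales follow the prompt more closely, at the cost of an extra unconditional prediction per step.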
Returns:
| Type | Description |
|---|---|
| DiffusionOutput | The generated images. |
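The `callback_on_step_end` annotation in the signature is `Callable[[int, int, dict], None]`. The sketch below shows a callback of that shape. Reading the three arguments as step index, timestep, and a dict of the tensors named in `callback_on_step_end_tensor_inputs` is an assumption, and `fake_denoising_loop` is a hypothetical stand-in for the pipeline's loop, used only to exercise the callback.

```python
from typing import Any

seen_steps: list[tuple[int, int, list[str]]] = []


def log_step(step: int, timestep: int, callback_kwargs: dict[str, Any]) -> None:
    """Record which step ran and which tensors were handed to the callback."""
    seen_steps.append((step, timestep, sorted(callback_kwargs)))


def fake_denoising_loop(timesteps, callback, tensor_inputs=("latents",)):
    """Hypothetical loop for illustration; not the pipeline's implementation."""
    latents = [0.0]
    for i, t in enumerate(timesteps):
        latents = [x + 1.0 for x in latents]  # placeholder for a denoising update
        callback(i, t, {name: latents for name in tensor_inputs})


fake_denoising_loop([999, 500, 1], log_step)
```

With the real pipeline, a function like `log_step` would be passed as `callback_on_step_end=log_step`.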
prepare_image_latents ¶
prepare_image_latents(
images: list[Tensor],
batch_size,
generator: Generator,
device,
dtype,
)
prepare_latents ¶
prepare_latents(
batch_size,
num_latents_channels,
height,
width,
dtype,
device,
generator: Generator,
latents: Tensor | None = None,
)
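The shape of the latents that `prepare_latents` produces is not documented here. As a rough sketch, assuming the convention of the upstream diffusers Flux2 pipelines (spatial size divided by `vae_scale_factor * 2`, channels multiplied by 4 through 2x2 patch packing), the shape could be derived as follows. `latent_shape` is a hypothetical helper, and the default `vae_scale_factor=8` is an assumed value.

```python
def latent_shape(
    batch_size: int,
    num_latents_channels: int,
    height: int,
    width: int,
    vae_scale_factor: int = 8,  # assumed value
) -> tuple[int, int, int, int]:
    """Latent tensor shape under the assumed 2x2 patch-packing convention."""
    # Latent height/width are forced to be even so 2x2 patches tile exactly.
    h = 2 * (int(height) // (vae_scale_factor * 2))
    w = 2 * (int(width) // (vae_scale_factor * 2))
    # Packing folds each 2x2 patch into the channel dimension.
    return (batch_size, num_latents_channels * 4, h // 2, w // 2)


shape = latent_shape(batch_size=1, num_latents_channels=32, height=1024, width=768)
```

The channel count of 32 is only an example input; the pipeline passes its own `num_latents_channels`.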
prepare_mask_latents ¶
prepare_mask_latents(
mask,
masked_image,
batch_size,
num_channels_latents,
num_images_per_prompt,
height,
width,
dtype,
device,
generator,
)
get_flux2_klein_post_process_func ¶
get_flux2_klein_post_process_func(
od_config: OmniDiffusionConfig,
)