vllm_omni.diffusion.models.flux2.pipeline_flux2 ¶
Flux2ImageProcessor ¶
Bases: VaeImageProcessor
Image processor to preprocess the reference image for Flux2.
Flux2Pipeline ¶
Bases: Module, CFGParallelMixin, SupportImageInput, ProgressBarMixin, DiffusionPipelineProfilerMixin
Flux2 pipeline for text-to-image generation.
image_processor instance-attribute ¶
image_processor = Flux2ImageProcessor(
vae_scale_factor=vae_scale_factor * 2
)
scheduler instance-attribute ¶
system_message_upsampling_i2i instance-attribute ¶
system_message_upsampling_t2i instance-attribute ¶
tokenizer instance-attribute ¶
transformer instance-attribute ¶
transformer = Flux2Transformer2DModel(
quant_config=quantization_config,
od_config=od_config,
**transformer_kwargs,
)
upsampling_max_image_size instance-attribute ¶
vae_scale_factor instance-attribute ¶
weights_sources instance-attribute ¶
weights_sources = [
ComponentSource(
model_or_path=model,
subfolder="transformer",
revision=None,
prefix="transformer.",
fall_back_to_pt=True,
)
]
check_cfg_parallel_validity ¶
check_inputs ¶
encode_prompt ¶
encode_prompt(
prompt: str | list[str],
device: device | None = None,
num_images_per_prompt: int = 1,
prompt_embeds: Tensor | None = None,
max_sequence_length: int = 512,
text_encoder_out_layers: tuple[int, ...] = (10, 20, 30),
)
forward ¶
forward(
req: OmniDiffusionRequest,
image: Image | list[Image] | None = None,
prompt: str | list[str] | None = None,
height: int | None = None,
width: int | None = None,
num_inference_steps: int = 50,
sigmas: list[float] | None = None,
guidance_scale: float | None = 4.0,
num_images_per_prompt: int = 1,
generator: Generator | list[Generator] | None = None,
latents: Tensor | None = None,
prompt_embeds: Tensor | None = None,
negative_prompt_embeds: Tensor | None = None,
output_type: str | None = "pil",
return_dict: bool = True,
attention_kwargs: dict[str, Any] | None = None,
callback_on_step_end: Callable[[int, int, dict], None]
| None = None,
callback_on_step_end_tensor_inputs: list[str] = [
"latents"
],
max_sequence_length: int = 512,
text_encoder_out_layers: tuple[int, ...] = (10, 20, 30),
caption_upsample_temperature: float = None,
) -> DiffusionOutput
prepare_image_latents ¶
prepare_image_latents(
images: list[Tensor],
batch_size,
generator: Generator,
device,
dtype,
)
prepare_latents ¶
prepare_latents(
batch_size,
num_latents_channels,
height,
width,
dtype,
device,
generator: Generator,
latents: Tensor | None = None,
)
format_input ¶
format_input(
prompts: list[str],
system_message: str = SYSTEM_MESSAGE,
images: list[Image] | list[list[Image]] = None,
) -> list[list[dict[str, Any]]]
Format a batch of text prompts into the conversation format expected by apply_chat_template. Optionally, add images to the input.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
prompts | list[str] | List of text prompts | required |
system_message | str | System message to use (default: CREATIVE_SYSTEM_MESSAGE) | SYSTEM_MESSAGE |
images | optional | List of images to add to the input. | None |
Returns:
| Type | Description |
|---|---|
list[list[dict[str, Any]]] |
|
retrieve_latents ¶
retrieve_latents(
encoder_output: Tensor,
generator: Generator = None,
sample_mode: str = "sample",
)
retrieve_timesteps ¶
retrieve_timesteps(
scheduler,
num_inference_steps: int | None = None,
device: str | device | None = None,
timesteps: list[int] | None = None,
sigmas: list[float] | None = None,
**kwargs,
) -> tuple[Tensor, int]
Calls the scheduler's set_timesteps method and retrieves timesteps from the scheduler after the call. Handles custom timesteps. Any kwargs will be supplied to scheduler.set_timesteps.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
scheduler | `SchedulerMixin` | The scheduler to get timesteps from. | required |
num_inference_steps | `int` | The number of diffusion steps used when generating samples with a pre-trained model. If used, | None |
device | `str` or `torch.device`, *optional* | The device to which the timesteps should be moved to. If | None |
timesteps | `List[int]`, *optional* | Custom timesteps used to override the timestep spacing strategy of the scheduler. If | None |
sigmas | `List[float]`, *optional* | Custom sigmas used to override the timestep spacing strategy of the scheduler. If | None |
Returns:
| Type | Description |
|---|---|
Tensor |
|
int | second element is the number of inference steps. |