vllm_omni.diffusion.model_loader.gguf_adapters.base ¶

GGUFAdapter ¶

Bases: ABC

Base class for model-specific GGUF adapters.

gguf_file `instance-attribute` ¶

gguf_file = gguf_file

model `instance-attribute` ¶

model = model

od_config `instance-attribute` ¶

od_config = od_config

source `instance-attribute` ¶

source = source

is_compatible `staticmethod` ¶

is_compatible(
    od_config: OmniDiffusionConfig,
    model: Module,
    source: ComponentSource,
) -> bool

weights_iterator `abstractmethod` ¶

weights_iterator() -> Generator[
    tuple[str, Tensor], None, None
]
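As a rough illustration of how a model-specific adapter might fill in this interface, the sketch below uses a stand-in base class that mirrors the documented attributes and method signatures. `GGUFAdapterSketch`, `ToyAdapter`, the constructor argument order, the default return value of `is_compatible`, and the name remapping are all hypothetical. Plain Python lists stand in for tensors so the example runs without torch or vllm_omni.

```python
from abc import ABC, abstractmethod
from typing import Any, Generator


class GGUFAdapterSketch(ABC):
    """Stand-in for the documented base class (not the real implementation)."""

    def __init__(self, gguf_file: str, model: Any, source: Any, od_config: Any):
        # Argument order is an assumption; the attributes match the docs.
        self.gguf_file = gguf_file
        self.model = model
        self.source = source
        self.od_config = od_config

    @staticmethod
    def is_compatible(od_config: Any, model: Any, source: Any) -> bool:
        # Assumed default: a base adapter claims no model.
        return False

    @abstractmethod
    def weights_iterator(self) -> Generator[tuple[str, Any], None, None]:
        ...


class ToyAdapter(GGUFAdapterSketch):
    """Hypothetical adapter that renames GGUF keys to model parameter names."""

    # In-memory stand-in for the tensors stored in a GGUF file.
    _fake_gguf = {"blk.0.attn.weight": [1.0, 2.0], "blk.0.attn.bias": [0.5]}

    @staticmethod
    def is_compatible(od_config: Any, model: Any, source: Any) -> bool:
        return od_config.get("model_type") == "toy"

    def weights_iterator(self) -> Generator[tuple[str, Any], None, None]:
        for gguf_name, data in self._fake_gguf.items():
            # Hypothetical mapping from GGUF naming to the model's naming.
            yield gguf_name.replace("blk.", "transformer_blocks."), data


config = {"model_type": "toy"}
if ToyAdapter.is_compatible(config, model=None, source=None):
    adapter = ToyAdapter("toy.gguf", model=None, source=None, od_config=config)
    loaded = dict(adapter.weights_iterator())
```

Because `weights_iterator` is abstract, the stand-in base class cannot be instantiated directly; only a subclass that implements it can.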

MappedTensor `dataclass` ¶

name `instance-attribute` ¶

name: str

row_slice `class-attribute` `instance-attribute` ¶

row_slice: slice | None = None

swap_scale_shift `class-attribute` `instance-attribute` ¶

swap_scale_shift: bool = False

tensor `instance-attribute` ¶

tensor: Any

tensor_type `instance-attribute` ¶

tensor_type: Any
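The fields above suggest a record that pairs a target parameter name with a source tensor and two optional transformations. The sketch below is one possible reading, not the library's implementation: `MappedTensorSketch` and `resolve_rows` are hypothetical names, the field order is chosen only so the defaults come last, and the meaning assigned to `swap_scale_shift` (swapping the two halves along the first axis) and the order of the two steps are assumptions. Lists stand in for tensors.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class MappedTensorSketch:
    """Stand-in mirroring the documented fields (defaults placed last)."""

    name: str
    tensor: Any
    tensor_type: Any
    row_slice: Optional[slice] = None
    swap_scale_shift: bool = False


def resolve_rows(mapped: MappedTensorSketch) -> list:
    """Apply the optional row slice, then the optional half swap (assumed order)."""
    rows = list(mapped.tensor)
    if mapped.row_slice is not None:
        # Keep only the rows that belong to this target parameter.
        rows = rows[mapped.row_slice]
    if mapped.swap_scale_shift:
        # Assumed semantics: exchange the first and second halves.
        half = len(rows) // 2
        rows = rows[half:] + rows[:half]
    return rows


sliced = resolve_rows(
    MappedTensorSketch("q_proj.weight", [1, 2, 3, 4], "F32", row_slice=slice(0, 2))
)
swapped = resolve_rows(
    MappedTensorSketch("norm.weight", [1, 2, 3, 4], "F32", swap_scale_shift=True)
)
```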

gguf_quant_weights_iterator ¶

gguf_quant_weights_iterator(
    gguf_file: str,
) -> Generator[tuple[str, Tensor]]

Iterate over the quantized weights in the model's GGUF file and convert them to torch tensors.

The yield order matters: all weight types must be yielded before any weight data. Otherwise, loading can go wrong for packed layers whose sub-weights use different quantization types.
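The ordering constraint described above can be sketched as a two-pass generator. This is a minimal illustration under stated assumptions, not the function's actual body: `two_pass_iterator` is a hypothetical name, the `qweight_type` and `qweight` suffixes and the set of unquantized type names are assumptions, and bytes stand in for tensor data.

```python
from typing import Any, Generator, Iterable

# Assumed set of type names that are treated as unquantized.
UNQUANTIZED = ("F32", "F16")


def two_pass_iterator(
    tensors: Iterable[tuple[str, str, Any]],
) -> Generator[tuple[str, Any], None, None]:
    """Yield every quant-type entry before any weight-data entry."""
    entries = list(tensors)  # materialize so the input can be walked twice

    # Pass 1: announce the quantization type of each quantized weight.
    for name, quant_type, _ in entries:
        if quant_type not in UNQUANTIZED:
            yield name.replace("weight", "qweight_type"), quant_type

    # Pass 2: yield the weight data itself.
    for name, quant_type, data in entries:
        if quant_type not in UNQUANTIZED:
            name = name.replace("weight", "qweight")
        yield name, data


order = [
    name
    for name, _ in two_pass_iterator(
        [
            ("a.weight", "Q4_K", b"x"),
            ("b.weight", "F32", b"y"),
            ("c.weight", "Q8_0", b"z"),
        ]
    )
]
```

With this layout, a loader that packs `a` and `c` into one layer has seen both quantization types before it receives the first block of weight data.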
