vllm_omni.diffusion.models.lance.wan_vae ¶
Wan2.2 VAE used by Lance, ported from upstream so Wan2.2_VAE.pth loads natively without state-dict surgery.
AttentionBlock ¶
AvgDown3D ¶
Bases: Module
CausalConv3d ¶
Decoder3d ¶
Bases: Module
head instance-attribute ¶
head = Sequential(
    RMS_norm(out_dim, images=False),
    SiLU(),
    CausalConv3d(out_dim, 12, 3, padding=1),
)
middle instance-attribute ¶
middle = Sequential(
    ResidualBlock(dims[0], dims[0], dropout),
    AttentionBlock(dims[0]),
    ResidualBlock(dims[0], dims[0], dropout),
)
Down_ResidualBlock ¶
DupUp3D ¶
Bases: Module
Encoder3d ¶
Bases: Module
head instance-attribute ¶
head = Sequential(
    RMS_norm(out_dim, images=False),
    SiLU(),
    CausalConv3d(out_dim, z_dim, 3, padding=1),
)
middle instance-attribute ¶
middle = Sequential(
    ResidualBlock(out_dim, out_dim, dropout),
    AttentionBlock(out_dim),
    ResidualBlock(out_dim, out_dim, dropout),
)
LanceWanVAE ¶
Bases: Module
Wan2.2 VAE wrapped for BAGEL's pipeline.
Exposes BAGEL's image-VAE surface — encode(BCHW) -> BC_zHW and decode(BC_zHW) -> BCHW — by treating each image as a 1-frame video clip. A 5-D encode_video/decode_video path is also provided for the Lance_3B_Video checkpoint.
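The 1-frame-clip trick described above can be sketched as follows. This is a minimal illustration, not the adapter's actual code: it assumes the 5-D layout is (B, C, T, H, W), and `toy_encoder`, `toy_decoder`, and their channel and stride values are hypothetical stand-ins rather than the real Wan2.2 configuration.

```python
import torch
from torch import nn


def encode_image(encode_video, image: torch.Tensor) -> torch.Tensor:
    """Encode a (B, C, H, W) image batch through a 5-D video encoder."""
    if image.dim() != 4:
        raise ValueError(f"expected BCHW input, got {image.dim()} dims")
    clip = image.unsqueeze(2)        # (B, C, 1, H, W): a 1-frame clip
    latent = encode_video(clip)      # (B, C_z, 1, h, w)
    return latent.squeeze(2)         # (B, C_z, h, w)


def decode_image(decode_video, latent: torch.Tensor) -> torch.Tensor:
    """Decode a (B, C_z, h, w) latent batch through a 5-D video decoder."""
    clip = decode_video(latent.unsqueeze(2))  # (B, C, 1, H, W)
    return clip.squeeze(2)                    # (B, C, H, W)


# Hypothetical stand-ins for the video encode/decode paths (assumed shapes only).
toy_encoder = nn.Conv3d(3, 16, kernel_size=(1, 8, 8), stride=(1, 8, 8))
toy_decoder = nn.ConvTranspose3d(16, 3, kernel_size=(1, 8, 8), stride=(1, 8, 8))

image = torch.randn(2, 3, 64, 64)
with torch.no_grad():
    latent = encode_image(toy_encoder, image)
    recon = decode_image(toy_decoder, latent)

print(tuple(latent.shape))  # (2, 16, 8, 8)
print(tuple(recon.shape))   # (2, 3, 64, 64)
```

Inserting and removing the singleton time axis keeps the image-facing interface 4-D while reusing one set of video weights for both paths.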
Construction is lazy: the heavy WanVAE_ module is not built, and Wan2.2_VAE.pth is not loaded, until first use. Once built, the inner module is registered as a submodule so that self.parameters(), self.to(device) and vae_dtype = next(vae.parameters()).dtype (used by BAGEL's decode path) all behave as expected.
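The lazy-construction pattern can be illustrated with a small stand-alone sketch. `LazyAdapter` and its `factory` argument are hypothetical names, and `nn.Linear` stands in for the real VAE; the only point is that assigning an `nn.Module` as an attribute registers it as a submodule, which is what makes `parameters()` and `to()` see it afterwards.

```python
import torch
from torch import nn


class LazyAdapter(nn.Module):
    """Hypothetical adapter that builds its inner module on first use."""

    def __init__(self, factory):
        super().__init__()
        self._factory = factory  # plain callable, not registered as a submodule
        self._built = False

    def _ensure_built(self) -> nn.Module:
        if not self._built:
            # Assigning an nn.Module attribute registers it as a submodule.
            self.inner = self._factory()
            self._built = True
        return self.inner

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self._ensure_built()(x)


adapter = LazyAdapter(lambda: nn.Linear(4, 2))
params_before = len(list(adapter.parameters()))  # nothing built yet
adapter(torch.zeros(1, 4))                       # first use triggers construction
params_after = len(list(adapter.parameters()))   # weight and bias are now visible
adapter.to(torch.bfloat16)                       # the cast reaches the inner module
vae_dtype = next(adapter.parameters()).dtype
```

In this sketch, calling `next(adapter.parameters())` before the first use would raise `StopIteration`, because no parameters exist until the inner module has been built.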
RMS_norm ¶
Resample ¶
Bases: Module
resample instance-attribute ¶
resample = Sequential(
    Upsample(scale_factor=(2.0, 2.0), mode="nearest-exact"),
    Conv2d(dim, dim, 3, padding=1),
)
time_conv instance-attribute ¶
time_conv = CausalConv3d(
    dim, dim * 2, (3, 1, 1), padding=(1, 0, 0)
)
ResidualBlock ¶
Bases: Module
residual instance-attribute ¶
residual = Sequential(
    RMS_norm(in_dim, images=False),
    SiLU(),
    CausalConv3d(in_dim, out_dim, 3, padding=1),
    RMS_norm(out_dim, images=False),
    SiLU(),
    Dropout(dropout),
    CausalConv3d(out_dim, out_dim, 3, padding=1),
)
shortcut instance-attribute ¶
shortcut = (
    CausalConv3d(in_dim, out_dim, 1)
    if in_dim != out_dim
    else Identity()
)
Up_ResidualBlock ¶
WanVAE_ ¶
Bases: Module
Upstream Wan2.2 VAE module: an Encoder3d/Decoder3d pair with 2x patchify on input. State-dict-compatible with Wan2.2_VAE.pth.
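A 2x patchify folds each 2x2 spatial patch into the channel axis, which is consistent with the 12 output channels of the Decoder3d head (3 image channels times 2 times 2). The sketch below is an illustration under assumptions, not the module's code: it assumes a (B, C, T, H, W) layout, and the ordering of channels within a patch is an assumption that may differ from upstream.

```python
import torch


def patchify(x: torch.Tensor, patch_size: int = 2) -> torch.Tensor:
    """(B, C, T, H, W) -> (B, C*p*p, T, H/p, W/p)."""
    b, c, t, height, width = x.shape
    p = patch_size
    if height % p or width % p:
        raise ValueError("H and W must be divisible by patch_size")
    x = x.reshape(b, c, t, height // p, p, width // p, p)
    # Move the two in-patch axes next to the channel axis (assumed ordering).
    x = x.permute(0, 1, 6, 4, 2, 3, 5)
    return x.reshape(b, c * p * p, t, height // p, width // p)


def unpatchify(x: torch.Tensor, patch_size: int = 2) -> torch.Tensor:
    """Inverse of patchify: (B, C*p*p, T, h, w) -> (B, C, T, h*p, w*p)."""
    b, cpp, t, h, w = x.shape
    p = patch_size
    c = cpp // (p * p)
    x = x.reshape(b, c, p, p, t, h, w)
    # Undo the permutation used in patchify.
    x = x.permute(0, 1, 4, 5, 3, 6, 2)
    return x.reshape(b, c, t, h * p, w * p)


frames = torch.randn(1, 3, 1, 8, 8)   # one RGB frame, 8x8 pixels
patched = patchify(frames)
restored = unpatchify(patched)

print(tuple(patched.shape))           # (1, 12, 1, 4, 4)
print(torch.equal(restored, frames))  # True
```

Because the operation is a pure rearrangement, the round trip is lossless; it trades spatial resolution for channel depth before the encoder and restores it after the decoder.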
build_wan22_vae ¶
build_wan22_vae(
    vae_path: str, dtype: dtype = bfloat16, device=None
) -> LanceWanVAE
Convenience factory that lazily constructs a LanceWanVAE adapter.