vllm_gaudi.ops

Modules:

| Name | Description |
| --- | --- |
| causal_conv1d_pytorch | PyTorch reference implementation for the causal conv1d kernels. |
| granite_causal_conv1d | Granite 4.0 specific causal conv1d implementation. |
| hpu_attention | |
| hpu_awq | |
| hpu_compressed_tensors | |
| hpu_conv | |
| hpu_fp8 | |
| hpu_fused_moe | |
| hpu_gdn_pytorch | HPU-native PyTorch implementations for Qwen3.5 GDN ops. |
| hpu_gptq | |
| hpu_grouped_topk_router | |
| hpu_layernorm | |
| hpu_lora | |
| hpu_mamba_mixer2 | |
| hpu_mm_encoder_attention | |
| hpu_modelopt | |
| hpu_rotary_embedding | |
| hpu_row_parallel_linear | |
| hpu_weights | |
| ops_selector | Selector module to switch between PyTorch and Triton implementations. |
| pytorch_implementation | |
| ssd_combined | |
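
For readers unfamiliar with the operation behind the causal conv1d modules listed above, the sketch below illustrates what a causal 1-D convolution computes. It is not code from `vllm_gaudi.ops`: the function name `causal_conv1d_ref` is hypothetical, it uses plain Python lists rather than PyTorch tensors, and it assumes the common depthwise convention in which each channel has its own filter and the input is left-padded with zeros so that output position `t` depends only on inputs at positions `<= t`. Activation functions and cached convolution state, which real kernels usually handle, are omitted.

```python
def causal_conv1d_ref(x, weight, bias=None):
    """Illustrative depthwise causal conv1d (hypothetical helper, not the vllm_gaudi API).

    x:      list of channels, each a list of seq_len floats
    weight: list of channels, each a list of kernel_width filter taps
    bias:   optional list with one float per channel
    """
    out = []
    for c, (row, taps) in enumerate(zip(x, weight)):
        width = len(taps)
        # Left-pad with zeros so no output position can see future inputs.
        padded = [0.0] * (width - 1) + list(row)
        b = bias[c] if bias is not None else 0.0
        out.append([
            # The last tap multiplies the current input, earlier taps the past.
            b + sum(taps[k] * padded[t + k] for k in range(width))
            for t in range(len(row))
        ])
    return out


x = [[1.0, 2.0, 3.0, 4.0]]   # one channel, four time steps
weight = [[0.5, 1.0]]        # kernel width 2
print(causal_conv1d_ref(x, weight))  # [[1.0, 2.5, 4.0, 5.5]]
```

Changing the last input value leaves every earlier output unchanged, which is the property that makes the convolution causal.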