vllm_gaudi.ops
Modules:
| Name | Description |
|---|---|
| causal_conv1d_pytorch | PyTorch reference implementation for the causal conv1d kernels. |
| granite_causal_conv1d | Granite 4.0 specific causal conv1d implementation. |
| hpu_attention | |
| hpu_awq | |
| hpu_compressed_tensors | |
| hpu_conv | |
| hpu_fp8 | |
| hpu_fused_moe | |
| hpu_gdn_pytorch | HPU-native PyTorch implementations for Qwen3.5 GDN ops. |
| hpu_gptq | |
| hpu_grouped_topk_router | |
| hpu_layernorm | |
| hpu_lora | |
| hpu_mamba_mixer2 | |
| hpu_mm_encoder_attention | |
| hpu_modelopt | |
| hpu_rotary_embedding | |
| hpu_row_parallel_linear | |
| hpu_weights | |
| ops_selector | Selector module to switch between PyTorch and Triton implementations |
| pytorch_implementation | |
| ssd_combined | |
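
For readers unfamiliar with the term used by `causal_conv1d_pytorch` and `granite_causal_conv1d`, the following is a minimal sketch of what a causal (left-padded, depthwise) 1D convolution computes. It is written in plain Python for illustration only and is not the code in either module. The function name `causal_conv1d_ref`, its arguments, and the list-of-lists data layout are all assumptions made for this example.

```python
def causal_conv1d_ref(x, weight, bias=None):
    """Illustrative causal depthwise conv1d (hypothetical helper, not the vllm_gaudi API).

    x:      list of channels, each a list of `seqlen` numbers
    weight: list of channels, each a list of `width` filter taps
    bias:   optional list with one number per channel
    """
    out = []
    for c, (channel, taps) in enumerate(zip(x, weight)):
        width = len(taps)
        # Left-pad with width - 1 zeros so output[t] only sees inputs at positions <= t.
        padded = [0.0] * (width - 1) + list(channel)
        b = bias[c] if bias is not None else 0.0
        out.append([
            b + sum(taps[k] * padded[t + k] for k in range(width))
            for t in range(len(channel))
        ])
    return out


if __name__ == "__main__":
    # One channel, filter of width 3 with all-ones taps: a running sum over the last 3 inputs.
    print(causal_conv1d_ref([[1, 2, 3, 4]], [[1, 1, 1]]))
```

Because of the left padding, changing a later input never alters an earlier output, which is the property that makes the operation usable for autoregressive decoding.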