vllm_gaudi.ops.hpu_lora
HPULogitsProcessorWithLoRA
Bases: LogitsProcessorWithLoRA
Source code in vllm_gaudi/ops/hpu_lora.py
_get_logits
_get_logits(
    hidden_states: Tensor,
    lm_head: VocabParallelEmbedding,
    embedding_bias: Optional[Tensor] = None,
) -> Optional[Tensor]
HPUVocabParallelEmbeddingWithLoRA
Bases: VocabParallelEmbeddingWithLoRA