vllm_gaudi.utils
HPUCompileConfig
Configuration class that holds the arguments passed to torch.compile with the HPU backend.
Source code in vllm_gaudi/utils.py
dynamic
instance-attribute
fullgraph
instance-attribute
__init__
Allows overriding the environment variables in corner cases where a single function is compiled with the torch.compile decorator. Environment variables should not be overridden when the whole model is compiled.
get_compile_args
Returns a dictionary of compile arguments that can be passed to torch.compile, whether it is called as a function or used as a decorator.
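As a rough illustration of the pattern described here, the sketch below shows a configuration object whose explicit constructor arguments take precedence over environment variables, and whose get_compile_args result can be unpacked into torch.compile. The class name CompileConfigSketch and the environment variable names EXAMPLE_FULLGRAPH and EXAMPLE_DYNAMIC are hypothetical; the actual HPUCompileConfig may read different variables and return additional keys.

```python
import os
from typing import Optional


class CompileConfigSketch:
    """Illustrative stand-in for HPUCompileConfig; not the real class."""

    def __init__(self, fullgraph: Optional[bool] = None,
                 dynamic: Optional[bool] = None) -> None:
        # Explicit arguments win; otherwise fall back to environment
        # variables (the variable names here are hypothetical).
        self.fullgraph = (fullgraph if fullgraph is not None
                          else os.environ.get("EXAMPLE_FULLGRAPH", "0") == "1")
        self.dynamic = (dynamic if dynamic is not None
                        else os.environ.get("EXAMPLE_DYNAMIC", "0") == "1")

    def get_compile_args(self) -> dict:
        # Both keys are keyword arguments accepted by torch.compile.
        return {"fullgraph": self.fullgraph, "dynamic": self.dynamic}
```

With such an object, a single function could be compiled as `torch.compile(fn, **CompileConfigSketch(fullgraph=True).get_compile_args())`, leaving the environment-driven defaults in place for whole-model compilation.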
_hpu_get_by_source
async_h2d_copy
Asynchronously transfer data from host to device.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source | | CPU tensor or raw data to transfer | required |
| dest_tensor | | Optional pre-allocated destination tensor | None |
| dtype | | Required if source is raw data | None |
| device | | Target device | 'hpu' |
Returns:
| Type | Description |
|---|---|
| | torch.Tensor on target device |
async_h2d_update
Asynchronously update specific rows of a device tensor from a CPU tensor.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source | Tensor | CPU tensor with data to copy | required |
| dest | Tensor | Device tensor to update | required |
| indices | list[int] | List of row indices in dest to update | required |
| device | | Target device | 'hpu' |
getattr_nested
Like built-in getattr but supports dot-separated nested attributes.
Examples:
getattr_nested(obj, 'a.b.c') is equivalent to obj.a.b.c
getattr_nested(obj, 'a.b', None) returns None when any intermediate or final attribute is missing.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| obj | Any | Root object. | required |
| name | str | Dot-separated attribute path. | required |
| *default | Any | Optional default returned when the attribute is missing. At most one default value may be provided (same contract as built-in getattr). | () |
Raises:
| Type | Description |
|---|---|
| TypeError | If more than one default value is provided. |
| AttributeError | If the attribute is missing and no default was given. |
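The documented contract can be sketched in a few lines of plain Python. This is a minimal illustration under the behavior described above, not the implementation in vllm_gaudi/utils.py; the name getattr_nested_sketch and the _MISSING sentinel are introduced here for the example only.

```python
from types import SimpleNamespace
from typing import Any

# Sentinel that distinguishes "attribute missing" from a stored None.
_MISSING = object()


def getattr_nested_sketch(obj: Any, name: str, *default: Any) -> Any:
    """Resolve a dot-separated attribute path, with an optional default."""
    if len(default) > 1:
        raise TypeError("expected at most one default value")
    current = obj
    for part in name.split("."):
        current = getattr(current, part, _MISSING)
        if current is _MISSING:
            if default:
                # Any missing link in the chain yields the default.
                return default[0]
            raise AttributeError(f"{name!r}: no attribute {part!r}")
    return current


# Small object graph used to exercise the sketch.
example = SimpleNamespace(a=SimpleNamespace(b=SimpleNamespace(c=3)))
found = getattr_nested_sketch(example, "a.b.c")
fallback = getattr_nested_sketch(example, "a.x.c", None)
```

Here `found` holds the value stored at `example.a.b.c`, while `fallback` is None because the intermediate attribute `x` does not exist.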
hpu_backend_string
cached
hpu_device_string
cached
make_mrope_positions_tensor_with_pad
make_mrope_positions_tensor_with_pad(
    input_positions: list[list[int]],
    input_mrope_positions: list[list[list[int]]],
    max_prompt_len: int,
    pad: int,
) -> list[list[int]]
make_ndarray_with_pad_align
make_ndarray_with_pad_align(
    x: list[list[T]],
    pad: T,
    dtype: DTypeLike,
    *,
    max_len_align: int = 1024,
) -> NDArray
Make a padded array from 2D inputs.
The padding is applied to the end of each inner list until it reaches
max_len.
make_tensor_with_pad_align
make_tensor_with_pad_align(
    x: list[list[T]],
    pad: T,
    dtype: dtype,
    *,
    max_len_align: int = 1024,
    device: Optional[Union[str, device]] = None,
    pin_memory: bool = False,
) -> Tensor
Make a padded tensor from 2D inputs.
The padding is applied to the end of each inner list until it reaches
max_len_aligned, where max_len_aligned is max_len rounded up to a
multiple of max_len_align.
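The alignment arithmetic can be illustrated with plain Python lists. The sketch below assumes that max_len is the length of the longest inner list and that it is rounded up to a multiple of max_len_align, since padding cannot shorten a row. The helper name pad_align_sketch is hypothetical, and the real functions return a NumPy array or a torch tensor rather than nested lists.

```python
import math
from typing import List, TypeVar

T = TypeVar("T")


def pad_align_sketch(x: List[List[T]], pad: T,
                     max_len_align: int = 1024) -> List[List[T]]:
    """Pad every row to the longest row length, rounded up to the alignment."""
    # Longest inner list; 0 when x is empty.
    max_len = max(map(len, x), default=0)
    # Round up to the next multiple of max_len_align.
    max_len_aligned = math.ceil(max_len / max_len_align) * max_len_align
    # Append the pad value to the end of each row.
    return [row + [pad] * (max_len_aligned - len(row)) for row in x]


padded = pad_align_sketch([[1, 2, 3], [4]], pad=0, max_len_align=4)
# padded == [[1, 2, 3, 0], [4, 0, 0, 0]]
```

Aligning the padded length to a fixed multiple keeps the number of distinct tensor shapes small, which is generally helpful when shapes feed into compiled graphs.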
make_tensor_with_pad_hpu
make_tensor_with_pad_hpu(
    x: list[list[T]],
    pad: T,
    dtype: dtype,
    *,
    max_len: int | None = None,
    device: Optional[Union[str, device]] = None,
    pin_memory: bool = False,
) -> Tensor
Make a padded tensor from 2D inputs.
The padding is applied to the end of each inner list until it reaches
max_len.
HPU-compatible replacement for make_tensor_with_pad. Uses pure PyTorch (pad_sequence) instead of NumPy.
patch_nixl_utils_for_hpu
Patch vllm.distributed.nixl_utils to use nixl._api instead of rixl._api.
Upstream vLLM gates NIXL imports on is_cuda(), falling back to rixl._api for all other platforms. HPU needs nixl._api (same as CUDA), so we monkey-patch the module-level symbols before anything else imports them.
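The general technique, replacing module-level symbols on an already-registered module so that later importers see the patched values, can be sketched as follows. Everything here is a stand-in: the module name example_pkg_utils, the attribute backend_name, and the helper patch_module_symbols are hypothetical and do not reflect the actual symbols in vllm.distributed.nixl_utils.

```python
import sys
import types

# Hypothetical stand-in for a module whose platform check chose a fallback.
fake_utils = types.ModuleType("example_pkg_utils")
fake_utils.backend_name = "fallback"
sys.modules["example_pkg_utils"] = fake_utils


def patch_module_symbols(module_name: str, **symbols: object) -> None:
    """Overwrite module-level attributes on an already-registered module."""
    module = sys.modules[module_name]
    for attr, value in symbols.items():
        setattr(module, attr, value)


# Apply the patch before any consumer imports the module.
patch_module_symbols("example_pkg_utils", backend_name="preferred")

# A later import resolves to the same module object in sys.modules,
# so it observes the patched attribute.
import example_pkg_utils

patched_value = example_pkg_utils.backend_name
```

Ordering matters with this approach: code that ran `from example_pkg_utils import backend_name` before the patch would keep its own reference to the old value, which is why the patch is applied before anything else imports the symbols.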
setattr_nested
Like built-in setattr but supports dot-separated nested attributes.
Examples:
setattr_nested(obj, 'a.b.c', val) is equivalent to obj.a.b.c = val
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| obj | Any | Root object. | required |
| name | str | Dot-separated attribute path. All parts except the last must already exist as attributes. | required |
| value | Any | Value to assign to the final attribute. | required |
Raises:
| Type | Description |
|---|---|
| AttributeError | If any intermediate attribute does not exist. |
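A minimal sketch of the documented behavior, assuming only what is stated above: every part of the path except the last is resolved with getattr, and the last part is assigned with setattr. The name setattr_nested_sketch is introduced for this example and is not the implementation in vllm_gaudi/utils.py.

```python
from types import SimpleNamespace
from typing import Any


def setattr_nested_sketch(obj: Any, name: str, value: Any) -> None:
    """Assign to a dot-separated attribute path on obj."""
    *parents, last = name.split(".")
    target = obj
    for part in parents:
        # getattr raises AttributeError if an intermediate part is missing.
        target = getattr(target, part)
    setattr(target, last, value)


config = SimpleNamespace(a=SimpleNamespace(b=SimpleNamespace(c=1)))
setattr_nested_sketch(config, "a.b.c", 42)   # same as config.a.b.c = 42
setattr_nested_sketch(config, "a.b.new", 7)  # final attribute may be new
```

Only the final attribute may be created by the call; a missing intermediate such as `config.a.missing` raises AttributeError rather than being created implicitly.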