llmcompressor.pipelines.sequential.transformers_helpers
Classes:
- HFCacheProxy – Proxy that represents an instance of transformers.cache_utils.Cache.
- HFProxy – Proxy that uses metadata to handle data-dependent control-flow.
- HFProxyableCacheMeta – Metaclass that creates a class with its main methods wrapped to be proxyable.
- HFTracer – Tracer that is able to symbolically trace models from the transformers library. To do that, it uses the HFProxy instead of the regular PyTorch torch.fx.Proxy.
Functions:
- gen_constructor_wrapper – Wraps target to be proxyable. Used for tensor creators like torch.ones, torch.arange and so on.
- symbolic_trace – Performs symbolic tracing on the model.
HFCacheProxy
Proxy that represents an instance of transformers.cache_utils.Cache.
HFProxy
Bases: Proxy
Proxy that uses metadata to handle data-dependent control-flow.
HFProxyableCacheMeta
Bases: type
Metaclass that creates a class with its main methods wrapped to be proxyable.
In the same way that objects can be replaced with Proxies during trace-time,
classes can also be replaced with ProxyableClasses during trace-time.
This metaclass acts as a factory for creating ProxyableClasses of Cache classes.
At trace-time, all references to Cache classes are monkeypatched with references
to ProxyableCache classes. Whenever this class is used to construct a new instance
of a Cache, an instance of HFCacheProxy is generated instead.
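The class-replacement idea above can be sketched in plain Python. This is a toy illustration, not the library's implementation: `ToyCache`, `ToyCacheProxy` and `ToyProxyableMeta` are hypothetical names, and the `orig_cls` keyword is an assumption made for the example. It only shows that a metaclass can install a `__new__` wrapper so that constructing the class yields a proxy object instead of a real instance.

```python
class ToyCache:
    """Hypothetical stand-in for a Cache class."""

    def __init__(self, max_len=0):
        self.max_len = max_len


class ToyCacheProxy:
    """Hypothetical stand-in for HFCacheProxy: records what it replaces."""

    def __init__(self, orig_cls, args, kwargs):
        self.orig_cls = orig_cls
        self.args = args
        self.kwargs = kwargs


class ToyProxyableMeta(type):
    """Hypothetical metaclass that overrides __new__ on the class it creates."""

    def __new__(mcs, name, bases, namespace, orig_cls=None):
        def new_wrapper(cls, *args, **kwargs):
            # Returning an object that is not an instance of cls makes
            # Python skip cls.__init__, so no real cache is ever built.
            return ToyCacheProxy(orig_cls, args, kwargs)

        namespace = dict(namespace)
        namespace["__new__"] = new_wrapper
        return super().__new__(mcs, name, bases, namespace)

    def __init__(cls, name, bases, namespace, orig_cls=None):
        # type.__init__ does not accept the extra keyword, so drop it here.
        super().__init__(name, bases, namespace)


# Build a proxyable class, then "construct" a cache from it.
ProxyableToyCache = ToyProxyableMeta(
    "ProxyableToyCache", (ToyCache,), {}, orig_cls=ToyCache
)
obj = ProxyableToyCache(max_len=8)
```

Here `obj` is a `ToyCacheProxy` that remembers the original class and the constructor arguments, which is the piece of information a tracer would need to record the construction as a graph node.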
Methods:
- create__new__wrapper – Mirrors create_wrapper, but only used to override the __new__ method.
create__new__wrapper
staticmethod
Mirrors create_wrapper, but only used to override the __new__ method.
Whenever this class is used to construct a new instance of a Cache, an
instance of HFCacheProxy is generated instead.
The HFCacheProxy class allows caches to be traced through the fx graph.
Parameters:
- orig_cache_cls (type[Cache]) – Cache class being proxied
- proxy_factory_fn (Callable[[Node], Proxy]) – function which converts an instance of Cache to an instance of HFCacheProxy
Returns:
- wrapper function used to replace the __new__ method of HFProxyableClass
Source code in src/llmcompressor/pipelines/sequential/transformers_helpers.py
HFTracer
Bases: Tracer
Tracer that is able to symbolically trace models from the transformers library. To do that, it uses the HFProxy instead of the regular PyTorch torch.fx.Proxy.
Methods:
- keys – Called when a proxy object has the keys() method called.
- path_of_module – Helper method to find the qualified name of mod in the Module hierarchy of root.
- trace – Traces root and returns the corresponding FX torch.fx.Graph representation.
keys
Called when a proxy object has the keys() method called. This is what happens when ** is applied to a proxy. This should return an iterator if ** is supposed to work in your custom tracer.
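To see why keys() matters, the following plain-Python sketch shows how ** unpacking interacts with a non-dict object. `ToyMappingProxy` and `describe` are hypothetical names introduced for illustration and are unrelated to the real HFProxy; the point is only that Python calls keys() first and then looks up each key with `__getitem__`.

```python
class ToyMappingProxy:
    """Hypothetical dict-like object that logs how ** unpacking uses it."""

    def __init__(self, data):
        self._data = data
        self.calls = []

    def keys(self):
        # ** unpacking calls keys() first; returning an iterator is enough.
        self.calls.append("keys")
        return iter(self._data.keys())

    def __getitem__(self, key):
        # Each key returned by keys() is then looked up individually.
        self.calls.append(f"getitem:{key}")
        return self._data[key]


def describe(**kwargs):
    """Return the keyword names that were received."""
    return sorted(kwargs)


proxy = ToyMappingProxy({"input_ids": 1, "attention_mask": 2})
names = describe(**proxy)
```

A tracer that wants `model(**inputs)` to work on a proxied mapping therefore has to give a meaningful answer when keys() is called on the proxy.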
path_of_module
Helper method to find the qualified name of mod in the Module hierarchy of root. For example, if root has
a submodule named foo, which has a submodule named bar, passing bar into this function will return the
string "foo.bar".
Args:
mod (torch.nn.Module): The Module to retrieve the qualified name for.
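The "foo.bar" example can be illustrated with a small stand-alone sketch. It deliberately avoids torch: `ToyModule` and `qualified_name` are hypothetical helpers that mimic a module tree with named children, so this shows the lookup idea only and is not how the tracer itself is implemented.

```python
class ToyModule:
    """Hypothetical stand-in for a module with named child modules."""

    def __init__(self, **children):
        self.children = children


def qualified_name(root, target, prefix=""):
    """Depth-first search for target, returning its dotted path or None."""
    for name, child in root.children.items():
        path = f"{prefix}.{name}" if prefix else name
        if child is target:
            return path
        found = qualified_name(child, target, path)
        if found is not None:
            return found
    return None


# root has a submodule foo, which has a submodule bar.
bar = ToyModule()
foo = ToyModule(bar=bar)
root = ToyModule(foo=foo)

bar_path = qualified_name(root, bar)
foo_path = qualified_name(root, foo)
```

The lookup compares by identity (`is`), so two distinct submodules with equal contents still resolve to different paths.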
trace
trace(
root: Module | Callable[..., Any],
concrete_args: dict[str, Any] | None = None,
dummy_inputs: dict[str, Any] | None = None,
complete_concrete_args_with_inputs_not_in_dummy_inputs: bool = True,
) -> Graph
Traces root and returns the corresponding FX torch.fx.Graph representation. root can either be a
torch.nn.Module instance or a Python callable. Note that after this call, self.root may be different from
the root passed in here. For example, when a free function is passed to trace(), we will create a
torch.nn.Module instance to use as the root and add embedded constants to.
Args:
root (torch.nn.Module or Callable):
Either a torch.nn.Module or a function to be traced through. If root is not a
transformers.PreTrainedModel, then dummy_inputs must be passed, otherwise tracing will fail.
concrete_args (dict[str, Any], optional):
Concrete arguments that should not be treated as Proxies
dummy_inputs (dict[str, Any], optional):
The dummy inputs needed to handle data-dependent control-flow if root is not a
transformers.PreTrainedModel. It can also be used when root is a
transformers.PreTrainedModel to specify custom dummy inputs for a subset or all the model inputs.
complete_concrete_args_with_inputs_not_in_dummy_inputs (bool, optional, defaults to True):
If True, and dummy_inputs is specified, every argument that root can take that is not in
dummy_inputs and not in concrete_args will be added to concrete_args, otherwise does nothing.
Returns:
torch.fx.Graph:
An FX torch.fx.Graph representing the semantics of the passed-in root.
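The rule behind complete_concrete_args_with_inputs_not_in_dummy_inputs can be sketched with the standard inspect module. This is a minimal illustration under assumptions, not the tracer's code: `complete_concrete_args` and `toy_forward` are hypothetical, and pinning each missing argument to its default value is an assumption, since the description above only says the argument is added to concrete_args.

```python
import inspect


def complete_concrete_args(fn, dummy_inputs, concrete_args=None):
    """Pin every parameter of fn that is in neither dict (assumed: to its default)."""
    completed = dict(concrete_args or {})
    for name, param in inspect.signature(fn).parameters.items():
        if name in dummy_inputs or name in completed:
            continue
        if param.default is inspect.Parameter.empty:
            # Without a default there is no value to pin in this sketch.
            raise ValueError(f"no default available for parameter {name!r}")
        completed[name] = param.default
    return completed


def toy_forward(input_ids, attention_mask=None, use_cache=False, labels=None):
    """Hypothetical forward signature used only for the demonstration."""
    return input_ids


completed = complete_concrete_args(
    toy_forward,
    dummy_inputs={"input_ids": [[1, 2, 3]]},
    concrete_args={"use_cache": True},
)
```

In this sketch `input_ids` stays symbolic because it is a dummy input, `use_cache` keeps the caller's concrete value, and the remaining parameters are filled in so that they are not treated as Proxies.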
gen_constructor_wrapper
Wraps target to be proxyable. Used for tensor creators like torch.ones, torch.arange and so on.
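The general idea of making a constructor proxyable can be sketched without torch. Everything below is hypothetical (`ToyProxy`, `toy_constructor_wrapper`, and the list-based `ones`) and does not reproduce the real function's signature or return value; it only shows a wrapper that calls the original function for concrete arguments and returns a symbolic placeholder when any argument is a proxy.

```python
import functools


class ToyProxy:
    """Hypothetical stand-in for a tracing proxy."""

    def __init__(self, description):
        self.description = description


def toy_constructor_wrapper(target):
    """Wrap target so that proxy arguments produce a proxy result."""

    @functools.wraps(target)
    def wrapper(*args, **kwargs):
        values = list(args) + list(kwargs.values())
        if any(isinstance(value, ToyProxy) for value in values):
            # Record the call symbolically instead of executing it.
            return ToyProxy(f"call_function:{target.__name__}")
        return target(*args, **kwargs)

    return wrapper


def ones(n):
    """Toy tensor creator: a list of n ones."""
    return [1] * n


proxyable_ones = toy_constructor_wrapper(ones)
concrete = proxyable_ones(3)
symbolic = proxyable_ones(ToyProxy("placeholder:seq_len"))
```

With a concrete size the wrapped function behaves exactly like the original, while a proxied size yields another proxy, so a data-dependent shape does not break the trace.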
symbolic_trace
symbolic_trace(
model: PreTrainedModel,
input_names: list[str] | None = None,
disable_check: bool = False,
tracer_cls: type[HFTracer] = HFTracer,
) -> GraphModule
Performs symbolic tracing on the model.
Args:
model (PreTrainedModel):
The model to trace.
input_names (list[str], optional):
The names of the inputs of the traced model. If unset, model.dummy_inputs.keys() are used instead.
disable_check (bool, optional, defaults to False):
If True, no check is done before trying to trace the model. This is mostly useful for debugging purposes.
tracer_cls (Type[HFTracer], optional, defaults to HFTracer):
The tracer class to use for instantiating the tracer. If unset, HFTracer is used instead.
Returns:
torch.fx.GraphModule: A GraphModule constructed by recording operations seen while tracing the model.
Example:
```python
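from llmcompressor.pipelines.sequential.transformers_helpers import symbolic_trace

# `model` is assumed to be an already-loaded transformers PreTrainedModel
traced_model = symbolic_trace(model, input_names=["input_ids", "attention_mask", "token_type_ids"])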
```