`vllm.v1.kv_offload.tiering.base` ¶

Abstract interfaces and data types for the secondary tiering layer.

Classes:

JobMetadata –

Metadata for an in-flight async transfer job.
JobResult –

Result of an async transfer job (successful or failed).
ParentManager –

Interface for secondary tiers to call back into the tiering manager.
SecondaryTierManager –

Abstract interface for managing a single non-primary offloading tier.
TieringOffloadingMetrics –

Metric names for TieringOffloadingManager.

`JobMetadata` `dataclass` ¶

Metadata for an in-flight async transfer job.

Source code in vllm/v1/kv_offload/tiering/base.py

@dataclass
class JobMetadata:
    """Metadata for an in-flight async transfer job."""

    job_id: JobId
    keys: Collection[OffloadKey]
    block_ids: np.ndarray
    is_promotion: bool
    req_context: ReqContext

`JobResult` `dataclass` ¶

Result of an async transfer job (successful or failed).

Source code in vllm/v1/kv_offload/tiering/base.py

@dataclass
class JobResult:
    """Result of an async transfer job (successful or failed)."""

    job_id: JobId
    success: bool

`ParentManager` ¶

Bases: ABC

Interface for secondary tiers to call back into the tiering manager.

Passed to secondary tiers via serve_external_requests() each step. The _SecondaryTierFacingParent wrapper implements this, automatically excluding the calling tier from fan-out operations.

Required call sequence for each remote request:

1. on_new_request(req_context): set up per-request state
2. lookup(key, req_context): check block availability (repeat per block)
3. create_store_job(keys, req_context): pin blocks and get a job handle
4. on_request_finished(req_context): clean up per-request state

Steps 2-3 may be interleaved. Step 4 must be called even if no blocks were found, to avoid leaking async lookup state (e.g. in the fs tier's AsyncLookupManager).
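
The required call sequence can be sketched as follows. This is an illustration only, not vLLM code: `FakeParent`, `serve_one_request`, the string keys, and the dict returned in place of `JobMetadata` are hypothetical stand-ins, and the `LookupResult` enum is redefined locally with the three outcomes named in this page. The `try`/`finally` is one way to make sure step 4 runs even when no block is found.

```python
from enum import Enum, auto


class LookupResult(Enum):
    """Local stand-in; assumes the members HIT, MISS and RETRY."""
    HIT = auto()
    MISS = auto()
    RETRY = auto()


class FakeParent:
    """Hypothetical parent that records calls; mirrors ParentManager's method names."""

    def __init__(self, present):
        self.present = set(present)
        self.calls = []

    def on_new_request(self, req_context):
        self.calls.append("on_new_request")

    def lookup(self, key, req_context):
        self.calls.append("lookup")
        return LookupResult.HIT if key in self.present else LookupResult.MISS

    def create_store_job(self, keys, req_context):
        self.calls.append("create_store_job")
        return {"keys": list(keys)}  # stand-in for JobMetadata

    def on_request_finished(self, req_context):
        self.calls.append("on_request_finished")


def serve_one_request(parent, keys, req_context):
    """Run steps 1-4 for one remote request; step 4 always runs."""
    parent.on_new_request(req_context)                      # step 1
    try:
        hits = [k for k in keys
                if parent.lookup(k, req_context) is LookupResult.HIT]  # step 2
        if not hits:
            return None
        return parent.create_store_job(hits, req_context)   # step 3
    finally:
        parent.on_request_finished(req_context)             # step 4


parent = FakeParent(present={"k1"})
job = serve_one_request(parent, ["k1", "k2"], req_context="req-0")
```

In this sketch, a request that finds no blocks still ends with `on_request_finished`, which is the property the contract asks for.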

Source code in vllm/v1/kv_offload/tiering/base.py

class ParentManager(ABC):
    """Interface for secondary tiers to call back into the tiering manager.

    Passed to secondary tiers via serve_external_requests() each step.
    The _SecondaryTierFacingParent wrapper implements this, automatically
    excluding the calling tier from fan-out operations.

    Required call sequence for each remote request:
        1. on_new_request(req_context)  — set up per-request state
        2. lookup(key, req_context)     — check block availability
           (repeat per block)
        3. create_store_job(keys, req_context) — pin blocks and get a
           job handle
        4. on_request_finished(req_context) — clean up per-request state

    Steps 2-3 may be interleaved. Step 4 must be called even if no
    blocks were found, to avoid leaking async lookup state (e.g. in
    the fs tier's AsyncLookupManager).
    """

    @abstractmethod
    def on_new_request(self, req_context: ReqContext) -> RequestOffloadingContext: ...

    @abstractmethod
    def lookup(self, key: OffloadKey, req_context: ReqContext) -> LookupResult: ...

    @abstractmethod
    def create_store_job(
        self,
        keys: Collection[OffloadKey],
        req_context: ReqContext,
    ) -> JobMetadata: ...

    @abstractmethod
    def on_request_finished(self, req_context: ReqContext) -> None: ...

`SecondaryTierManager` ¶

Bases: ABC

Abstract interface for managing a single non-primary offloading tier.

Secondary tiers cannot directly access GPU memory. All data transfers must go through the CPU (primary) tier:

- Store: GPU → CPU (primary) → secondary (cascade)
- Load: secondary → CPU (primary) → GPU (promotion)

IMPORTANT: All methods run in the Scheduler process and must be lightweight and non-blocking. submit_load() and submit_store() submit async jobs; get_finished_jobs() polls for completion.
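
A minimal sketch of the submit-then-poll pattern, under stated assumptions: `ToyTier` is hypothetical and does not subclass the real `SecondaryTierManager`, the primary buffer is assumed to be divided into fixed `BLOCK_BYTES` slots (the real layout is not specified here), and `keys` and `block_ids` are assumed to correspond one-to-one. The submit methods only enqueue work; the copy runs on a worker thread.

```python
from concurrent.futures import ThreadPoolExecutor

BLOCK_BYTES = 4  # assumed fixed slot size, for illustration only


class ToyTier:
    """Hypothetical tier moving data between a primary buffer and a dict."""

    def __init__(self, primary_kv_view: memoryview):
        self._primary = primary_kv_view
        self._blocks = {}    # key -> bytes, stands in for the secondary store
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._futures = {}   # job_id -> Future

    def _slot(self, block_id):
        start = block_id * BLOCK_BYTES
        return slice(start, start + BLOCK_BYTES)

    def _copy_out(self, keys, block_ids):
        for key, block_id in zip(keys, block_ids):
            self._blocks[key] = bytes(self._primary[self._slot(block_id)])

    def _copy_in(self, keys, block_ids):
        for key, block_id in zip(keys, block_ids):
            self._primary[self._slot(block_id)] = self._blocks[key]

    def submit_store(self, job_id, keys, block_ids):
        # Enqueue only; the copy itself runs on the worker thread.
        self._futures[job_id] = self._pool.submit(self._copy_out, keys, block_ids)

    def submit_load(self, job_id, keys, block_ids):
        self._futures[job_id] = self._pool.submit(self._copy_in, keys, block_ids)

    def get_finished_jobs(self):
        done = [j for j, f in self._futures.items() if f.done()]
        return [(j, self._futures.pop(j).exception() is None) for j in done]


primary = bytearray(b"AAAABBBBCCCC")                    # three 4-byte slots
tier = ToyTier(memoryview(primary))
tier.submit_store(job_id=1, keys=["a"], block_ids=[1])  # slot 1 -> secondary
tier.submit_load(job_id=2, keys=["a"], block_ids=[2])   # secondary -> slot 2
tier._pool.shutdown(wait=True)                          # demo only: wait for the worker
finished = tier.get_finished_jobs()
```

The single worker runs the two jobs in submission order, so the load sees the block written by the store. A real tier would poll `get_finished_jobs()` each step instead of waiting on the pool.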

Methods:

__init__ –

Store the offloading spec, the primary KV view, and the tier type.
build_metric_definitions –

Return Prometheus metric definitions emitted by this tier.
drain_jobs –

Block until every submitted load/store job has completed or failed.
get_finished_jobs –

Return all jobs (loads and stores) that completed since the last call.
get_stats –

Return and reset metric observations collected by this tier.
has_pending_work –

Whether this tier needs the engine to keep stepping.
lookup –

Check whether a block exists in this secondary tier.
on_new_request –

Called when a new request is first seen by the scheduler.
on_request_finished –

Called when a request has finished.
on_schedule_end –

Called once at the end of each scheduler step.
serve_external_requests –

Process remotely-originated requests using the parent manager.
shutdown –

Release resources held by this tier (threads, connections, etc.).
submit_load –

Submit an async job to load blocks from this secondary tier to the primary tier.
submit_store –

Submit an async job to store blocks from the primary tier to this secondary tier.
take_events –

Take KV events for storage state owned by this tier.
touch –

Mark blocks as recently used for eviction policy.

Source code in vllm/v1/kv_offload/tiering/base.py

class SecondaryTierManager(ABC):
    """
    Abstract interface for managing a single non-primary offloading tier.

    Secondary tiers cannot directly access GPU memory. All data transfers
    must go through the CPU (primary) tier:
      - Store: GPU → CPU (primary) → secondary  (cascade)
      - Load:  secondary → CPU (primary) → GPU  (promotion)

    IMPORTANT: All methods run in the Scheduler process and must be
    lightweight and non-blocking. submit_load() and submit_store() submit
    async jobs; get_finished_jobs() polls for completion.
    """

    def __init__(
        self,
        offloading_spec: "OffloadingSpec",
        primary_kv_view: memoryview,
        tier_type: str,
    ) -> None:
        """
        Args:
            offloading_spec: Offloading configuration.
            primary_kv_view: Memoryview of the primary tier's CPU KV cache.
            tier_type: Tier type identifier, set by SecondaryTierFactory
                from the registered tier type.
        """
        self._offloading_spec = offloading_spec
        self._primary_kv_view: memoryview = primary_kv_view
        self.tier_type = tier_type

    @abstractmethod
    def lookup(self, key: OffloadKey, req_context: ReqContext) -> LookupResult:
        """
        Check whether a block exists in this secondary tier.

        Args:
            key: Offload key to look up.
            req_context: per-request context (e.g. kv_transfer_params).

        Returns:
            HIT if the block is present and ready,
            MISS if not found,
            or RETRY if the block is being transferred (retry later).
        """
        pass

    @abstractmethod
    def submit_store(self, job_metadata: JobMetadata) -> None:
        """
        Submit an async job to store blocks from the primary tier to this
        secondary tier.

        This method must be lightweight and non-blocking: allocate metadata
        and submit the transfer, but do NOT perform the data copy on the
        calling thread.

        Preconditions (guaranteed by the framework):
          - ``job_metadata.block_ids`` are valid primary-tier slots, pinned
            (ref-counted) for the duration of the transfer.

        The implementation is responsible for:
          1. Filtering out blocks already present in this tier
          2. Evicting blocks if capacity is needed
          3. Allocating space in this tier
          4. Submitting the async transfer (read from primary via block_ids)

        Report completion via ``get_finished_jobs()``.

        Args:
            job_metadata: Job metadata including job_id, keys, and block_ids
                          identifying the primary-tier slots to read from.
        """
        pass

    @abstractmethod
    def submit_load(self, job_metadata: JobMetadata) -> None:
        """
        Submit an async job to load blocks from this secondary tier to the
        primary tier.

        This method must be lightweight and non-blocking: mark blocks as
        in-flight and submit the transfer, but do NOT perform the data copy
        on the calling thread.

        Preconditions (guaranteed by the framework):
          - ``job_metadata.block_ids`` are allocated primary-tier slots
            ready to receive data.

        The implementation must copy data from this tier into the
        primary-tier slots identified by ``block_ids``.

        Report completion via ``get_finished_jobs()``.

        Args:
            job_metadata: Job metadata including job_id, keys, and block_ids
                          identifying the primary-tier slots to write into.
        """
        pass

    @abstractmethod
    def get_finished_jobs(self) -> Iterable[JobResult]:
        """
        Return all jobs (loads and stores) that completed since the last call.

        The framework uses these results to release resources and finalize
        transfers.

        Returns:
            Iterable of JobResult objects for jobs finished since the
            last call.
        """
        pass

    def has_pending_work(self) -> bool:
        """Whether this tier needs the engine to keep stepping.

        While True, on_schedule_end() and get_finished_jobs() continue
        to be called even when no requests are scheduled.
        """
        return False

    def take_events(self) -> Iterable[OffloadingEvent]:
        """Take KV events for storage state owned by this tier."""
        return ()

    def touch(self, keys: Collection[OffloadKey], req_context: ReqContext):
        """
        Mark blocks as recently used for eviction policy.

        Args:
            keys: Offload keys to mark as recently used.
            req_context: Per-request context.
        """
        return

    @abstractmethod
    def on_new_request(self, req_context: ReqContext) -> RequestOffloadingContext:
        """
        Called when a new request is first seen by the scheduler.

        Returns a RequestOffloadingContext expressing this tier's preference
        for how blocks should be offloaded for this request.

        Args:
            req_context: Per-request context.
        """
        pass

    def on_request_finished(self, req_context: ReqContext) -> None:
        """
        Called when a request has finished.

        By the time this is called, all per-request calls for this request
        (submit_store, submit_load, touch) have already been issued, and none
        will follow. Note this does NOT imply the tier's transfers have
        completed: jobs already submitted may still be in flight and will
        report via get_finished_jobs(). This is the right place to release
        per-request bookkeeping.

        Args:
            req_context: per-request context.
        """
        return

    def serve_external_requests(self, parent: ParentManager) -> None:
        """Process remotely-originated requests using the parent manager.

        Called once per scheduler step, BEFORE _flush_pending_promotions().
        The parent handle is valid only for the duration of this call.
        Tiers that don't serve external requests leave this as a no-op.
        """
        return

    def on_schedule_end(self, context: ScheduleEndContext) -> None:
        """Called once at the end of each scheduler step.

        Args:
            context: Per-step context from the scheduler.
        """
        return

    @abstractmethod
    def drain_jobs(self) -> None:
        """Block until every submitted load/store job has completed or failed.

        After this returns, no tier I/O is touching the primary memoryview,
        and every submitted job's result is available from `get_finished_jobs()`
        (yielded by a prior call or queued for the next one). Used by
        `TieringOffloadingManager.reset_cache` to release primary slots
        without racing with in-flight transfers.

        Implementations must not abort a mid-flight transfer: a partial copy
        would corrupt either the primary memoryview or the secondary backing
        store. Queued (not-yet-started) transfers may be cancelled, but their
        failure result must still appear in `get_finished_jobs()`.
        """
        pass

    def shutdown(self) -> None:
        """Release resources held by this tier (threads, connections, etc.)."""
        return

    @classmethod
    def build_metric_definitions(
        cls, extra_config: dict[str, Any]
    ) -> dict[str, OffloadingMetricMetadata]:
        """Return Prometheus metric definitions emitted by this tier."""
        return {}

    def get_stats(self) -> "OffloadingConnectorStats | None":
        """Return and reset metric observations collected by this tier."""
        return None

`__init__(offloading_spec, primary_kv_view, tier_type)` ¶

Parameters:

offloading_spec ¶
(OffloadingSpec) –

Offloading configuration.
primary_kv_view ¶
(memoryview) –

Memoryview of the primary tier's CPU KV cache.
tier_type ¶
(str) –

Tier type identifier, set by SecondaryTierFactory from the registered tier type.

Source code in vllm/v1/kv_offload/tiering/base.py

def __init__(
    self,
    offloading_spec: "OffloadingSpec",
    primary_kv_view: memoryview,
    tier_type: str,
) -> None:
    """
    Args:
        offloading_spec: Offloading configuration.
        primary_kv_view: Memoryview of the primary tier's CPU KV cache.
        tier_type: Tier type identifier, set by SecondaryTierFactory
            from the registered tier type.
    """
    self._offloading_spec = offloading_spec
    self._primary_kv_view: memoryview = primary_kv_view
    self.tier_type = tier_type

`build_metric_definitions(extra_config)` `classmethod` ¶

Return Prometheus metric definitions emitted by this tier.

Source code in vllm/v1/kv_offload/tiering/base.py

@classmethod
def build_metric_definitions(
    cls, extra_config: dict[str, Any]
) -> dict[str, OffloadingMetricMetadata]:
    """Return Prometheus metric definitions emitted by this tier."""
    return {}

`drain_jobs()` `abstractmethod` ¶

Block until every submitted load/store job has completed or failed.

After this returns, no tier I/O is touching the primary memoryview, and every submitted job's result is available from get_finished_jobs() (yielded by a prior call or queued for the next one). Used by TieringOffloadingManager.reset_cache to release primary slots without racing with in-flight transfers.

Implementations must not abort a mid-flight transfer: a partial copy would corrupt either the primary memoryview or the secondary backing store. Queued (not-yet-started) transfers may be cancelled, but their failure result must still appear in get_finished_jobs().
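
One way to satisfy this contract is to wait on every outstanding future and leave the results queued for the next poll. The sketch below assumes a thread-pool-backed tier; `DrainableJobs` and `failing_copy` are hypothetical names, results are plain `(job_id, success)` tuples instead of `JobResult`, and cancellation of queued jobs is left out for simplicity.

```python
from concurrent.futures import ThreadPoolExecutor, wait


class DrainableJobs:
    """Hypothetical job tracker illustrating the drain_jobs() contract."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._futures = {}  # job_id -> Future

    def submit(self, job_id, copy_fn):
        self._futures[job_id] = self._pool.submit(copy_fn)

    def drain_jobs(self):
        # Block until every submitted job has completed or failed.
        # Running copies are never interrupted.
        wait(list(self._futures.values()))

    def get_finished_jobs(self):
        finished = []
        for job_id in [j for j, f in self._futures.items() if f.done()]:
            fut = self._futures.pop(job_id)
            finished.append((job_id, fut.exception() is None))
        return finished

    def shutdown(self):
        self._pool.shutdown(wait=True)


def failing_copy():
    raise OSError("simulated I/O error")


jobs = DrainableJobs()
jobs.submit(1, lambda: None)
jobs.submit(2, failing_copy)
jobs.drain_jobs()
finished = jobs.get_finished_jobs()
jobs.shutdown()
```

After `drain_jobs()` returns, the failed job is still reported, with `success` set to `False`, rather than being dropped.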

Source code in vllm/v1/kv_offload/tiering/base.py

@abstractmethod
def drain_jobs(self) -> None:
    """Block until every submitted load/store job has completed or failed.

    After this returns, no tier I/O is touching the primary memoryview,
    and every submitted job's result is available from `get_finished_jobs()`
    (yielded by a prior call or queued for the next one). Used by
    `TieringOffloadingManager.reset_cache` to release primary slots
    without racing with in-flight transfers.

    Implementations must not abort a mid-flight transfer: a partial copy
    would corrupt either the primary memoryview or the secondary backing
    store. Queued (not-yet-started) transfers may be cancelled, but their
    failure result must still appear in `get_finished_jobs()`.
    """
    pass

`get_finished_jobs()` `abstractmethod` ¶

Return all jobs (loads and stores) that completed since the last call.

The framework uses these results to release resources and finalize transfers.

Returns:

Iterable[JobResult] –

Iterable of JobResult objects for jobs finished since the last call.
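
The "since the last call" semantics can be sketched with a thread-safe queue that worker threads fill and the scheduler thread empties without blocking. `CompletionQueue` and its `report` method are hypothetical, and `JobResult` is redefined locally (with an `int` job id) so the example is self-contained.

```python
import queue
from dataclasses import dataclass


@dataclass
class JobResult:
    """Local stand-in mirroring the JobResult fields documented on this page."""
    job_id: int
    success: bool


class CompletionQueue:
    """Hypothetical holder for results produced by worker threads."""

    def __init__(self):
        self._done = queue.SimpleQueue()

    def report(self, job_id, success):
        # Called by a worker thread when a transfer ends.
        self._done.put(JobResult(job_id, success))

    def get_finished_jobs(self):
        # Drain without blocking; each result is returned exactly once.
        results = []
        while True:
            try:
                results.append(self._done.get_nowait())
            except queue.Empty:
                return results


cq = CompletionQueue()
cq.report(1, True)
cq.report(2, False)
first = cq.get_finished_jobs()
second = cq.get_finished_jobs()
```

Here the second poll returns an empty list because both results were consumed by the first.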

Source code in vllm/v1/kv_offload/tiering/base.py

@abstractmethod
def get_finished_jobs(self) -> Iterable[JobResult]:
    """
    Return all jobs (loads and stores) that completed since the last call.

    The framework uses these results to release resources and finalize
    transfers.

    Returns:
        Iterable of JobResult objects for jobs finished since the
        last call.
    """
    pass

`get_stats()` ¶

Return and reset metric observations collected by this tier.

Source code in vllm/v1/kv_offload/tiering/base.py

def get_stats(self) -> "OffloadingConnectorStats | None":
    """Return and reset metric observations collected by this tier."""
    return None

`has_pending_work()` ¶

Whether this tier needs the engine to keep stepping.

While True, on_schedule_end() and get_finished_jobs() continue to be called even when no requests are scheduled.
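
A plausible override, assuming the tier treats its in-flight jobs as pending work, is sketched below. `PendingTracker` and its `submit`/`complete` helpers are hypothetical.

```python
class PendingTracker:
    """Hypothetical bookkeeping for in-flight job ids."""

    def __init__(self):
        self._in_flight = set()

    def submit(self, job_id):
        self._in_flight.add(job_id)

    def complete(self, job_id):
        self._in_flight.discard(job_id)

    def has_pending_work(self) -> bool:
        # True asks the engine to keep stepping so completions can be polled.
        return bool(self._in_flight)


tracker = PendingTracker()
tracker.submit(7)
before = tracker.has_pending_work()
tracker.complete(7)
after = tracker.has_pending_work()
```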

Source code in vllm/v1/kv_offload/tiering/base.py

def has_pending_work(self) -> bool:
    """Whether this tier needs the engine to keep stepping.

    While True, on_schedule_end() and get_finished_jobs() continue
    to be called even when no requests are scheduled.
    """
    return False

`lookup(key, req_context)` `abstractmethod` ¶

Check whether a block exists in this secondary tier.

Parameters:

key ¶
(OffloadKey) –

Offload key to look up.
req_context ¶
(ReqContext) –

Per-request context (e.g. kv_transfer_params).

Returns:

LookupResult –

HIT if the block is present and ready, MISS if not found, or RETRY if the block is being transferred (retry later).
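
The three outcomes can be illustrated with two sets, one for blocks that are ready and one for blocks still being transferred. This is a sketch under assumptions: `ToyIndex` is hypothetical, the `req_context` argument is omitted, and `LookupResult` is redefined locally with the member names used on this page.

```python
from enum import Enum, auto


class LookupResult(Enum):
    """Local stand-in; assumes the members HIT, MISS and RETRY."""
    HIT = auto()
    MISS = auto()
    RETRY = auto()


class ToyIndex:
    """Hypothetical index distinguishing ready blocks from in-flight ones."""

    def __init__(self):
        self.ready = set()      # keys whose data is fully stored
        self.in_flight = set()  # keys whose transfer has not finished

    def lookup(self, key):
        if key in self.in_flight:
            return LookupResult.RETRY
        if key in self.ready:
            return LookupResult.HIT
        return LookupResult.MISS


index = ToyIndex()
index.ready.add("a")
index.in_flight.add("b")
results = [index.lookup(k) for k in ("a", "b", "c")]
```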

Source code in vllm/v1/kv_offload/tiering/base.py

@abstractmethod
def lookup(self, key: OffloadKey, req_context: ReqContext) -> LookupResult:
    """
    Check whether a block exists in this secondary tier.

    Args:
        key: Offload key to look up.
        req_context: per-request context (e.g. kv_transfer_params).

    Returns:
        HIT if the block is present and ready,
        MISS if not found,
        or RETRY if the block is being transferred (retry later).
    """
    pass

`on_new_request(req_context)` `abstractmethod` ¶

Called when a new request is first seen by the scheduler.

Returns a RequestOffloadingContext expressing this tier's preference for how blocks should be offloaded for this request.

Parameters:

req_context ¶
(ReqContext) –

Per-request context.

Source code in vllm/v1/kv_offload/tiering/base.py

@abstractmethod
def on_new_request(self, req_context: ReqContext) -> RequestOffloadingContext:
    """
    Called when a new request is first seen by the scheduler.

    Returns a RequestOffloadingContext expressing this tier's preference
    for how blocks should be offloaded for this request.

    Args:
        req_context: Per-request context.
    """
    pass

`on_request_finished(req_context)` ¶

Called when a request has finished.

By the time this is called, all per-request calls for this request (submit_store, submit_load, touch) have already been issued, and none will follow. Note this does NOT imply the tier's transfers have completed: jobs already submitted may still be in flight and will report via get_finished_jobs(). This is the right place to release per-request bookkeeping.

Parameters:

req_context ¶
(ReqContext) –

Per-request context.
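
The distinction between per-request state and in-flight jobs can be sketched as follows. `ToyBookkeeping`, its fields, and the string request ids are hypothetical; the point is only that the hook drops request state while job tracking is left alone.

```python
class ToyBookkeeping:
    """Hypothetical tier state split into per-request and per-job parts."""

    def __init__(self):
        self.per_request = {}  # request id -> scratch state
        self.in_flight = {}    # job_id -> request id

    def on_new_request(self, req_id):
        self.per_request[req_id] = {"lookups": 0}

    def submit(self, job_id, req_id):
        self.in_flight[job_id] = req_id

    def on_request_finished(self, req_id):
        # Drop request state only; submitted jobs keep running and
        # are reported later through get_finished_jobs().
        self.per_request.pop(req_id, None)


state = ToyBookkeeping()
state.on_new_request("req-0")
state.submit(job_id=1, req_id="req-0")
state.on_request_finished("req-0")
```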

Source code in vllm/v1/kv_offload/tiering/base.py

def on_request_finished(self, req_context: ReqContext) -> None:
    """
    Called when a request has finished.

    By the time this is called, all per-request calls for this request
    (submit_store, submit_load, touch) have already been issued, and none
    will follow. Note this does NOT imply the tier's transfers have
    completed: jobs already submitted may still be in flight and will
    report via get_finished_jobs(). This is the right place to release
    per-request bookkeeping.

    Args:
        req_context: per-request context.
    """
    return

`on_schedule_end(context)` ¶

Called once at the end of each scheduler step.

Parameters:

context ¶
(ScheduleEndContext) –

Per-step context from the scheduler.

Source code in vllm/v1/kv_offload/tiering/base.py

def on_schedule_end(self, context: ScheduleEndContext) -> None:
    """Called once at the end of each scheduler step.

    Args:
        context: Per-step context from the scheduler.
    """
    return

`serve_external_requests(parent)` ¶

Process remotely-originated requests using the parent manager.

Called once per scheduler step, BEFORE _flush_pending_promotions(). The parent handle is valid only for the duration of this call. Tiers that don't serve external requests leave this as a no-op.

Source code in vllm/v1/kv_offload/tiering/base.py

def serve_external_requests(self, parent: ParentManager) -> None:
    """Process remotely-originated requests using the parent manager.

    Called once per scheduler step, BEFORE _flush_pending_promotions().
    The parent handle is valid only for the duration of this call.
    Tiers that don't serve external requests leave this as a no-op.
    """
    return

`shutdown()` ¶

Release resources held by this tier (threads, connections, etc.).

Source code in vllm/v1/kv_offload/tiering/base.py

def shutdown(self) -> None:
    """Release resources held by this tier (threads, connections, etc.)."""
    return

`submit_load(job_metadata)` `abstractmethod` ¶

Submit an async job to load blocks from this secondary tier to the primary tier.

This method must be lightweight and non-blocking: mark blocks as in-flight and submit the transfer, but do NOT perform the data copy on the calling thread.

Preconditions (guaranteed by the framework):

- job_metadata.block_ids are allocated primary-tier slots ready to receive data.

The implementation must copy data from this tier into the primary-tier slots identified by block_ids.

Report completion via get_finished_jobs().

Parameters:

job_metadata ¶
(JobMetadata) –

Job metadata including job_id, keys, and block_ids identifying the primary-tier slots to write into.

Source code in vllm/v1/kv_offload/tiering/base.py

@abstractmethod
def submit_load(self, job_metadata: JobMetadata) -> None:
    """
    Submit an async job to load blocks from this secondary tier to the
    primary tier.

    This method must be lightweight and non-blocking: mark blocks as
    in-flight and submit the transfer, but do NOT perform the data copy
    on the calling thread.

    Preconditions (guaranteed by the framework):
      - ``job_metadata.block_ids`` are allocated primary-tier slots
        ready to receive data.

    The implementation must copy data from this tier into the
    primary-tier slots identified by ``block_ids``.

    Report completion via ``get_finished_jobs()``.

    Args:
        job_metadata: Job metadata including job_id, keys, and block_ids
                      identifying the primary-tier slots to write into.
    """
    pass

`submit_store(job_metadata)` `abstractmethod` ¶

Submit an async job to store blocks from the primary tier to this secondary tier.

This method must be lightweight and non-blocking: allocate metadata and submit the transfer, but do NOT perform the data copy on the calling thread.

Preconditions (guaranteed by the framework):

- job_metadata.block_ids are valid primary-tier slots, pinned (ref-counted) for the duration of the transfer.

The implementation is responsible for:

1. Filtering out blocks already present in this tier
2. Evicting blocks if capacity is needed
3. Allocating space in this tier
4. Submitting the async transfer (read from primary via block_ids)

Report completion via get_finished_jobs().

Parameters:

job_metadata ¶
(JobMetadata) –

Job metadata including job_id, keys, and block_ids identifying the primary-tier slots to read from.
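
The four responsibilities can be sketched with a capacity-bounded index. Everything here is an assumption made for illustration: `ToyStorePlanner` is hypothetical, eviction is least-recently-inserted via `OrderedDict`, `keys` and `block_ids` are taken to correspond one-to-one, and the "transfer" is just a tuple appended to a list. A real tier would also have to avoid evicting blocks that are still in flight.

```python
from collections import OrderedDict


class ToyStorePlanner:
    """Hypothetical capacity-bounded index with oldest-first eviction."""

    def __init__(self, capacity):
        self.index = OrderedDict()             # key -> slot, oldest first
        self.free_slots = list(range(capacity))
        self.submitted = []                    # stands in for the async transfer queue

    def submit_store(self, job_id, keys, block_ids):
        # 1. Filter out blocks already present in this tier.
        new = [(k, b) for k, b in zip(keys, block_ids) if k not in self.index]
        for key, block_id in new:
            # 2. Evict the oldest block if the tier is full.
            if not self.free_slots:
                _, slot = self.index.popitem(last=False)
                self.free_slots.append(slot)
            # 3. Allocate space in this tier.
            slot = self.free_slots.pop()
            self.index[key] = slot
            # 4. Queue the transfer (primary block_id -> secondary slot).
            self.submitted.append((job_id, block_id, slot))


planner = ToyStorePlanner(capacity=2)
planner.submit_store(1, ["a", "b"], [10, 11])
planner.submit_store(2, ["b", "c"], [11, 12])  # "b" is filtered, "a" is evicted
```

In the second call only `"c"` is transferred, and it reuses the slot freed by evicting `"a"`.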

Source code in vllm/v1/kv_offload/tiering/base.py

@abstractmethod
def submit_store(self, job_metadata: JobMetadata) -> None:
    """
    Submit an async job to store blocks from the primary tier to this
    secondary tier.

    This method must be lightweight and non-blocking: allocate metadata
    and submit the transfer, but do NOT perform the data copy on the
    calling thread.

    Preconditions (guaranteed by the framework):
      - ``job_metadata.block_ids`` are valid primary-tier slots, pinned
        (ref-counted) for the duration of the transfer.

    The implementation is responsible for:
      1. Filtering out blocks already present in this tier
      2. Evicting blocks if capacity is needed
      3. Allocating space in this tier
      4. Submitting the async transfer (read from primary via block_ids)

    Report completion via ``get_finished_jobs()``.

    Args:
        job_metadata: Job metadata including job_id, keys, and block_ids
                      identifying the primary-tier slots to read from.
    """
    pass

`take_events()` ¶

Take KV events for storage state owned by this tier.

Source code in vllm/v1/kv_offload/tiering/base.py

def take_events(self) -> Iterable[OffloadingEvent]:
    """Take KV events for storage state owned by this tier."""
    return ()

`touch(keys, req_context)` ¶

Mark blocks as recently used for eviction policy.

Parameters:

keys ¶
(Collection[OffloadKey]) –

Offload keys to mark as recently used.
req_context ¶
(ReqContext) –

Per-request context.
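
For a tier that uses an LRU eviction policy (an assumption; the interface does not prescribe one), `touch` can be sketched with `OrderedDict.move_to_end`. The module-level `lru` mapping and the simplified `touch(keys)` signature are hypothetical.

```python
from collections import OrderedDict

# Keys ordered from least to most recently used.
lru = OrderedDict.fromkeys(["a", "b", "c"])


def touch(keys):
    """Move known keys to the most-recently-used end; ignore unknown keys."""
    for key in keys:
        if key in lru:
            lru.move_to_end(key)


touch(["a", "zzz"])
victim = next(iter(lru))  # next eviction candidate
```

After touching `"a"`, the next eviction candidate is `"b"`.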

Source code in vllm/v1/kv_offload/tiering/base.py

def touch(self, keys: Collection[OffloadKey], req_context: ReqContext):
    """
    Mark blocks as recently used for eviction policy.

    Args:
        keys: Offload keys to mark as recently used.
        req_context: Per-request context.
    """
    return

`TieringOffloadingMetrics` ¶

Metric names for TieringOffloadingManager.

Source code in vllm/v1/kv_offload/tiering/base.py

class TieringOffloadingMetrics:
    """Metric names for TieringOffloadingManager."""

    LOOKUP_SYNC_DELAY = "vllm:kv_offload_tiering_lookup_sync_delay_seconds"
    LOOKUP_ASYNC_DELAY = "vllm:kv_offload_tiering_lookup_async_delay_seconds"
