speculators.models.dflash
Modules:
- attention
- config
- core
- metrics – Metrics and loss functions for the DFlash draft model.
- model_definitions
- utils – Utility functions for the DFlash draft model.
Classes:
- DFlashDraftModel
- DFlashSpeculatorConfig – Configuration for DFlash speculator with vocabulary mapping.
DFlashDraftModel
Bases: DraftVocabMixin, SpeculatorModel
Methods:
- from_training_args – Create DFlash model from training arguments.
- get_trainer_kwargs – Get training and validation kwargs for DFlash.
Source code in speculators/models/dflash/core.py
from_training_args classmethod
from_training_args(
verifier_config: PretrainedConfig,
t2d: Tensor | None = None,
d2t: Tensor | None = None,
**kwargs,
) -> DFlashDraftModel
Create DFlash model from training arguments.
Args:
    verifier_config: Verifier model configuration. This should be a config with num_hidden_layers set to the number of DRAFT layers (created by create_transformer_layer_config in train.py).
    t2d: Target-to-draft vocabulary mapping tensor (optional).
    d2t: Draft-to-target vocabulary mapping tensor (optional).
    **kwargs: Training arguments with DFlash-specific params:
        - draft_vocab_size: Size of draft vocabulary
        - block_size: Block size for draft predictions (default: 8)
        - max_anchors: Max anchor positions during training (default: 256)
        - verifier_name_or_path: Path to verifier model
Returns: Initialized DFlashDraftModel
Note: The number of draft layers is encoded in verifier_config.num_hidden_layers, following the same pattern as EAGLE3.
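As a rough illustration of the arguments described above, the sketch below collects the DFlash-specific keyword arguments and fills in the documented defaults (block_size of 8, max_anchors of 256). The helper build_dflash_kwargs and the example values are hypothetical and are not part of speculators. The commented call at the end shows where the result would be passed, assuming a verifier_config prepared as described.

```python
# Hypothetical helper (not part of speculators): gathers the DFlash-specific
# keyword arguments documented for from_training_args.
DOCUMENTED_DEFAULTS = {"block_size": 8, "max_anchors": 256}


def build_dflash_kwargs(draft_vocab_size, verifier_name_or_path, **overrides):
    """Return a kwargs dict using the documented defaults unless overridden."""
    if draft_vocab_size <= 0:
        raise ValueError("draft_vocab_size must be positive")
    kwargs = dict(DOCUMENTED_DEFAULTS)
    kwargs.update(overrides)
    kwargs["draft_vocab_size"] = draft_vocab_size
    kwargs["verifier_name_or_path"] = verifier_name_or_path
    return kwargs


# Example values are placeholders chosen for illustration only.
dflash_kwargs = build_dflash_kwargs(
    draft_vocab_size=64_000,
    verifier_name_or_path="path/to/verifier",
    block_size=16,
)

# With speculators installed and a verifier_config whose num_hidden_layers
# equals the number of draft layers, the call would look roughly like:
# model = DFlashDraftModel.from_training_args(
#     verifier_config=verifier_config, t2d=t2d, d2t=d2t, **dflash_kwargs
# )
```

Keeping the defaults in one dict means an override such as block_size=16 replaces only that entry, while max_anchors stays at its documented value.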
Source code in speculators/models/dflash/core.py
get_trainer_kwargs staticmethod
Get training and validation kwargs for DFlash.
Args: **kwargs: Training arguments
Returns: Tuple of (train_call_kwargs, val_call_kwargs)
Source code in speculators/models/dflash/core.py
DFlashSpeculatorConfig
Bases: SpeculatorModelConfig
Configuration for DFlash speculator with vocabulary mapping.
DFlash features vocabulary mapping between draft (64K) and target (128K) vocabularies, enabling cross-tokenizer speculation.
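To make the idea of vocabulary mapping concrete, here is a minimal sketch that builds a draft-to-target and a target-to-draft lookup from a list of target token ids kept in the draft vocabulary. It uses plain Python lists and tiny sizes. The function names, the -1 sentinel, and the index-based encoding are assumptions for illustration; the actual t2d and d2t tensors in speculators may be encoded differently.

```python
def build_vocab_maps(kept_target_ids, target_vocab_size):
    """Build (d2t, t2d) index lookups from the target ids kept in the draft vocab.

    d2t[draft_id] is the target id for a draft token.
    t2d[target_id] is the draft id, or -1 if the target token has no draft entry.
    """
    d2t = sorted(set(kept_target_ids))
    if d2t and (d2t[0] < 0 or d2t[-1] >= target_vocab_size):
        raise ValueError("kept ids must lie inside the target vocabulary")
    t2d = [-1] * target_vocab_size
    for draft_id, target_id in enumerate(d2t):
        t2d[target_id] = draft_id
    return d2t, t2d


def draft_to_target(draft_ids, d2t):
    """Translate draft-vocabulary token ids into target-vocabulary ids."""
    return [d2t[i] for i in draft_ids]


# Toy example: a target vocabulary of 8 tokens, of which 4 are kept for drafting.
d2t, t2d = build_vocab_maps(kept_target_ids=[5, 0, 2, 7], target_vocab_size=8)
proposed = draft_to_target([3, 1, 0], d2t)
```

In this toy setup the draft side predicts over four ids, and its proposals are translated back into target ids before they are compared with the verifier's tokens.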
Parameters:
- transformer_layer_config – Configuration for the transformer decoder layer
- draft_vocab_size – Size of draft model vocabulary for speculation
Methods:
- serialize_transformer_config – Serialize transformer config to dict.
- validate_transformer_config – Validate and convert transformer config.
Attributes:
- target_vocab_size (int) – Get target vocabulary size from transformer config.
Source code in speculators/config.py
target_vocab_size property
Get target vocabulary size from transformer config.
serialize_transformer_config
Serialize transformer config to dict.
validate_transformer_config classmethod
Validate and convert transformer config.