Key Models

The following models are among the most commonly used with LLM Compressor: DeepSeek V4, Qwen3.5, Qwen3.6, Kimi-K2.6, Gemma 4, Llama 4, and Mistral Large 3. Each model page provides quantization examples with tested configurations and recommended parameters.

  • DeepSeek V4


    DeepSeek V4 with HCA, CSA, and mHC, quantized to FP8 + NVFP4.


  • Qwen3.5


    Qwen3.5 vision-language and sparse MoE models.


  • Qwen3.6


    Qwen3.6-35B-A3B sparse MoE model.


  • Kimi-K2.6


    Moonshot AI's latest multimodal agentic model.


  • Gemma 4


    Google's latest multimodal model.


  • Llama 4


    Meta's Llama 4 Scout multimodal model.


  • Mistral Large 3


    Mistral's 675B parameter model.
