DiffusionGemma: The First Diffusion LLM (dLLM) Natively Supported in vLLM
DiffusionGemma is the first diffusion language model (dLLM) supported in vLLM. We integrated it using model runner v2's ModelState abstraction and reused vLLM's existing speculative decoding support. The integration demonstrates the flexibility of model runner v2 and shows how future dLLMs may be supported.