
DiffusionGemma: The First Diffusion LLM (dLLM) Natively Supported in vLLM
How vLLM supports DiffusionGemma, its first natively supported diffusion language model, using Model Runner V2 state hooks, iterative denoising, bidirectional attention, and reused speculative decoding paths.