Model Runner V2: A Modular and Faster Core for vLLM

Mar 24, 2026 · 8 min read

How Model Runner V2 reworks vLLM's execution core with modular model logic, GPU-native input preparation, stable persistent batching, async-first scheduling, and no API changes.