AI Everyday
AI Everyday #23 - Hands on & discussion on vLLM - high speed inference engine
Jan 30, 2024
Season 1
Episode 23
Matthew Wallace
Hands-on session and discussion of vLLM, a high-performance inference engine that supports continuous batching and PagedAttention.