Featured: vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
GitHub Stats
Stars: 75.8k
Forks: 15.3k
Watchers: 0
Open Issues: 0
Details
Language: Python
License: Apache-2.0
Deployment: both
Status: Active
Repository: vllm-project/vllm
Last push: 4/9/2026