v0.2.13

aarnphm released this 03 Aug 06:27


8428692

What's changed?

Fixes the auto-gptq CUDA kernel within the base container.
Adds support for all vLLM models. Updates vLLM to the latest stable commit.

Full Changelog: v0.2.12...v0.2.13
