
v0.2.12

@aarnphm aarnphm released this 02 Aug 03:12

News

OpenLLM now releases a base container containing all compiled kernels, removing the need to build kernels with openllm build when using vLLM or auto-gptq

vLLM support (experimental)

Currently, only OPT and Llama 2 support vLLM. Simply set OPENLLM_LLAMA_FRAMEWORK=vllm to start openllm runners with vLLM.
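As a sketch, starting a Llama runner with the vLLM backend could look like the following; only the OPENLLM_LLAMA_FRAMEWORK variable comes from these notes, while the model name and --model-id value are illustrative assumptions:

```shell
# Select the vLLM backend for Llama runners via the environment
# variable named in these release notes, then start the server.
# The model name and --model-id value below are illustrative.
OPENLLM_LLAMA_FRAMEWORK=vllm openllm start llama --model-id meta-llama/Llama-2-7b-hf
```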

Installation

pip install openllm==0.2.12

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.2.12

Usage

All available models: openllm models

To start an LLM: python -m openllm start opt

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it --entrypoint=/bin/bash -P ghcr.io/bentoml/openllm:0.2.12 openllm --help
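Once a server is running (from the host or inside the container), it can be queried over HTTP. The sketch below assumes the server listens on localhost:3000 and exposes a /v1/generate endpoint; adjust the host, port, and endpoint to your deployment:

```shell
# Sketch: query a running OpenLLM server over HTTP.
# The host, port, and endpoint path are assumptions, not taken
# from these release notes; adjust them to your deployment.
curl -s -X POST http://localhost:3000/v1/generate \
  -H 'Content-Type: application/json' \
  -d '{"prompt": "What is a large language model?"}'
```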

Find more information about this release in CHANGELOG.md

New Contributors

Full Changelog: v0.2.11...v0.2.12