v0.2.12
News
OpenLLM now releases a base container containing all compiled kernels, removing the need to build kernels with openllm build
when using vLLM or auto-gptq.
vLLM support (experimental)
Currently, only OPT and Llama 2 support vLLM. Set OPENLLM_LLAMA_FRAMEWORK=vllm
to start openllm runners with vLLM.
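For example, to serve Llama 2 on the vLLM backend (a minimal sketch; the model ID below is illustrative, substitute any supported Llama 2 checkpoint):

OPENLLM_LLAMA_FRAMEWORK=vllm openllm start llama --model-id meta-llama/Llama-2-7b-hf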
Installation
pip install openllm==0.2.12
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.2.12
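To verify which version is installed:

pip show openllm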
Usage
To list all available models: openllm models
To start an LLM: python -m openllm start opt (see the query example after this list)
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it --entrypoint=/bin/bash -P ghcr.io/bentoml/openllm:0.2.12 openllm --help
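Once a server is running (it listens on port 3000 by default), you can send a test prompt from another terminal. A minimal sketch, assuming the openllm query command and its --endpoint option behave as in other 0.2.x releases; adjust the endpoint and prompt to your setup:

openllm query --endpoint http://localhost:3000 "Explain what vLLM is in one sentence."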
Find more information about this release in CHANGELOG.md.
New Contributors
- @RichardScottOZ made their first contribution in #155
Full Changelog: v0.2.11...v0.2.12