koboldcpp-1.50.1
- Improved automatic GPU layer selection: In the GUI launcher with CuBLAS, it will now automatically select all layers to do a full GPU offload if it thinks you have enough VRAM to support it.
- Added a short delay to the Abort function in Lite, hopefully fixes the glitches with retry and abort.
- Fixed automatic RoPE values for Yi and Deepseek. If no `--ropeconfig` is set, the preconfigured rope values in the model now take priority over the automatic context rope scale.
- The above fix should also allow YaRN RoPE scaled models to work correctly by default, assuming the model has been correctly converted. Note: customized YaRN configuration flags are not yet available.
- The OpenAI compatible `/v1/completions` endpoint has been enhanced, adding extra unofficial parameters that Aphrodite uses, such as Min-P, Top-A and Mirostat. However, OpenAI does not support separate memory fields or sampler order, so the Kobold API will still give better results there.
- SSE streaming support has been added for the OpenAI `/v1/completions` endpoint (tested working in SillyTavern).
- Custom DALL-E endpoints are now supported, for use with OAI proxies.
- Pulled fixes and improvements from upstream, updated Kobold Lite.
To use, download and run the koboldcpp.exe, which is a one-file pyinstaller.
If you don't need CUDA, you can use koboldcpp_nocuda.exe which is much smaller.
If you're using AMD, you can try koboldcpp_rocm at YellowRoseCx's fork here
Hotfix 1.50.1:
- Fixed a regression with older RWKV/GPT-2/GPT-J/GPT-NeoX models that caused a segfault.
- If ropeconfig is not set, apply auto linear rope scaling multiplier for rope-tuned models such as Yi when used outside their original context limit.
- Fixed another bug in Lite with the retry/abort button.
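The auto linear rope scaling mentioned above can be sketched as a simple frequency-scale rule: within the model's trained context no scaling is applied, and beyond it the rope frequency scale shrinks proportionally so positions still map into the trained range. The function below is an illustrative sketch of that rule, not koboldcpp's actual implementation:

```python
def auto_linear_rope_scale(trained_ctx: int, requested_ctx: int) -> float:
    """Sketch of linear RoPE scaling (illustrative only).

    Returns a frequency scale factor: 1.0 inside the trained context,
    trained/requested beyond it, compressing rotary frequencies so the
    extended context fits the trained positional range.
    """
    if requested_ctx <= trained_ctx:
        return 1.0
    return trained_ctx / requested_ctx

# e.g. a model trained at 4096 context, launched with 8192 context,
# would get a frequency scale of 0.5 under this rule.
```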
Run it from the command line with the desired launch parameters (see `--help`), or manually select the model in the GUI.
Once loaded, you can connect at http://localhost:5001 (or use the full KoboldAI client).
For more information, be sure to run the program from the command line with the `--help` flag.
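For connecting programmatically, the native Kobold API exposes a generate endpoint at the same address. The sketch below assumes the conventional `/api/v1/generate` path and response shape; the sampler defaults are illustrative, not definitive:

```python
import json
import urllib.request

def build_generate_request(prompt: str, max_length: int = 80):
    """Build the path and JSON body for a native Kobold API call.
    `max_length` is an illustrative default; the path is the
    conventional KoboldAI endpoint and an assumption here."""
    return "/api/v1/generate", {"prompt": prompt, "max_length": max_length}

def generate(prompt: str, base_url: str = "http://localhost:5001") -> str:
    """Send the request to a running koboldcpp instance and return the
    generated text (the response carries a list of {"text": ...} results)."""
    path, body = build_generate_request(prompt)
    req = urllib.request.Request(
        base_url + path,
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["results"][0]["text"]
```

Calling `generate("Hello,")` requires a koboldcpp instance already running on port 5001.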