MemoChat: Tuning LLMs to Use Memos for Consistent Long-Range Open-domain Conversation
We provide `core_requirement.txt` with the core dependencies for your convenience.
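For example, one way to set up the environment (a minimal sketch, assuming Python and pip are available; the environment name is illustrative):

```bash
# Create an isolated environment and install the pinned dependencies.
python -m venv memochat-env            # environment name is illustrative
source memochat-env/bin/activate
pip install -r core_requirement.txt
```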
The initial models we used are FastChat models (v1.3). Below are the model weights of our fine-tuned versions. Since our models are built upon FastChat models, we adopt the same CC-BY-NC-SA-4.0 license.
| Name | Share Link |
|---|---|
| MemoChat-Fastchat-T5-3B | https://huggingface.co/Junrulu/MemoChat-Fastchat-T5-3B |
| MemoChat-Vicuna-7B | https://huggingface.co/Junrulu/MemoChat-Vicuna-7B |
| MemoChat-Vicuna-13B | https://huggingface.co/Junrulu/MemoChat-Vicuna-13B |
| MemoChat-Vicuna-33B | https://huggingface.co/Junrulu/MemoChat-Vicuna-33B |
`RootPath` is the absolute path of this repo. Download the initial models and put them in the `model` folder.
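One way to fetch the weights is with git-lfs (a sketch, not the repo's prescribed method; the chosen checkpoints are examples, and `RootPath` stands for your actual absolute path):

```bash
# Assumes git-lfs is installed; run from RootPath.
cd RootPath/model
git lfs install
# An initial FastChat v1.3 model, e.g. Vicuna-7B:
git clone https://huggingface.co/lmsys/vicuna-7b-v1.3
# Optionally, a fine-tuned MemoChat checkpoint from the table above:
git clone https://huggingface.co/Junrulu/MemoChat-Vicuna-7B
```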
Run `bash code/scripts/tuning.sh RootPath` for fine-tuning. Intermediate evaluations are included in this script as well.
Run `bash code/scripts/memochat.sh RootPath` for pipeline testing with the fine-tuned models.
Run `bash code/scripts/memochat_gpt.sh RootPath` for pipeline testing with the GPT-3.5 API.
Run `bash code/scripts/llm_judge.sh RootPath` for GPT-4 judging (an OpenAI API key is required). Example invocations are sketched below.
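Putting the steps together, a full run might look like this (illustrative: the clone path is a placeholder, and exporting `OPENAI_API_KEY` is an assumption; check the scripts for how they actually read credentials):

```bash
# Placeholder clone location; substitute your own absolute path.
ROOT=/home/user/MemoChat

bash code/scripts/tuning.sh $ROOT          # fine-tuning + intermediate evaluation
bash code/scripts/memochat.sh $ROOT        # pipeline testing with fine-tuned models

# GPT-based steps call the OpenAI API; a key is required.
export OPENAI_API_KEY=sk-...               # placeholder, use your own key
bash code/scripts/memochat_gpt.sh $ROOT    # pipeline testing with GPT-3.5
bash code/scripts/llm_judge.sh $ROOT       # GPT-4 judging of the predictions
```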
We provide our prediction results here.
We thank the Vicuna project for their great work.
```bibtex
@misc{lu2023memochat,
    title={MemoChat: Tuning LLMs to Use Memos for Consistent Long-Range Open-Domain Conversation},
    author={Junru Lu and Siyu An and Mingbao Lin and Gabriele Pergola and Yulan He and Di Yin and Xing Sun and Yunsheng Wu},
    year={2023},
    eprint={2308.08239},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```