Feature request
This request proposes to integrate MoRe, a PEFT method that combines hardware-efficient, block-diagonal structured matrices (BMM) with low-rankness. The ICML paper, "MoRe Fine-Tuning with 10x Fewer Parameters", of which I'm the first author, can be found at https://arxiv.org/abs/2408.17383.
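To give a rough sense of the structure, here is a minimal PyTorch sketch of the general idea: a block-diagonal, low-rank update applied with torch.bmm. It is illustrative only, not our actual implementation or the paper's exact parameterization; the class name BlockDiagonalAdapter and the hyperparameters n_blocks and block_rank are placeholders.

```python
# Illustrative sketch only, not the paper's exact parameterization: a linear
# layer whose weight update is block-diagonal and low-rank, applied with
# torch.bmm so each block is a small dense matmul. The class name and the
# hyperparameters n_blocks / block_rank are placeholders.
import torch
import torch.nn as nn


class BlockDiagonalAdapter(nn.Module):
    def __init__(self, base_layer: nn.Linear, n_blocks: int = 4, block_rank: int = 4):
        super().__init__()
        self.base_layer = base_layer
        in_features, out_features = base_layer.in_features, base_layer.out_features
        assert in_features % n_blocks == 0 and out_features % n_blocks == 0
        self.n_blocks = n_blocks
        self.in_block = in_features // n_blocks
        self.out_block = out_features // n_blocks
        # One (in_block x block_rank) and one (block_rank x out_block) factor per
        # block; together they form a block-diagonal, low-rank weight update.
        self.block_down = nn.Parameter(torch.zeros(n_blocks, self.in_block, block_rank))
        self.block_up = nn.Parameter(torch.zeros(n_blocks, block_rank, self.out_block))
        # Random down-projection, zero up-projection: the adapter is a no-op at init.
        nn.init.normal_(self.block_down, std=0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.base_layer(x)
        batch_shape = x.shape[:-1]
        # (..., in_features) -> (n_blocks, tokens, in_block) so both factors can
        # be applied with batched matrix multiplication.
        x_blocks = x.reshape(-1, self.n_blocks, self.in_block).transpose(0, 1)
        delta = torch.bmm(torch.bmm(x_blocks, self.block_down), self.block_up)
        delta = delta.transpose(0, 1).reshape(*batch_shape, -1)
        return out + delta
```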
Motivation
PEFT already integrates a similar method, BOFT. In our paper we analyze in detail why BOFT is a degenerate (inefficient) case of MoRe. Our method theoretically subsumes BOFT, achieves much higher performance on a range of reasoning tasks with far fewer parameters, uses less than half of BOFT's memory, and fine-tunes faster than LoRA. Llama 7B adapted with our method achieves a higher score on commonsense reasoning than Llama 13B adapted with LoRA, while using only 10% of LoRA's parameters.
Your contribution
We have implemented a helper function that adapts all modules specified in a config dictionary. Our config format is also quite similar to yours; I'll work on making it consistent with your codebase and submit a pull request. A rough sketch of what such a helper could look like follows.
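This is only a sketch for discussion, not our actual code: the function name adapt_modules and the config keys (target_modules, n_blocks, block_rank) are placeholders, and it reuses the BlockDiagonalAdapter class from the sketch above.

```python
# Placeholder sketch of a helper that wraps every nn.Linear whose qualified
# name matches a configured target substring. Assumes the BlockDiagonalAdapter
# class from the earlier sketch; names and config keys are placeholders.
import torch.nn as nn


def adapt_modules(model: nn.Module, adapter_config: dict) -> nn.Module:
    targets = adapter_config["target_modules"]      # e.g. ["q_proj", "v_proj"]
    n_blocks = adapter_config.get("n_blocks", 4)
    block_rank = adapter_config.get("block_rank", 4)

    # Snapshot the module tree before mutating it, then swap matching linears
    # for adapter-wrapped versions in place.
    for parent_name, parent in list(model.named_modules()):
        for child_name, child in list(parent.named_children()):
            full_name = f"{parent_name}.{child_name}" if parent_name else child_name
            if isinstance(child, nn.Linear) and any(t in full_name for t in targets):
                setattr(parent, child_name,
                        BlockDiagonalAdapter(child, n_blocks=n_blocks, block_rank=block_rank))
    return model
```

It could then be called as, e.g., adapt_modules(model, {"target_modules": ["q_proj", "v_proj"], "n_blocks": 4, "block_rank": 4}), with the final config format aligned with PEFT's existing config classes in the PR.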
Thanks for opening this issue and presenting the MoRe method. From a quick glance, this looks like it would indeed be a good addition to PEFT. Let me know if you have questions on how to integrate it. Don't hesitate to create an early draft PR to receive quick feedback.