ResLoRA: Identity Residual Mapping in Low-Rank Adaption
- URL: http://arxiv.org/abs/2402.18039v1
- Date: Wed, 28 Feb 2024 04:33:20 GMT
- Title: ResLoRA: Identity Residual Mapping in Low-Rank Adaption
- Authors: Shuhua Shi, Shaohan Huang, Minghui Song, Zhoujun Li, Zihan Zhang,
Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang
- Abstract summary: We propose ResLoRA, an improved framework of low-rank adaptation (LoRA)
Our method can achieve better results in fewer training steps without any extra trainable parameters or inference cost compared to LoRA.
The experiments on NLG, NLU, and text-to-image tasks demonstrate the effectiveness of our method.
- Score: 96.59370314485074
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As one of the most popular parameter-efficient fine-tuning (PEFT) methods,
low-rank adaptation (LoRA) is commonly applied to fine-tune large language
models (LLMs). However, updating the weights of LoRA blocks effectively and
expeditiously is challenging due to the long calculation path in the original
model. To address this, we propose ResLoRA, an improved framework of LoRA. By
adding residual paths during training and using merging approaches to eliminate
these extra paths during inference, our method can achieve better results in
fewer training steps without any extra trainable parameters or inference cost
compared to LoRA. The experiments on NLG, NLU, and text-to-image tasks
demonstrate the effectiveness of our method. To the best of our knowledge,
ResLoRA is the first work that combines the residual path with LoRA. The code
of our method is available at
https://github.com/microsoft/LMOps/tree/main/reslora .
Related papers
- MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning [105.11844150736536]
Low-rank adaptation is a popular parameter-efficient fine-tuning method for large language models.
We propose a new method called MoRA, which employs a square matrix to achieve high-rank updating while maintaining the same number of trainable parameters.
Our method outperforms LoRA on memory-intensive tasks and achieves comparable performance on other tasks.
arXiv Detail & Related papers (2024-05-20T15:48:32Z) - PeriodicLoRA: Breaking the Low-Rank Bottleneck in LoRA Optimization [39.30090456724925]
Supervised fine-tuning is the most common method to adapt large language models (LLMs) to downstream tasks.
Full fine-tuning requires massive computational resources.
LoRA is one of the most widely used methods, which assumes that the optimization process is essentially low-dimensional.
arXiv Detail & Related papers (2024-02-25T16:43:41Z) - DoRA: Weight-Decomposed Low-Rank Adaptation [57.68678247436207]
We introduce a novel weight decomposition analysis to investigate the inherent differences between FT and LoRA.
Aiming to resemble the learning capacity of FT from the findings, we propose Weight-Decomposed Low-Rank Adaptation (DoRA)
DoRA decomposes the pre-trained weight into two components, magnitude and direction, for fine-tuning.
arXiv Detail & Related papers (2024-02-14T17:59:34Z) - Chain of LoRA: Efficient Fine-tuning of Language Models via Residual
Learning [31.036465632204663]
We introduce Chain of LoRA, an iterative optimization framework inspired by the Frank-Wolfe algorithm.
We demonstrate that COLA can consistently outperform LoRA without additional computational or memory costs.
arXiv Detail & Related papers (2024-01-08T14:26:49Z) - Run LoRA Run: Faster and Lighter LoRA Implementations [50.347242693025336]
LoRA is a technique that reduces the number of trainable parameters in a neural network by introducing low-rank adapters to linear layers.
This paper presents the RunLoRA framework for efficient implementations of LoRA.
Experiments show up to 28% speedup on language modeling networks.
arXiv Detail & Related papers (2023-12-06T10:54:34Z) - Sparse Low-rank Adaptation of Pre-trained Language Models [79.74094517030035]
We introduce sparse low-rank adaptation (SoRA) that enables dynamic adjustments to the intrinsic rank during the adaptation process.
Our approach strengthens the representation power of LoRA by initializing it with a higher rank, while efficiently taming a temporarily increased number of parameters.
Our experimental results demonstrate that SoRA can outperform other baselines even with 70% retained parameters and 70% training time.
arXiv Detail & Related papers (2023-11-20T11:56:25Z) - LoRAPrune: Pruning Meets Low-Rank Parameter-Efficient Fine-Tuning [56.88751562302793]
Low-rank adaption (LoRA) has emerged to fine-tune large language models (LLMs)
LoRAPrune is a new framework that delivers an accurate structured pruned model in a highly memory-efficient manner.
LoRAPrune achieves a reduction in perplexity by 4.81 on WikiText2 and 3.46 on PTB, while also decreasing memory usage by 52.6%.
arXiv Detail & Related papers (2023-05-28T15:15:48Z) - DyLoRA: Parameter Efficient Tuning of Pre-trained Models using Dynamic
Search-Free Low-Rank Adaptation [18.922066770467914]
Low-rank adapters (LoRA) keep the main pretrained weights of the model frozen and just introduce some learnable truncated SVD modules to the model.
While LoRA blocks are parameter-efficient, they suffer from two major problems: first, the size of these blocks is fixed and cannot be modified after training.
We introduce a dynamic low-rank adaptation (DyLoRA) technique to address these two problems together.
arXiv Detail & Related papers (2022-10-14T06:29:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.