PeriodicLoRA: Breaking the Low-Rank Bottleneck in LoRA Optimization
- URL: http://arxiv.org/abs/2402.16141v1
- Date: Sun, 25 Feb 2024 16:43:41 GMT
- Title: PeriodicLoRA: Breaking the Low-Rank Bottleneck in LoRA Optimization
- Authors: Xiangdi Meng, Damai Dai, Weiyao Luo, Zhe Yang, Shaoxiang Wu, Xiaochen
Wang, Peiyi Wang, Qingxiu Dong, Liang Chen, Zhifang Sui
- Abstract summary: Supervised fine-tuning is the most common method to adapt large language models (LLMs) to downstream tasks.
Full fine-tuning requires massive computational resources.
LoRA is one of the most widely used methods, which assumes that the optimization process is essentially low-dimensional.
- Score: 39.30090456724925
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Supervised fine-tuning is the most common method to adapt large language
models (LLMs) to downstream tasks, but full fine-tuning LLMs requires massive
computational resources. Recently, parameter-efficient fine-tuning (PEFT)
methods have been widely studied due to its cost-effectiveness. LoRA is one of
the most widely used methods, which assumes that the optimization process is
essentially low-dimensional. Although LoRA fine-tuning is effective, there is
still a performance gap compared to full fine-tuning, since its weight update
is limited to low-rank matrices. In order to break the low-rank bottleneck in
LoRA Optimization, we propose PeriodicLoRA (PLoRA), which accumulates low-rank
update matrices multiple times to achieve a higher update rank. PLoRA has
multiple training stages. During each stage, we still update only the LoRA
weights. However, at the end of each stage, we unload the LoRA weights into the
backbone parameters and then reinitialize the LoRA states. Experimental results
show that PLoRA has stronger learning ability, approximately 1.8 times that of
LoRA's learning ability at most, but it does not increase memory usage.
Further, we introduce a momentum-based unloading strategy for PLoRA to mitigate
the training instability.
Related papers
- LoRA Learns Less and Forgets Less [25.09261710396838]
Low-Rank Adaptation (LoRA) is a widely-used parameter-efficient finetuning method for large language models.
We compare the performance of LoRA and full finetuning on two target domains, programming and mathematics.
arXiv Detail & Related papers (2024-05-15T19:27:45Z) - ResLoRA: Identity Residual Mapping in Low-Rank Adaption [96.59370314485074]
We propose ResLoRA, an improved framework of low-rank adaptation (LoRA)
Our method can achieve better results in fewer training steps without any extra trainable parameters or inference cost compared to LoRA.
The experiments on NLG, NLU, and text-to-image tasks demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2024-02-28T04:33:20Z) - LoRA-Flow: Dynamic LoRA Fusion for Large Language Models in Generative
Tasks [72.88244322513039]
LoRA employs lightweight modules to customize large language models (LLMs) for each downstream task or domain.
We propose LoRA-Flow, which utilizes dynamic weights to adjust the impact of different LoRAs.
Experiments across six generative tasks demonstrate that our method consistently outperforms baselines with task-level fusion weights.
arXiv Detail & Related papers (2024-02-18T04:41:25Z) - DoRA: Weight-Decomposed Low-Rank Adaptation [57.68678247436207]
We introduce a novel weight decomposition analysis to investigate the inherent differences between FT and LoRA.
Aiming to resemble the learning capacity of FT from the findings, we propose Weight-Decomposed Low-Rank Adaptation (DoRA)
DoRA decomposes the pre-trained weight into two components, magnitude and direction, for fine-tuning.
arXiv Detail & Related papers (2024-02-14T17:59:34Z) - PRILoRA: Pruned and Rank-Increasing Low-Rank Adaptation [65.268245109828]
We introduce PRILoRA, which linearly allocates a different rank for each layer, in an increasing manner, and performs pruning throughout the training process.
We validate the effectiveness of PRILoRA through extensive experiments on eight GLUE benchmarks, setting a new state of the art.
arXiv Detail & Related papers (2024-01-20T20:25:17Z) - Chain of LoRA: Efficient Fine-tuning of Language Models via Residual
Learning [31.036465632204663]
We introduce Chain of LoRA, an iterative optimization framework inspired by the Frank-Wolfe algorithm.
We demonstrate that COLA can consistently outperform LoRA without additional computational or memory costs.
arXiv Detail & Related papers (2024-01-08T14:26:49Z) - MultiLoRA: Democratizing LoRA for Better Multi-Task Learning [20.750808913757396]
LoRA achieves remarkable resource efficiency and comparable performance when adapting LLMs for specific tasks.
LoRA is dominated by a small number of top singular vectors while fine-tuning decomposes into a set of less important unitary transforms.
We propose MultiLoRA for better multi-task adaptation by reducing the dominance of top singular vectors observed in LoRA.
arXiv Detail & Related papers (2023-11-20T02:59:18Z) - LoRAPrune: Pruning Meets Low-Rank Parameter-Efficient Fine-Tuning [56.88751562302793]
Low-rank adaption (LoRA) has emerged to fine-tune large language models (LLMs)
LoRAPrune is a new framework that delivers an accurate structured pruned model in a highly memory-efficient manner.
LoRAPrune achieves a reduction in perplexity by 4.81 on WikiText2 and 3.46 on PTB, while also decreasing memory usage by 52.6%.
arXiv Detail & Related papers (2023-05-28T15:15:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.