LoRA: Low-Rank Adaptation of Large Language Models
- URL: http://arxiv.org/abs/2106.09685v1
- Date: Thu, 17 Jun 2021 17:37:18 GMT
- Title: LoRA: Low-Rank Adaptation of Large Language Models
- Authors: Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi
Li, Shean Wang, Weizhu Chen
- Abstract summary: Low-Rank Adaptation, or LoRA, freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture.
For GPT-3, LoRA can reduce the number of trainable parameters by 10,000 times and the computation hardware requirement by 3 times compared to full fine-tuning.
- Score: 71.75808607987281
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The dominant paradigm of natural language processing consists of large-scale
pre-training on general domain data and adaptation to particular tasks or
domains. As we pre-train larger models, conventional fine-tuning, which
retrains all model parameters, becomes less feasible. Using GPT-3 175B as an
example, deploying many independent instances of fine-tuned models, each with
175B parameters, is extremely expensive. We propose Low-Rank Adaptation, or
LoRA, which freezes the pre-trained model weights and injects trainable rank
decomposition matrices into each layer of the Transformer architecture, greatly
reducing the number of trainable parameters for downstream tasks. For GPT-3,
LoRA can reduce the number of trainable parameters by 10,000 times and the
computation hardware requirement by 3 times compared to full fine-tuning. LoRA
performs on-par or better than fine-tuning in model quality on both GPT-3 and
GPT-2, despite having fewer trainable parameters, a higher training throughput,
and no additional inference latency. We also provide an empirical investigation
into rank-deficiency in language model adaptations, which sheds light on the
efficacy of LoRA. We release our implementation in GPT-2 at
https://github.com/microsoft/LoRA .
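The mechanism the abstract describes, a frozen pre-trained weight plus a trainable low-rank update, fits in a few lines. Below is a minimal PyTorch sketch of a LoRA linear layer, not the released microsoft/LoRA package; the rank r and scaling alpha are illustrative defaults.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """h = W x + (alpha/r) * B A x, with W frozen and only A, B trained."""

    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        # Frozen pre-trained weight (in practice, loaded from the base model).
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02,
                                   requires_grad=False)
        # Trainable rank decomposition: A is Gaussian, B starts at zero,
        # so training begins exactly at the pre-trained model.
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        base = x @ self.weight.T
        update = (x @ self.lora_A.T) @ self.lora_B.T
        return base + self.scaling * update
```

Because B A has the same shape as W, it can be merged into the frozen weight after training (W becomes W + (alpha/r) B A), which is why the abstract can claim no additional inference latency.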
Related papers
- LoRA vs Full Fine-tuning: An Illusion of Equivalence [76.11938177294178]
We study how different fine-tuning methods change pre-trained models by analyzing the model's weight matrices through the lens of their spectral properties.
We find that full fine-tuning and LoRA yield weight matrices whose singular value decompositions exhibit very different structure.
We conclude by examining why intruder dimensions appear in LoRA fine-tuned models, why they are undesirable, and how their effects can be minimized.
arXiv Detail & Related papers (2024-10-28T17:14:01Z)
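The spectral comparison in the entry above can be sketched directly: take the SVD of the pre-trained and fine-tuned weight matrices and flag fine-tuned singular vectors that align with none of the pre-trained ones. A hedged PyTorch sketch; the top-k cutoff and similarity threshold are illustrative assumptions, not the paper's settings.

```python
import torch

def intruder_dimensions(W_pre, W_ft, k=10, threshold=0.5):
    """Indices of top-k fine-tuned singular vectors matching no pre-trained one."""
    U_pre, _, _ = torch.linalg.svd(W_pre, full_matrices=False)
    U_ft, _, _ = torch.linalg.svd(W_ft, full_matrices=False)
    # Cosine similarity of each top-k fine-tuned left singular vector
    # against every pre-trained left singular vector (all are unit norm).
    sims = (U_ft[:, :k].T @ U_pre).abs()
    best_match = sims.max(dim=1).values
    return torch.nonzero(best_match < threshold).flatten()

W_pre = torch.randn(768, 768)
W_ft = W_pre + 0.1 * torch.randn(768, 8) @ torch.randn(8, 768)  # LoRA-like update
print(intruder_dimensions(W_pre, W_ft))
```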
- LoRTA: Low Rank Tensor Adaptation of Large Language Models [70.32218116940393]
Low Rank Adaptation (LoRA) is a popular Parameter-Efficient Fine-Tuning (PEFT) method that effectively adapts large pre-trained models for downstream tasks.
We propose a novel approach that employs a low rank tensor parametrization for model updates.
Our method is both efficient and effective for fine-tuning large language models, achieving a substantial reduction in the number of parameters while maintaining comparable performance.
arXiv Detail & Related papers (2024-10-05T06:59:50Z)
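The one-sentence summary leaves the parametrization open. Below is a hedged sketch of one natural reading: the per-layer updates are slices of a single CP-factorized tensor, so the factor matrices A and B are shared across layers and only a small mixing vector is layer-specific. This illustrates the idea of a low-rank tensor parametrization, not the LoRTA authors' exact construction.

```python
import torch
import torch.nn as nn

class TensorLoRAUpdates(nn.Module):
    def __init__(self, num_layers, d_out, d_in, r=8):
        super().__init__()
        self.A = nn.Parameter(torch.randn(d_out, r) * 0.01)  # shared across layers
        self.B = nn.Parameter(torch.randn(d_in, r) * 0.01)   # shared across layers
        self.C = nn.Parameter(torch.zeros(num_layers, r))    # per-layer mixing

    def delta(self, layer):
        # Rank-r update for one layer: A @ diag(C[layer]) @ B^T. Parameters
        # scale as r*(d_out + d_in) + num_layers*r, versus
        # num_layers*r*(d_out + d_in) for one LoRA pair per layer.
        return self.A @ torch.diag(self.C[layer]) @ self.B.T
```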
- LoRA$^2$: Multi-Scale Low-Rank Approximations for Fine-Tuning Large Language Models [3.7049613588433497]
Low-Rank Adaptation (LoRA) significantly reduces the number of trainable parameters for fine-tuning.
We extend LoRA to multiple scales, dubbed LoRA$^2$.
arXiv Detail & Related papers (2024-08-13T12:31:30Z)
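The summary above gives only the name of the idea. A hedged sketch of one multi-scale reading follows, in which the weight update is a sum of low-rank branches at different ranks (scales); this is an illustration, not the LoRA$^2$ paper's exact construction.

```python
import torch
import torch.nn as nn

class MultiScaleLoRA(nn.Module):
    def __init__(self, d_out, d_in, ranks=(2, 8)):
        super().__init__()
        # One (A, B) pair per scale; B starts at zero so the update starts at zero.
        self.As = nn.ParameterList([nn.Parameter(torch.randn(r, d_in) * 0.01) for r in ranks])
        self.Bs = nn.ParameterList([nn.Parameter(torch.zeros(d_out, r)) for r in ranks])

    def delta(self):
        # Sum the per-scale low-rank updates B_i @ A_i.
        return sum(B @ A for A, B in zip(self.As, self.Bs))
```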
- BA-LoRA: Bias-Alleviating Low-Rank Adaptation to Mitigate Catastrophic Inheritance in Large Language Models [13.660511750245245]
This work introduces Bias-Alleviating Low-Rank Adaptation (BA-LoRA), a novel PEFT method designed to counteract bias inheritance.
BA-LoRA incorporates three distinct regularization terms: (1) a consistency regularizer, (2) a diversity regularizer, and (3) a singular value decomposition regularizer.
The results demonstrate that BA-LoRA outperforms LoRA and its state-of-the-art variants.
arXiv Detail & Related papers (2024-08-08T16:13:26Z)
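A hedged skeleton of the objective this entry describes: the task loss plus three weighted regularizers. The concrete definitions below are placeholders chosen to match the regularizers' names; see the BA-LoRA paper for the actual formulations and weights.

```python
import torch

def ba_lora_loss(task_loss, logits, logits_base, W_update,
                 lam1=0.1, lam2=0.1, lam3=0.1):
    # (1) Consistency: keep outputs close to the pre-trained model's.
    consistency = torch.nn.functional.kl_div(
        logits.log_softmax(-1), logits_base.softmax(-1), reduction="batchmean")
    # (2) Diversity: penalize collapsed, low-entropy output distributions
    # (this term is the negative entropy, so minimizing it raises entropy).
    probs = logits.softmax(-1)
    diversity = (probs * probs.clamp_min(1e-9).log()).sum(-1).mean()
    # (3) SVD regularizer: concentrate the update's spectrum on its
    # leading directions by penalizing the tail singular values.
    s = torch.linalg.svdvals(W_update)
    svd_reg = s[1:].sum() / s.sum().clamp_min(1e-9)
    return task_loss + lam1 * consistency + lam2 * diversity + lam3 * svd_reg
```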
- MELoRA: Mini-Ensemble Low-Rank Adapters for Parameter-Efficient Fine-Tuning [71.50432879573614]
Low-rank adaptation (LoRA) is based on the idea that the adaptation process is intrinsically low-dimensional.
We present MELoRA, a mini-ensemble of low-rank adapters that uses fewer trainable parameters while maintaining a higher rank.
Our experimental results show that, compared to LoRA, MELoRA achieves better performance with 8 times fewer trainable parameters on natural language understanding tasks and 36 times fewer trainable parameters on instruction following tasks.
arXiv Detail & Related papers (2024-02-27T07:14:12Z)
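The mini-ensemble construction can be sketched as a block-diagonal arrangement: split the features into n groups, give each its own tiny rank-r adapter, and concatenate the outputs. The equivalent update matrix then has rank up to n*r while training far fewer parameters than one rank-(n*r) LoRA. Shapes and defaults below are illustrative.

```python
import torch
import torch.nn as nn

class MiniEnsembleLoRA(nn.Module):
    def __init__(self, d_out, d_in, n=8, r=1):
        super().__init__()
        assert d_in % n == 0 and d_out % n == 0
        self.di, self.do = d_in // n, d_out // n
        self.As = nn.ParameterList([nn.Parameter(torch.randn(r, self.di) * 0.01)
                                    for _ in range(n)])
        self.Bs = nn.ParameterList([nn.Parameter(torch.zeros(self.do, r))
                                    for _ in range(n)])

    def forward(self, x):
        # Each mini adapter sees only its own slice of the features, so the
        # equivalent full update matrix is block-diagonal.
        chunks = x.split(self.di, dim=-1)
        outs = [(c @ A.T) @ B.T for c, A, B in zip(chunks, self.As, self.Bs)]
        return torch.cat(outs, dim=-1)
```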
- Low-rank Adaptation Method for Wav2vec2-based Fake Audio Detection [57.537583869961885]
Self-supervised speech models are a rapidly developing research topic in fake audio detection.
We apply low-rank adaptation (LoRA) to the wav2vec2 model, freezing the pre-trained model weights and injecting a trainable rank-decomposition matrix into each layer of the transformer architecture.
Compared with fine-tuning all 317M parameters of the wav2vec2 model with Adam, LoRA achieved similar performance while reducing the number of trainable parameters by a factor of 198.
arXiv Detail & Related papers (2023-06-09T01:43:41Z)
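A sketch of how this setup can be reproduced with the Hugging Face peft library (not the paper's own code); the checkpoint name and target modules are assumptions based on the standard transformers wav2vec2 implementation.

```python
from transformers import Wav2Vec2Model
from peft import LoraConfig, get_peft_model

# wav2vec2-large is the roughly 317M-parameter variant the entry refers to.
base = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-large")

config = LoraConfig(
    r=8,                                  # rank of the injected decomposition
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # attention projections, as in the LoRA paper
    lora_dropout=0.05,
)
model = get_peft_model(base, config)      # freezes the base weights
model.print_trainable_parameters()        # a small fraction of the full model
```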
- AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning [143.23123791557245]
Fine-tuning large pre-trained language models on downstream tasks has become an important paradigm in NLP.
We propose AdaLoRA, which adaptively allocates the parameter budget among weight matrices according to their importance score.
We conduct extensive experiments with several pre-trained models on natural language processing, question answering, and natural language generation to validate the effectiveness of AdaLoRA.
arXiv Detail & Related papers (2023-03-18T22:36:25Z)
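The allocation idea can be sketched at the level of the mechanism: score every adapter direction with a sensitivity proxy after a backward pass, then prune the lowest-scoring directions so the global rank budget concentrates on the most important weight matrices. The scoring rule below is a simplification of AdaLoRA's smoothed sensitivity scores, kept only as an illustration.

```python
import torch

def allocate_budget(singular_params, budget):
    """Keep the `budget` most important adapter directions, zero the rest.

    singular_params: 1-D tensors of per-direction magnitudes, one tensor per
    adapted weight matrix, with .grad populated by a backward pass.
    """
    # Sensitivity proxy: |value * gradient| per direction.
    scores = torch.cat([(p * p.grad).abs() for p in singular_params])
    threshold = scores.topk(min(budget, scores.numel())).values.min()
    with torch.no_grad():
        for p in singular_params:
            mask = ((p * p.grad).abs() >= threshold).float()
            p.mul_(mask)  # pruned directions contribute nothing until regrown
```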