Related papers: SBoRA: Low-Rank Adaptation with Regional Weight Updates

SBoRA: Low-Rank Adaptation with Regional Weight Updates

URL: http://arxiv.org/abs/2407.05413v3
Date: Wed, 9 Oct 2024 07:53:10 GMT
Title: SBoRA: Low-Rank Adaptation with Regional Weight Updates
Authors: Lai-Man Po, Yuyang Liu, Haoxuan Wu, Tianqi Zhang, Wing-Yin Yu, Zhuohan Wang, Zeyu Jiang, Kun Li,
Abstract summary: This paper introduces Standard Basis LoRA (SBoRA), a novel parameter-efficient fine-tuning approach for Large Language Models. SBoRA reduces the number of trainable parameters by half or doubles the rank with the similar number of trainable parameters as LoRA. Our results demonstrate the superiority of SBoRA-FA over LoRA in various fine-tuning tasks, including commonsense reasoning and arithmetic reasoning.
Score: 19.15481369459963
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper introduces Standard Basis LoRA (SBoRA), a novel parameter-efficient fine-tuning approach for Large Language Models that builds upon the pioneering works of Low-Rank Adaptation (LoRA) and Orthogonal Adaptation. SBoRA reduces the number of trainable parameters by half or doubles the rank with the similar number of trainable parameters as LoRA, while improving learning performance. By utilizing orthogonal standard basis vectors to initialize one of the low-rank matrices (either $\mathbf{A}$ or $\mathbf{B}$), SBoRA facilitates regional weight updates and memory-efficient fine-tuning. This results in two variants, SBoRA-FA and SBoRA-FB, where only one of the matrices is updated, leading to a sparse update matrix $\mathrm{\Delta} \mathbf{W}$ with predominantly zero rows or columns. Consequently, most of the fine-tuned model's weights $(\mathbf{W}_0+\mathrm{\Delta} \mathbf{W})$ remain unchanged from the pre-trained weights, akin to the modular organization of the human brain, which efficiently adapts to new tasks. Our empirical results demonstrate the superiority of SBoRA-FA over LoRA in various fine-tuning tasks, including commonsense reasoning and arithmetic reasoning. Furthermore, we evaluate the effectiveness of QSBoRA on quantized LLaMA models of varying scales, highlighting its potential for efficient adaptation to new tasks. Code is available at https://github.com/cityuhkai/SBoRA

Related papers

RaSA: Rank-Sharing Low-Rank Adaptation [67.40422142257091]
Low-rank adaptation (LoRA) has been prominently employed for parameter-efficient fine-tuning of large language models (LLMs) We introduce Rank-Sharing Low-Rank Adaptation (RaSA), an innovative extension that enhances the expressive capacity of LoRA by leveraging partial rank sharing across layers. Our theoretically grounded and empirically validated approach demonstrates that RaSA not only maintains the core advantages of LoRA but also significantly boosts performance in challenging code and math tasks.
arXiv Detail & Related papers (2025-03-16T17:16:36Z)
LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization [78.93425154518705]
Low-rank adaption (LoRA) is a widely used parameter-efficient finetuning method for LLM that reduces memory requirements. This paper introduces LoRA-RITE, a novel adaptive matrix preconditioning method for LoRA optimization.
arXiv Detail & Related papers (2024-10-27T22:57:12Z)
CoRA: Optimizing Low-Rank Adaptation with Common Subspace of Large Language Models [7.108651381160281]
Low-Rank Adaptation (LoRA) strategy balances efficiency and performance in fine-tuning large models. We propose textbfCoRA: leveraging shared knowledge to optimize LoRA training by substituting its matrix $B$ with a common subspace from large models. Our experiments show that the first approach achieves the same efficacy as the original LoRA fine-tuning while being more efficient than halving parameters.
arXiv Detail & Related papers (2024-08-31T12:48:27Z)
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning [105.11844150736536]
Low-rank adaptation is a popular parameter-efficient fine-tuning method for large language models. We propose a new method called MoRA, which employs a square matrix to achieve high-rank updating while maintaining the same number of trainable parameters. Our method outperforms LoRA on memory-intensive tasks and achieves comparable performance on other tasks.
arXiv Detail & Related papers (2024-05-20T15:48:32Z)
ResLoRA: Identity Residual Mapping in Low-Rank Adaption [96.59370314485074]
We propose ResLoRA, an improved framework of low-rank adaptation (LoRA) Our method can achieve better results in fewer training steps without any extra trainable parameters or inference cost compared to LoRA. The experiments on NLG, NLU, and text-to-image tasks demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2024-02-28T04:33:20Z)
DoRA: Weight-Decomposed Low-Rank Adaptation [57.68678247436207]
We introduce a novel weight decomposition analysis to investigate the inherent differences between FT and LoRA. Aiming to resemble the learning capacity of FT from the findings, we propose Weight-Decomposed Low-Rank Adaptation (DoRA) DoRA decomposes the pre-trained weight into two components, magnitude and direction, for fine-tuning.
arXiv Detail & Related papers (2024-02-14T17:59:34Z)
Riemannian Preconditioned LoRA for Fine-Tuning Foundation Models [45.72323731094864]
Low-Rank Adaptation (LoRA) emerges as a popular parameter-efficient fine-tuning (PEFT) method. In this work, we study the enhancement of LoRA training by introducing an $r times r$ preconditioner in each gradient step.
arXiv Detail & Related papers (2024-02-04T05:05:43Z)
Chain of LoRA: Efficient Fine-tuning of Language Models via Residual Learning [31.036465632204663]
We introduce Chain of LoRA, an iterative optimization framework inspired by the Frank-Wolfe algorithm. We demonstrate that COLA can consistently outperform LoRA without additional computational or memory costs.
arXiv Detail & Related papers (2024-01-08T14:26:49Z)
NOLA: Compressing LoRA using Linear Combination of Random Basis [22.76088132446952]
We introduce NOLA, which overcomes the rank one lower bound present in LoRA. NOLA performs as well as LoRA models with much fewer number of parameters compared to LoRA with rank one, the best compression LoRA can archive.
arXiv Detail & Related papers (2023-10-04T03:30:24Z)
Delta-LoRA: Fine-Tuning High-Rank Parameters with the Delta of Low-Rank Matrices [27.693028578653394]
Delta-LoRA is a novel parameter-efficient approach to fine-tune large language models (LLMs) In contrast to LoRA and other low-rank adaptation methods such as AdaLoRA, Delta-LoRA not only updates the low-rank matrices $bA$ and $bB$, but also propagate the learning to the pre-trained weights $bW$.
arXiv Detail & Related papers (2023-09-05T17:40:34Z)
AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning [143.23123791557245]
Fine-tuning large pre-trained language models on downstream tasks has become an important paradigm in NLP. We propose AdaLoRA, which adaptively allocates the parameter budget among weight matrices according to their importance score. We conduct extensive experiments with several pre-trained models on natural language processing, question answering, and natural language generation to validate the effectiveness of AdaLoRA.
arXiv Detail & Related papers (2023-03-18T22:36:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.