RefLoRA: Refactored Low-Rank Adaptation for Efficient Fine-Tuning of Large Models
- URL: http://arxiv.org/abs/2505.18877v1
- Date: Sat, 24 May 2025 21:33:16 GMT
- Title: RefLoRA: Refactored Low-Rank Adaptation for Efficient Fine-Tuning of Large Models
- Authors: Yilang Zhang, Bingcong Li, Georgios B. Giannakis,
- Abstract summary: Low-Rank Adaptation (LoRA) lowers the computational and memory overhead of fine-tuning large models by updating a low-dimensional subspace of the pre-trained weight matrix.<n>This article identifies the optimal low-rank factorization per step that minimizes an upper bound on the loss.<n>The resultanted low-rank adaptation (RefLoRA) method promotes a flatter loss landscape, along with consistent and balanced weight updates.
- Score: 39.656014609027494
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Low-Rank Adaptation (LoRA) lowers the computational and memory overhead of fine-tuning large models by updating a low-dimensional subspace of the pre-trained weight matrix. Albeit efficient, LoRA exhibits suboptimal convergence and noticeable performance degradation, due to inconsistent and imbalanced weight updates induced by its nonunique low-rank factorizations. To overcome these limitations, this article identifies the optimal low-rank factorization per step that minimizes an upper bound on the loss. The resultant refactored low-rank adaptation (RefLoRA) method promotes a flatter loss landscape, along with consistent and balanced weight updates, thus speeding up stable convergence. Extensive experiments evaluate RefLoRA on natural language understanding, and commonsense reasoning tasks with popular large language models including DeBERTaV3, LLaMA-7B, LLaMA2-7B and LLaMA3-8B. The numerical tests corroborate that RefLoRA converges faster, outperforms various benchmarks, and enjoys negligible computational overhead compared to state-of-the-art LoRA variants.
Related papers
- LoRA Is Slower Than You Think [0.0]
Low-Rank Adaptation (LoRA) is one of the most widely used techniques for fine-tuning large language models (LLMs)<n>By introducing a small number of trainable low-rank weight matrices, LoRA substantially reduces the number of parameters that need to be updated.<n>We observed that LoRA does not consistently provide speed improvements across all model architectures and training setups.
arXiv Detail & Related papers (2025-07-06T08:36:43Z) - Beyond Low-Rank Tuning: Model Prior-Guided Rank Allocation for Effective Transfer in Low-Data and Large-Gap Regimes [9.4848188271008]
Low-Rank Adaptation (LoRA) has proven effective in reducing computational costs while maintaining performance comparable to fully fine-tuned foundation models.<n>Current adaptive LoRA methods attempt to overcome this limitation by dynamically expanding or selectively allocating ranks.<n>We introduce Stable Rank-Guided Low-Rank Adaptation (SR-LoRA), a novel framework that utilizes the stable rank of pre-trained weight matrices as a natural prior for layer-wise rank allocation.
arXiv Detail & Related papers (2025-06-30T23:54:23Z) - SRLoRA: Subspace Recomposition in Low-Rank Adaptation via Importance-Based Fusion and Reinitialization [2.594346658179846]
Low-Rank Adaptation (LoRA) constrains updates to a fixed low-rank subspace.<n>We introduce Subspace Recomposition in Low-Rank Adaptation (SRLoRA) via importance-based fusion and reinitialization.<n> SRLoRA consistently achieves faster convergence and improved accuracy over standard LoRA.
arXiv Detail & Related papers (2025-05-18T14:12:40Z) - BeamLoRA: Beam-Constraint Low-Rank Adaptation [51.52097743781401]
Low-Rank Adaptation (LoRA) has been widely adopted as one of the most effective parameter-efficient fine-tuning methods.<n>We propose BeamLoRA, which conceptualizes each LoRA module as a beam where each rank naturally corresponds to a potential sub-solution.
arXiv Detail & Related papers (2025-02-19T10:33:22Z) - GeLoRA: Geometric Adaptive Ranks For Efficient LoRA Fine-tuning [2.7446241148152253]
Fine-tuning large language models (LLMs) is computationally intensive because it requires updating all parameters.<n>Low-Rank Adaptation (LoRA) improves efficiency by modifying only a subset of weights but introduces a trade-off between expressivity and computational cost.<n>We propose Geometric Low-Rank Adaptation (GeLoRA), a novel framework that computes the intrinsic dimensionality of hidden state representations to adaptively select LoRA ranks.
arXiv Detail & Related papers (2024-12-12T13:04:54Z) - LoRA vs Full Fine-tuning: An Illusion of Equivalence [76.11938177294178]
We study how Low-Rank Adaptation (LoRA) and full-finetuning change pre-trained models.<n>We find that LoRA and full fine-tuning yield weight matrices whose singular value decompositions exhibit very different structure.<n>We extend the finding that LoRA forgets less than full fine-tuning and find its forgetting is vastly localized to the intruder dimension.
arXiv Detail & Related papers (2024-10-28T17:14:01Z) - LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization [78.93425154518705]
Low-rank adaption (LoRA) is a widely used parameter-efficient finetuning method for LLM that reduces memory requirements.
This paper introduces LoRA-RITE, a novel adaptive matrix preconditioning method for LoRA optimization.
arXiv Detail & Related papers (2024-10-27T22:57:12Z) - Less is More: Extreme Gradient Boost Rank-1 Adaption for Efficient Finetuning of LLMs [75.11449420928139]
Fine-tuning Large Language Models (LLMs) has become a crucial technique for adapting pre-trained models to downstream tasks.
Low-Rank Adaptation (LoRA) has emerged as a promising solution, but there exists a gap between the practical performance of low-rank adaptations and its theoretical optimum.
We propose eXtreme Gradient Boosting LoRA, a novel framework that bridges this gap by leveraging the power of ensemble learning.
arXiv Detail & Related papers (2024-10-25T17:07:13Z) - Randomized Asymmetric Chain of LoRA: The First Meaningful Theoretical Framework for Low-Rank Adaptation [58.288682735160585]
Low-Rank Adaptation (LoRA) is a popular technique for finetuning models.
LoRA often under performs when compared to full- parameter fine-tuning.
We present a framework that rigorously analyzes the adaptation rates of LoRA methods.
arXiv Detail & Related papers (2024-10-10T18:51:53Z) - Flat-LoRA: Low-Rank Adaptation over a Flat Loss Landscape [52.98187034726091]
We introduce Flat-LoRA, which aims to identify a low-rank adaptation situated in a flat region of the full parameter space.<n>We show that Flat-LoRA improves both in-domain and out-of-domain generalization.
arXiv Detail & Related papers (2024-09-22T11:24:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.