GoRA: Gradient-driven Adaptive Low Rank Adaptation
- URL: http://arxiv.org/abs/2502.12171v2
- Date: Thu, 05 Jun 2025 14:16:46 GMT
- Title: GoRA: Gradient-driven Adaptive Low Rank Adaptation
- Authors: Haonan He, Peng Ye, Yuchen Ren, Yuan Yuan, Luyang Zhou, Shucun Ju, Lei Chen
- Abstract summary: Low-Rank Adaptation (LoRA) is a crucial method for efficiently fine-tuning large language models. In this paper, we propose a novel framework -- GoRA (Gradient-driven Adaptive Low Rank Adaptation) -- that simultaneously adapts both the rank and initialization strategy. GoRA consistently outperforms existing LoRA-based methods while preserving the efficiency of vanilla LoRA.
- Score: 10.727670204249932
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Low-Rank Adaptation (LoRA) is a crucial method for efficiently fine-tuning large language models (LLMs), with its effectiveness influenced by two key factors: rank selection and weight initialization. While numerous LoRA variants have been proposed to improve performance by addressing one of these aspects, they often compromise usability or computational efficiency. In this paper, we analyze and identify the core limitations of existing approaches and propose a novel framework -- GoRA (Gradient-driven Adaptive Low Rank Adaptation) -- that simultaneously adapts both the rank and initialization strategy within a unified framework. GoRA leverages gradient information during training to dynamically assign optimal ranks and initialize low-rank adapter weights in an adaptive manner. To our knowledge, GoRA is the first method that not only addresses the limitations of prior approaches -- which often focus on either rank selection or initialization in isolation -- but also unifies both aspects within a single framework, enabling more effective and efficient adaptation. Extensive experiments across various architectures and modalities show that GoRA consistently outperforms existing LoRA-based methods while preserving the efficiency of vanilla LoRA. For example, when fine-tuning Llama3.1-8B-Base for mathematical reasoning, GoRA achieves a 5.13-point improvement over standard LoRA and even outperforms full fine-tuning by 2.05 points under high-rank settings.
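GoRA's mechanism is described here only at a high level. The following is a minimal sketch of the recipe the abstract outlines, assuming gradients accumulated on a calibration batch, a nuclear-norm importance score, and an SVD-based initialization; none of these specifics are confirmed by the abstract, and all names are illustrative.

```python
# Illustrative sketch of gradient-driven rank allocation and adapter
# initialization; the score and init rules are assumptions, not GoRA's exact recipe.
import torch

def allocate_and_init(grads, rank_budget, min_rank=2, scale=1e-3):
    """grads: dict mapping layer name -> (d_out, d_in) gradient of the frozen
    weight, accumulated on a small calibration batch."""
    # Proxy importance: nuclear norm (sum of singular values) of the gradient.
    scores = {name: torch.linalg.svdvals(g).sum() for name, g in grads.items()}
    total = sum(scores.values())
    adapters = {}
    for name, g in grads.items():
        U, S, Vh = torch.linalg.svd(g, full_matrices=False)
        # Layers with a stronger gradient signal receive more of the budget.
        r = max(min_rank, int(rank_budget * scores[name] / total))
        r = min(r, S.numel())  # cannot exceed the layer's full rank
        # Point the initial update along the top gradient directions, scaled
        # down so training starts close to the pretrained weights.
        B = U[:, :r] * S[:r].sqrt() * scale         # (d_out, r)
        A = S[:r, None].sqrt() * Vh[:r, :] * scale  # (r, d_in)
        adapters[name] = (B, A)
    return adapters
```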
Related papers
- Beyond Low-Rank Tuning: Model Prior-Guided Rank Allocation for Effective Transfer in Low-Data and Large-Gap Regimes [9.4848188271008]
Low-Rank Adaptation (LoRA) has proven effective in reducing computational costs while maintaining performance comparable to fully fine-tuned foundation models. Current adaptive LoRA methods attempt to overcome the limitations of fixed-rank allocation by dynamically expanding or selectively allocating ranks. We introduce Stable Rank-Guided Low-Rank Adaptation (SR-LoRA), a novel framework that utilizes the stable rank of pre-trained weight matrices as a natural prior for layer-wise rank allocation.
arXiv Detail & Related papers (2025-06-30T23:54:23Z)
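The stable rank used by SR-LoRA has a simple closed form, ||W||_F^2 / ||W||_2^2, so the layer-wise prior can be computed from the pretrained weights alone, before any training. A minimal sketch; the proportional budget normalization is an assumption, not necessarily SR-LoRA's exact allocation rule.

```python
# Stable rank as a training-free prior for layer-wise LoRA rank allocation.
import torch

def stable_rank(W: torch.Tensor) -> float:
    # ||W||_F^2 / ||W||_2^2: a smooth surrogate for the rank of W.
    fro2 = torch.linalg.matrix_norm(W, ord="fro") ** 2
    spec2 = torch.linalg.matrix_norm(W, ord=2) ** 2
    return (fro2 / spec2).item()

def allocate_ranks(weights: dict, budget: int, min_rank: int = 2) -> dict:
    """weights: dict of layer name -> pretrained (d_out, d_in) matrix."""
    sr = {name: stable_rank(W) for name, W in weights.items()}
    total = sum(sr.values())
    # Hypothetical rule: spread the rank budget in proportion to stable rank.
    return {name: max(min_rank, round(budget * s / total)) for name, s in sr.items()}
```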
- ARD-LoRA: Dynamic Rank Allocation for Parameter-Efficient Fine-Tuning of Foundation Models with Heterogeneous Adaptation Needs [0.46040036610482665]
This paper introduces Adaptive Rank Dynamic LoRA (ARD-LoRA), a novel framework that automates rank allocation through learnable scaling factors. ARD-LoRA enables continuous, differentiable, per-head rank adaptation. Experiments on LLAMA-3.1-70B and PaliGemma-2 demonstrate ARD-LoRA's efficacy, achieving up to 99.3% of full fine-tuning performance with only 0.32% trainable parameters.
arXiv Detail & Related papers (2025-06-23T03:45:37Z)
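The summary names the mechanism (learnable scaling factors yielding per-head rank adaptation) without details. A minimal sketch of that general idea, assuming one learnable scale per head and an L1 penalty that pushes unneeded scales toward zero; the module structure and penalty are illustrative, not the paper's.

```python
import torch
import torch.nn as nn

class PerHeadScaledLoRA(nn.Module):
    """Hypothetical LoRA adapter with a learnable scale per attention head."""
    def __init__(self, d_model: int, n_heads: int, r: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads, self.head_dim = n_heads, d_model // n_heads
        self.A = nn.Parameter(torch.randn(r, d_model) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_model, r))
        # Shrinking a head's scale toward zero effectively withdraws
        # low-rank capacity from that head.
        self.head_scale = nn.Parameter(torch.ones(n_heads))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        delta = x @ self.A.T @ self.B.T                       # (..., d_model)
        delta = delta.view(*delta.shape[:-1], self.n_heads, self.head_dim)
        delta = delta * self.head_scale[:, None]              # per-head gating
        return delta.reshape(*delta.shape[:-2], -1)

    def sparsity_penalty(self) -> torch.Tensor:
        # Added to the loss to encourage heterogeneous, sparse allocation.
        return self.head_scale.abs().sum()
```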
- ElaLoRA: Elastic & Learnable Low-Rank Adaptation for Efficient Model Fine-Tuning [6.657174308208715]
ElaLoRA is an adaptive low-rank adaptation framework that dynamically prunes and expands ranks based on gradient-derived importance scores. ElaLoRA consistently outperforms existing PEFT methods across different parameter budgets. By introducing a principled and adaptive rank allocation mechanism, ElaLoRA offers a scalable and efficient fine-tuning solution.
arXiv Detail & Related papers (2025-03-31T21:58:25Z)
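A hedged sketch of the prune-and-expand step driven by gradient-derived importance. The sensitivity score |param * grad|, summed per rank, is a common proxy for importance; whether ElaLoRA uses this exact score is not stated in the summary.

```python
import torch

def rank_importance(A: torch.Tensor, B: torch.Tensor) -> torch.Tensor:
    """A: (r, d_in), B: (d_out, r), both with .grad populated -> (r,) scores."""
    s_A = (A * A.grad).abs().sum(dim=1)   # sensitivity of each rank via A
    s_B = (B * B.grad).abs().sum(dim=0)   # sensitivity of each rank via B
    return s_A + s_B

def prune_and_expand(A, B, keep: int, grow: int):
    scores = rank_importance(A, B)
    top = torch.topk(scores, keep).indices
    A_kept, B_kept = A[top].detach(), B[:, top].detach()
    # New capacity starts with B-side zeros so the adapter output is unchanged.
    A_new = torch.randn(grow, A.shape[1], device=A.device) * 0.01
    B_new = torch.zeros(B.shape[0], grow, device=B.device)
    # Returned as plain tensors; re-wrap as nn.Parameter in practice.
    return torch.cat([A_kept, A_new]), torch.cat([B_kept, B_new], dim=1)
```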
- LoRA-GGPO: Mitigating Double Descent in LoRA Fine-Tuning via Gradient-Guided Perturbation Optimization [12.504723188498]
Large Language Models (LLMs) have achieved remarkable success in natural language processing.
Low-Rank Adaptation (LoRA) has emerged as a practical solution by approximating parameter updates with low-rank matrices.
LoRA-GGPO is a novel method that leverages gradient and weight norms to generate targeted perturbations.
arXiv Detail & Related papers (2025-02-20T13:14:41Z)
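A hedged sketch of a norm-guided perturbation in the spirit of the summary: blend the gradient direction with weight-norm-scaled noise, then evaluate the loss at the perturbed point (as in SAM-style training). The blending rule and hyperparameters are assumptions.

```python
import torch

def ggpo_perturbation(param: torch.Tensor, rho: float = 0.05,
                      alpha: float = 0.5) -> torch.Tensor:
    """Returns a perturbation to add to `param` before re-evaluating the loss."""
    g = param.grad
    # Random direction whose magnitude tracks the weight norm.
    noise = torch.randn_like(param) * param.norm() / param.numel() ** 0.5
    direction = (alpha * g / (g.norm() + 1e-12)
                 + (1 - alpha) * noise / (noise.norm() + 1e-12))
    return rho * direction
```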
- BeamLoRA: Beam-Constraint Low-Rank Adaptation [51.52097743781401]
Low-Rank Adaptation (LoRA) has been widely adopted as one of the most effective parameter-efficient fine-tuning methods.
We propose BeamLoRA, which conceptualizes each LoRA module as a beam where each rank naturally corresponds to a potential sub-solution.
arXiv Detail & Related papers (2025-02-19T10:33:22Z)
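One plausible reading of the beam metaphor, sketched below: score each rank as a candidate sub-solution, keep the top `beam` candidates, and re-seed the rest so capacity concentrates on promising directions. The scoring and reset rules here are guesses for illustration only.

```python
import torch

@torch.no_grad()
def beam_step(A: torch.Tensor, B: torch.Tensor, beam: int):
    """A: (r, d_in), B: (d_out, r); resets the weak ranks in place."""
    contrib = B.norm(dim=0) * A.norm(dim=1)         # per-rank contribution size
    weak = contrib.argsort(descending=True)[beam:]  # ranks outside the beam
    A[weak] = torch.randn_like(A[weak]) * 0.01      # re-seed weak sub-solutions
    B[:, weak] = 0.0                                # keep the output unchanged
```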
- RoRA: Efficient Fine-Tuning of LLM with Reliability Optimization for Rank Adaptation [59.34193580856381]
Low-Rank Adaptation (LoRA) is widely used and effective for fine-tuning large language models. We propose RoRA (Rank-adaptive Reliability Optimization), a simple yet effective method for optimizing LoRA's scaling factor. RoRA ensures improved performance as rank size increases and excels in the more challenging task of accuracy recovery when fine-tuning pruned models.
arXiv Detail & Related papers (2025-01-08T07:13:52Z)
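Context for the scaling-factor issue: vanilla LoRA multiplies the update B @ A by alpha / r, which shrinks the effective update as rank grows, so naively increasing rank can hurt. The rank-stabilized alpha / sqrt(r) form below is the usual remedy; the summary does not spell out RoRA's exact optimized factor, so treat the sqrt form as an assumption.

```python
import math

# Effective weight: W_eff = W + lora_scale(alpha, r) * (B @ A)
def lora_scale(alpha: float, r: int, rank_stabilized: bool = True) -> float:
    # alpha / r shrinks updates as rank grows; alpha / sqrt(r) keeps their
    # magnitude roughly rank-independent.
    return alpha / math.sqrt(r) if rank_stabilized else alpha / r
```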
- LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization [78.93425154518705]
Low-rank adaptation (LoRA) is a widely used parameter-efficient fine-tuning method for LLMs that reduces memory requirements.
This paper introduces LoRA-RITE, a novel adaptive matrix preconditioning method for LoRA optimization.
arXiv Detail & Related papers (2024-10-27T22:57:12Z)
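A minimal sketch of the general idea of matrix preconditioning for LoRA factors: precondition each factor's gradient by the (damped) second moment of the other factor, which makes the update roughly insensitive to how the product B @ A is factorized. This illustrates the concept only; LoRA-RITE's actual preconditioner and its invariance guarantees are more involved.

```python
import torch

def preconditioned_grads(A: torch.Tensor, B: torch.Tensor, eps: float = 1e-6):
    """A: (r, d_in), B: (d_out, r), both with .grad -> preconditioned grads."""
    I = torch.eye(A.shape[0], device=A.device)
    gA = torch.linalg.solve(B.T @ B + eps * I, A.grad)      # (r, d_in)
    # (A @ A.T + eps*I) is symmetric, so this right-multiplies by its inverse.
    gB = torch.linalg.solve(A @ A.T + eps * I, B.grad.T).T  # (d_out, r)
    return gA, gB
```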
- Less is More: Extreme Gradient Boost Rank-1 Adaption for Efficient Finetuning of LLMs [75.11449420928139]
Fine-tuning Large Language Models (LLMs) has become a crucial technique for adapting pre-trained models to downstream tasks.
Low-Rank Adaptation (LoRA) has emerged as a promising solution, but there exists a gap between the practical performance of low-rank adaptation and its theoretical optimum.
We propose eXtreme Gradient Boosting LoRA, a novel framework that bridges this gap by leveraging the power of ensemble learning.
arXiv Detail & Related papers (2024-10-25T17:07:13Z)
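A sketch of the boosting view: fit a sequence of very low-rank (here rank-1) adapters, merging each into the weight before fitting the next, so the ensemble of weak learners accumulates a higher-rank update. Training internals are elided behind a callback; the shrinkage factor and names are illustrative.

```python
import torch

def boost_lora(W: torch.Tensor, fit_rank1_adapter, n_rounds: int = 8,
               shrinkage: float = 0.3) -> torch.Tensor:
    """fit_rank1_adapter(W) is an assumed callback that trains a fresh rank-1
    pair against the current task loss and returns (B, A)."""
    for _ in range(n_rounds):
        B, A = fit_rank1_adapter(W)   # B: (d_out, 1), A: (1, d_in)
        W = W + shrinkage * (B @ A)   # merge the weak learner, then repeat
    return W
```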
- Randomized Asymmetric Chain of LoRA: The First Meaningful Theoretical Framework for Low-Rank Adaptation [58.288682735160585]
Low-Rank Adaptation (LoRA) is a popular technique for finetuning models.
LoRA often underperforms compared to full-parameter fine-tuning.
We present a framework that rigorously analyzes the adaptation rates of LoRA methods.
arXiv Detail & Related papers (2024-10-10T18:51:53Z)
- LoRA-Pro: Are Low-Rank Adapters Properly Optimized? [121.0693322732454]
Low-rank adaptation, also known as LoRA, has emerged as a prominent method for parameter-efficient fine-tuning of foundation models.
Despite its computational efficiency, LoRA still yields inferior performance compared to full fine-tuning.
We introduce LoRA-Pro, a method that enhances LoRA's performance by strategically adjusting the gradients of low-rank matrices.
arXiv Detail & Related papers (2024-07-25T17:57:12Z)
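A sketch of one least-squares reading of "strategically adjusting the gradients": choose adjusted gradients for A and B so that the induced change to B @ A best matches the full fine-tuning gradient g, which is accessible here because the chain rule gives A.grad = B^T g and B.grad = g A^T. The epsilon damping and the exact closed form are assumptions, not necessarily LoRA-Pro's published update.

```python
import torch

def adjusted_grads(A: torch.Tensor, B: torch.Tensor, eps: float = 1e-6):
    """A: (r, d_in), B: (d_out, r), both with .grad populated."""
    I = torch.eye(A.shape[0], device=A.device)
    BtB = B.T @ B + eps * I               # (r, r)
    AAt = A @ A.T + eps * I               # (r, r)
    gA = torch.linalg.solve(BtB, A.grad)  # least-squares step for A
    # Remove the part of B's gradient already explained by the A update.
    residual = B.grad - B @ torch.linalg.solve(BtB, A.grad @ A.T)
    gB = torch.linalg.solve(AAt, residual.T).T  # AAt is symmetric
    return gA, gB
```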
- LoRA-GA: Low-Rank Adaptation with Gradient Approximation [5.685201910521295]
Fine-tuning large-scale pretrained models is prohibitively expensive in terms of computational and memory costs.
LoRA offers a cost-effective alternative by fine-tuning an auxiliary low-rank model that has significantly fewer parameters.
However, LoRA converges considerably more slowly than full fine-tuning, leading to increased overall compute and often worse test performance.
arXiv Detail & Related papers (2024-07-06T08:37:21Z)
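The summary states the problem (slow convergence); the title's fix is a gradient-approximation initialization. A sketch under that reading: take one full gradient of the frozen weight on a calibration batch, initialize the factors from its top singular directions so early low-rank steps track full fine-tuning, and absorb the initial offset into W so the model output is unchanged. The slicing and scaling conventions are assumptions, and 2r must not exceed min(d_out, d_in).

```python
import torch

def ga_init(W: torch.Tensor, grad_W: torch.Tensor, r: int, scale: float = 1.0):
    """W, grad_W: (d_out, d_in); returns adjusted W plus factors A, B."""
    U, S, Vh = torch.linalg.svd(grad_W, full_matrices=False)
    B = U[:, :r] * scale        # top output directions of the gradient
    A = Vh[r:2 * r, :] * scale  # an adjacent block of input directions
    W = W - B @ A               # absorb the nonzero initial product B @ A
    return W, A, B
```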
- ALoRA: Allocating Low-Rank Adaptation for Fine-tuning Large Language Models [8.251547772610301]
We extend the methodology of low-rank adaptation (LoRA) to an innovative approach we call allocating low-rank adaptation (ALoRA).
First, we propose a novel method, AB-LoRA, that can effectively estimate the importance score of each LoRA rank.
Second, guided by AB-LoRA, we gradually prune redundant or negatively impactful LoRA ranks and reallocate the freed rank budget to important Transformer modules that need higher ranks.
arXiv Detail & Related papers (2024-03-24T15:09:55Z)
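Assuming, as the name suggests, that AB-LoRA's importance score is ablation-based, the sketch below zeroes one rank at a time and records the resulting loss increase on a held-out batch. `eval_loss` is an assumed callback; the real method likely amortizes this rather than doing a full sweep.

```python
import torch

@torch.no_grad()
def ablation_importance(A: torch.Tensor, eval_loss) -> torch.Tensor:
    """A: (r, d_in) LoRA factor; eval_loss() -> scalar validation loss."""
    base = eval_loss()
    scores = torch.zeros(A.shape[0])
    for i in range(A.shape[0]):
        saved = A[i].clone()
        A[i] = 0.0                      # ablate rank i
        scores[i] = eval_loss() - base  # loss increase = importance
        A[i] = saved                    # restore
    return scores
```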
- Sparse Low-rank Adaptation of Pre-trained Language Models [79.74094517030035]
We introduce sparse low-rank adaptation (SoRA) that enables dynamic adjustments to the intrinsic rank during the adaptation process.
Our approach strengthens the representation power of LoRA by initializing it with a higher rank, while efficiently taming a temporarily increased number of parameters.
Our experimental results demonstrate that SoRA can outperform other baselines even with 70% retained parameters and 70% training time.
arXiv Detail & Related papers (2023-11-20T11:56:25Z)
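SoRA's dynamic rank adjustment is described above as a sparsity mechanism on the adapter; one common formulation is a learnable gate vector between the down- and up-projections, trained with a proximal step that soft-thresholds it toward zero so the effective rank shrinks during adaptation. A minimal sketch under that reading; the L1 strength and update form are assumptions.

```python
import torch

def sora_forward(x, A, B, gate):
    """A: (r, d_in), B: (d_out, r), gate: (r,); zeroed entries disable ranks."""
    return ((x @ A.T) * gate) @ B.T

@torch.no_grad()
def proximal_update(gate: torch.Tensor, lr: float, lam: float):
    gate -= lr * gate.grad  # plain gradient step first
    # Soft-thresholding: the proximal operator of the L1 penalty lam * |gate|.
    gate.copy_(torch.sign(gate) * torch.clamp(gate.abs() - lr * lam, min=0.0))
```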
This list is automatically generated from the titles and abstracts of the papers on this site.