Optimizing Fine-Tuning through Advanced Initialization Strategies for Low-Rank Adaptation
- URL: http://arxiv.org/abs/2510.03731v1
- Date: Sat, 04 Oct 2025 08:34:06 GMT
- Title: Optimizing Fine-Tuning through Advanced Initialization Strategies for Low-Rank Adaptation
- Authors: Yongfu Xue,
- Abstract summary: LoRA has gained widespread popularity due to its strong balance of effectiveness and parameter efficiency. LoRA relies on initializing two low-rank matrices whose product is zero. IniLoRA achieves better performance than LoRA across a range of models and tasks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapid development of parameter-efficient fine-tuning methods has noticeably improved the efficiency of adapting large language models. Among these, LoRA has gained widespread popularity due to its strong balance of effectiveness and parameter efficiency. However, LoRA relies on initializing two low-rank matrices whose product is zero, which limits its ability to effectively activate and leverage the original model weights, creating a potential bottleneck for optimal performance. To address this limitation, we propose IniLoRA, a novel initialization strategy that initializes the low-rank matrices to closely approximate the original model weights. Experimental results indicate that IniLoRA achieves better performance than LoRA across a range of models and tasks. Additionally, we introduce two variants, IniLoRA-$\alpha$ and IniLoRA-$\beta$, both leveraging distinct initialization methods to enhance performance further.
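The contrast drawn in the abstract, between a low-rank pair whose product starts at zero and a pair whose product starts close to the original weights, can be illustrated with a minimal NumPy sketch. The abstract does not say how IniLoRA obtains its approximation, so the truncated SVD used here is only an assumed stand-in, and the function names, matrix sizes, and rank are hypothetical.

```python
import numpy as np


def lora_zero_init(d_out, d_in, rank, rng):
    """Conventional LoRA start: A is small random noise, B is all zeros."""
    A = rng.normal(0.0, 0.02, size=(rank, d_in))
    B = np.zeros((d_out, rank))
    return B, A  # B @ A is exactly the zero matrix


def low_rank_approx_init(W, rank):
    """Assumed stand-in (not the paper's procedure): truncated SVD of W,
    split so that B @ A is the best rank-`rank` approximation of W."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    root = np.sqrt(S[:rank])
    B = U[:, :rank] * root           # shape (d_out, rank)
    A = root[:, None] * Vt[:rank]    # shape (rank, d_in)
    return B, A


rng = np.random.default_rng(0)
W = rng.normal(size=(8, 6))          # toy stand-in for a pretrained weight matrix

B0, A0 = lora_zero_init(8, 6, rank=2, rng=rng)
B1, A1 = low_rank_approx_init(W, rank=2)

zero_product_norm = np.linalg.norm(B0 @ A0)   # norm of the zero-initialized product
approx_error = np.linalg.norm(W - B1 @ A1)    # residual of the rank-2 approximation
weight_norm = np.linalg.norm(W)
```

In the first case the adapter contributes nothing at step zero, while in the second the product already carries part of the structure of `W`. How the approximating pair is then combined with the frozen weights during training is not described in the abstract and is left out of the sketch.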
Related papers
- Taming Momentum: Rethinking Optimizer States Through Low-Rank Approximation [85.89510825889168]
We introduce LoRA-Pre, a novel low-rank system for efficient pre-training. LoRA-Pre decomposes the momentum matrix into a compact low-rank subspace within the online linear learner. We empirically validate LoRA-Pre's efficacy by pre-training models from the Llama architecture family.
arXiv Detail & Related papers (2026-02-27T18:57:06Z)
- Learning Rate Matters: Vanilla LoRA May Suffice for LLM Fine-tuning [48.66442009036754]
Low-Rank Adaptation (LoRA) is the prevailing approach for efficient large language model fine-tuning. In this work, we re-evaluate four representative LoRA variants alongside vanilla LoRA. We find that different LoRA methods favor distinct learning rate ranges.
arXiv Detail & Related papers (2026-02-04T19:36:20Z)
- LoRA Is Slower Than You Think [0.0]
Low-Rank Adaptation (LoRA) is one of the most widely used techniques for fine-tuning large language models (LLMs). By introducing a small number of trainable low-rank weight matrices, LoRA substantially reduces the number of parameters that need to be updated. We observed that LoRA does not consistently provide speed improvements across all model architectures and training setups.
arXiv Detail & Related papers (2025-07-06T08:36:43Z)
- BeamLoRA: Beam-Constraint Low-Rank Adaptation [51.52097743781401]
Low-Rank Adaptation (LoRA) has been widely adopted as one of the most effective parameter-efficient fine-tuning methods. We propose BeamLoRA, which conceptualizes each LoRA module as a beam where each rank naturally corresponds to a potential sub-solution.
arXiv Detail & Related papers (2025-02-19T10:33:22Z)
- GeLoRA: Geometric Adaptive Ranks For Efficient LoRA Fine-tuning [2.7446241148152253]
Fine-tuning large language models (LLMs) is computationally intensive because it requires updating all parameters. Low-Rank Adaptation (LoRA) improves efficiency by modifying only a subset of weights but introduces a trade-off between expressivity and computational cost. We propose Geometric Low-Rank Adaptation (GeLoRA), a novel framework that computes the intrinsic dimensionality of hidden state representations to adaptively select LoRA ranks.
arXiv Detail & Related papers (2024-12-12T13:04:54Z)
- LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization [78.93425154518705]
Low-rank adaptation (LoRA) is a widely used parameter-efficient fine-tuning method for LLMs that reduces memory requirements. This paper introduces LoRA-RITE, a novel adaptive matrix preconditioning method for LoRA optimization.
arXiv Detail & Related papers (2024-10-27T22:57:12Z)
- Less is More: Extreme Gradient Boost Rank-1 Adaption for Efficient Finetuning of LLMs [75.11449420928139]
Fine-tuning Large Language Models (LLMs) has become a crucial technique for adapting pre-trained models to downstream tasks.
Low-Rank Adaptation (LoRA) has emerged as a promising solution, but there exists a gap between the practical performance of low-rank adaptations and their theoretical optimum.
We propose eXtreme Gradient Boosting LoRA, a novel framework that bridges this gap by leveraging the power of ensemble learning.
arXiv Detail & Related papers (2024-10-25T17:07:13Z)
- Flat-LoRA: Low-Rank Adaptation over a Flat Loss Landscape [52.98187034726091]
We introduce Flat-LoRA, which aims to identify a low-rank adaptation situated in a flat region of the full parameter space. We show that Flat-LoRA improves both in-domain and out-of-domain generalization.
arXiv Detail & Related papers (2024-09-22T11:24:10Z)
- CoRA: Optimizing Low-Rank Adaptation with Common Subspace of Large Language Models [7.108651381160281]
The Low-Rank Adaptation (LoRA) strategy balances efficiency and performance in fine-tuning large models.
We propose CoRA: leveraging shared knowledge to optimize LoRA training by substituting its matrix $B$ with a common subspace from large models.
Our experiments show that the first approach achieves the same efficacy as the original LoRA fine-tuning while being more efficient than halving parameters.
arXiv Detail & Related papers (2024-08-31T12:48:27Z)
- LoRA-Pro: Are Low-Rank Adapters Properly Optimized? [121.0693322732454]
Low-rank adaptation, also known as LoRA, has emerged as a prominent method for parameter-efficient fine-tuning of foundation models. Despite its computational efficiency, LoRA still yields inferior performance compared to full fine-tuning. We introduce LoRA-Pro, a method that enhances LoRA's performance by strategically adjusting the gradients of low-rank matrices.
arXiv Detail & Related papers (2024-07-25T17:57:12Z)
- Enhancing Parameter Efficiency and Generalization in Large-Scale Models: A Regularized and Masked Low-Rank Adaptation Approach [10.980433187379868]
Low-Rank Adaptation (LoRA) has been developed to reduce resource consumption while maintaining satisfactory fine-tuning results.
This paper investigates the intrinsic dimension of the matrix updates approximated by the LoRA method and reveals the performance benefits of increasing this intrinsic dimension.
arXiv Detail & Related papers (2024-07-16T15:26:31Z)
- Chain of LoRA: Efficient Fine-tuning of Language Models via Residual Learning [31.036465632204663]
We introduce Chain of LoRA, an iterative optimization framework inspired by the Frank-Wolfe algorithm.
We demonstrate that COLA can consistently outperform LoRA without additional computational or memory costs.
arXiv Detail & Related papers (2024-01-08T14:26:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.