Rethinking Parameter Sharing for LLM Fine-Tuning with Multiple LoRAs
- URL: http://arxiv.org/abs/2509.25414v1
- Date: Mon, 29 Sep 2025 19:16:14 GMT
- Title: Rethinking Parameter Sharing for LLM Fine-Tuning with Multiple LoRAs
- Authors: Hao Ban, Kaiyi Ji
- Abstract summary: We propose an asymmetric multi-LoRA design with multiple $A$ matrices and a single shared $B$ in multi-task fine-tuning. Our methods achieve more balanced performance across tasks with comparable or superior average accuracy relative to existing multi-LoRA approaches.
- Score: 26.212332132619736
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models are often adapted using parameter-efficient techniques such as Low-Rank Adaptation (LoRA), formulated as $y = W_0x + BAx$, where $W_0$ is the pre-trained parameters and $x$ is the input to the adapted layer. While multi-adapter extensions often employ multiple LoRAs, prior studies suggest that the inner $A$ matrices are highly similar during training and thus suitable for sharing. We revisit this phenomenon and find that this similarity is largely attributable to the identical initialization rather than shared knowledge, with $B$ playing a more critical role in knowledge encoding and transfer. Motivated by these insights, we propose \textbf{ALoRA}, an asymmetric multi-LoRA design with multiple $A$ matrices and a single shared $B$ in multi-task fine-tuning, and \textbf{Fed-ALoRA}, which shares $B$ across clients in federated fine-tuning under both homogeneous and heterogeneous settings, through a novel matrix decomposition strategy to accommodate heterogeneous ranks across clients. Experiments on commonsense reasoning, math reasoning, multi-task NLP dataset, and federated NLP dataset demonstrate that our methods achieve more balanced performance across tasks with comparable or superior average accuracy relative to existing multi-LoRA approaches. Codes are available at https://github.com/OptMN-Lab/ALoRA.
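The asymmetric design described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the dimensions, rank, number of tasks, and initialization scale are placeholder choices, and the scaling factor used in standard LoRA is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, num_tasks = 16, 4, 3  # hidden size, LoRA rank, task count (placeholders)

W0 = rng.standard_normal((d, d))  # frozen pre-trained weight
# Per-task A matrices, small random init; a single shared B, zero-initialized
# as in standard LoRA so the adapted model starts identical to the base model.
A = [rng.standard_normal((r, d)) * 0.01 for _ in range(num_tasks)]
B = np.zeros((d, r))

def forward(x, task):
    # y = W0 x + B A_t x : task-specific A_t, shared B
    return W0 @ x + B @ (A[task] @ x)

x = rng.standard_normal(d)
# With B = 0, every task's adapted output equals the base model output.
assert np.allclose(forward(x, 0), W0 @ x)
```

During training, each $A_t$ would receive gradients only from its own task's data, while the shared $B$ aggregates updates across all tasks.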
Related papers
- MASA: Rethinking the Representational Bottleneck in LoRA with Multi-A Shared Adaptation [28.079735905482096]
Low-Rank Adaptation (LoRA) has emerged as a dominant method in parameter-efficient fine-tuning.
arXiv Detail & Related papers (2025-10-07T15:06:46Z) - QR-LoRA: Efficient and Disentangled Fine-tuning via QR Decomposition for Customized Generation [52.024845354511555]
We propose QR-LoRA, a novel fine-tuning framework leveraging QR decomposition for structured parameter updates. Our key insight is that the Q matrix naturally minimizes interference between different visual features. Experiments demonstrate that QR-LoRA achieves superior disentanglement in content-style fusion tasks.
arXiv Detail & Related papers (2025-07-07T01:31:01Z) - Ravan: Multi-Head Low-Rank Adaptation for Federated Fine-Tuning [16.99490636203893]
We present Ravan, an adaptive multi-head LoRA method that balances parameter efficiency and model expressivity. Experiments on vision and language benchmarks show that Ravan improves test accuracy by 2-8% over prior parameter-efficient baselines.
arXiv Detail & Related papers (2025-06-05T20:28:02Z) - WeightLoRA: Keep Only Necessary Adapters [76.32368157312477]
Low-rank adaptation (LoRA) adds trainable adapters to selected layers. We propose a novel method, WeightLoRA, which overcomes this issue by adaptively selecting the most critical LoRA heads. We conduct experiments on a series of competitive benchmarks with DeBERTa, BART, and Llama models, comparing our method with different adaptive approaches.
arXiv Detail & Related papers (2025-06-03T10:33:16Z) - FedSVD: Adaptive Orthogonalization for Private Federated Learning with LoRA [61.79405341803085]
Low-Rank Adaptation (LoRA) is widely used for efficient fine-tuning of language models in federated learning (FL).
arXiv Detail & Related papers (2025-05-19T07:32:56Z) - R-LoRA: Randomized Multi-Head LoRA for Efficient Multi-Task Learning [12.431575579432458]
Low-Rank Adaptation (LoRA) provides a cost-effective solution by approximating weight updates through low-rank matrices. To enhance LoRA's capability in multi-task learning, we propose R-LoRA, which incorporates Multi-Head Randomization.
arXiv Detail & Related papers (2025-02-21T13:30:21Z) - Randomized Asymmetric Chain of LoRA: The First Meaningful Theoretical Framework for Low-Rank Adaptation [58.288682735160585]
Low-Rank Adaptation (LoRA) is a popular technique for finetuning models.
LoRA often underperforms compared to full-parameter fine-tuning.
We present a framework that rigorously analyzes the adaptation rates of LoRA methods.
arXiv Detail & Related papers (2024-10-10T18:51:53Z) - Selective Aggregation for Low-Rank Adaptation in Federated Learning [10.683530421910028]
We introduce Federated Share-A Low-Rank Adaptation (FedSA-LoRA), which employs two low-rank trainable matrices $A$ and $B$ to model the weight update. We extend our FedSA-LoRA method to these LoRA variants, resulting in FedSA-rsLoRA and FedSA-VeRA.
arXiv Detail & Related papers (2024-10-02T12:14:36Z) - A Single Linear Layer Yields Task-Adapted Low-Rank Matrices [4.695004706877747]
Low-Rank Adaptation (LoRA) is a widely used Parameter-Efficient Fine-Tuning (PEFT) method that updates an initial weight matrix $W_0$ with a delta matrix $\Delta W$.
We show that CondLoRA maintains a performance on par with LoRA, despite the fact that the trainable parameters of CondLoRA are fewer than those of LoRA.
arXiv Detail & Related papers (2024-03-22T04:38:42Z) - Asymmetry in Low-Rank Adapters of Foundation Models [47.310550805920585]
This paper characterizes and leverages unexpected asymmetry in the importance of low-rank adapter matrices.
We show that fine-tuning $B$ is inherently more effective than fine-tuning $A$, and that a random untrained $A$ should perform nearly as well as a fine-tuned one.
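The claimed asymmetry can be illustrated with a toy least-squares problem in which $A$ stays at its random initialization and only $B$ is fitted. This is a hedged NumPy sketch with made-up dimensions and synthetic data, not the paper's experimental setup; it only shows that, with $A$ frozen, optimizing $B$ alone is a tractable (here, closed-form) problem.

```python
import numpy as np

rng = np.random.default_rng(1)
d, r, n = 8, 2, 64                    # dims and sample count (placeholders)

A = rng.standard_normal((r, d))       # random, frozen A (never trained)
X = rng.standard_normal((n, d))       # synthetic inputs
Y = rng.standard_normal((n, d))       # synthetic targets for the residual B A x

# With A fixed, fitting B to minimize ||X A^T B^T - Y||_F is ordinary
# least squares in the r-dimensional features induced by the frozen A.
Z = X @ A.T                           # (n, r) projected features
B = np.linalg.lstsq(Z, Y, rcond=None)[0].T  # (d, r) optimal B for frozen A

residual = np.linalg.norm(Z @ B.T - Y)
```

The least-squares fit with frozen $A$ can never do worse than $B = 0$, which mirrors the paper's point that a random untrained $A$ still leaves $B$ enough freedom to adapt.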
arXiv Detail & Related papers (2024-02-26T18:59:12Z) - Multimodal Instruction Tuning with Conditional Mixture of LoRA [51.58020580970644]
This paper introduces a novel approach that integrates multimodal instruction tuning with Low-Rank Adaptation (LoRA). It innovates upon LoRA by dynamically constructing low-rank adaptation matrices tailored to the unique demands of each input instance. Experimental results on various multimodal evaluation datasets indicate that MixLoRA outperforms conventional LoRA with the same or even higher ranks.
arXiv Detail & Related papers (2024-02-24T20:15:31Z) - FedRA: A Random Allocation Strategy for Federated Tuning to Unleash the Power of Heterogeneous Clients [50.13097183691517]
In real-world federated scenarios, there often exists a multitude of heterogeneous clients with varying computation and communication resources.
We propose a novel federated tuning algorithm, FedRA.
In each communication round, FedRA randomly generates an allocation matrix.
It reorganizes a small number of layers from the original model based on the allocation matrix and fine-tunes them using adapters.
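The allocation step can be sketched as drawing a random binary matrix that assigns layers to clients. This is an illustrative guess at the mechanism based only on this summary: the client count, layer count, and per-client budgets are invented, and the real FedRA algorithm may construct its allocation matrix differently.

```python
import numpy as np

rng = np.random.default_rng(2)
num_clients, num_layers = 4, 12
# Hypothetical per-client budgets: how many layers each client can fine-tune,
# reflecting its computation and communication resources.
budgets = [3, 6, 9, 12]

# Allocation matrix M: M[c, l] = 1 if client c fine-tunes layer l this round.
M = np.zeros((num_clients, num_layers), dtype=int)
for c, k in enumerate(budgets):
    chosen = rng.choice(num_layers, size=k, replace=False)
    M[c, chosen] = 1
```

Each communication round would redraw this matrix, so over time every layer is touched by a mix of strong and weak clients.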
arXiv Detail & Related papers (2023-11-19T04:43:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.