AILoRA: Function-Aware Asymmetric Initialization for Low-Rank Adaptation of Large Language Models
- URL: http://arxiv.org/abs/2510.08034v1
- Date: Thu, 09 Oct 2025 10:13:16 GMT
- Title: AILoRA: Function-Aware Asymmetric Initialization for Low-Rank Adaptation of Large Language Models
- Authors: Xiaoshuang Ji, Zhendong Zhao, Xiaoyan Gu, Xiaojun Chen, Xin Zhao, Zeyao Liu,
- Abstract summary: Low-Rank Adaptation (LoRA) has emerged as one of the most widely adopted approaches. LoRA is typically applied to the $W^Q$ and $W^V$ projection matrices of self-attention modules. We introduce AILoRA, a novel parameter-efficient method that incorporates function-aware asymmetric low-rank priors.
- Score: 11.663809872664105
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Parameter-efficient finetuning (PEFT) aims to mitigate the substantial computational and memory overhead involved in adapting large-scale pretrained models to diverse downstream tasks. Among numerous PEFT strategies, Low-Rank Adaptation (LoRA) has emerged as one of the most widely adopted approaches due to its robust empirical performance and low implementation complexity. In practical deployment, LoRA is typically applied to the $W^Q$ and $W^V$ projection matrices of self-attention modules, enabling an effective trade-off between model performance and parameter efficiency. While LoRA has achieved considerable empirical success, it still encounters challenges such as suboptimal performance and slow convergence. To address these limitations, we introduce AILoRA, a novel parameter-efficient method that incorporates function-aware asymmetric low-rank priors. Our empirical analysis reveals that the projection matrices $W^Q$ and $W^V$ in the self-attention mechanism exhibit distinct parameter characteristics, stemming from their functional differences. Specifically, $W^Q$ captures task-specific semantic space knowledge essential for computing attention distributions, making its parameters highly sensitive to downstream task variations. In contrast, $W^V$ encodes token-level feature representations that tend to remain stable across tasks and layers. Leveraging these insights, AILoRA performs a function-aware initialization by injecting the principal components of $W^Q$ to retain task-adaptive capacity, and the minor components of $W^V$ to preserve generalizable feature representations. This asymmetric initialization strategy enables LoRA modules to better capture the specialized roles of attention parameters, thereby enhancing both finetuning performance and convergence efficiency.
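The asymmetric initialization described in the abstract can be illustrated with a rough NumPy sketch. The abstract only states that principal components of $W^Q$ and minor components of $W^V$ are injected into the LoRA modules; everything else here is an assumption made for illustration: the function name `svd_slice_init`, splitting the singular values evenly between the two factors, and subtracting the adapter product from the frozen weight so the model's output is unchanged at step 0. This follows a common pattern for SVD-based LoRA initializations and should not be read as the authors' implementation.

```python
import numpy as np

def svd_slice_init(W, rank, which):
    """Hypothetical helper: build LoRA factors from a slice of W's SVD.

    which="principal" takes the `rank` largest singular triplets,
    which="minor" takes the `rank` smallest. Returns (B, A, W_res)
    with W_res + B @ A equal to W up to floating-point error.
    """
    U, S, Vt = np.linalg.svd(W, full_matrices=False)  # S is sorted descending
    if not 0 < rank <= len(S):
        raise ValueError("rank must be in 1..min(W.shape)")
    if which == "principal":
        idx = slice(0, rank)
    elif which == "minor":
        idx = slice(len(S) - rank, len(S))
    else:
        raise ValueError("which must be 'principal' or 'minor'")
    sqrt_s = np.sqrt(S[idx])
    B = U[:, idx] * sqrt_s            # (d_out, rank): scale each column
    A = sqrt_s[:, None] * Vt[idx]     # (rank, d_in): scale each row
    W_res = W - B @ A                 # assumed frozen residual weight
    return B, A, W_res

# Toy stand-ins for one attention layer's projection weights (assumed shapes).
rng = np.random.default_rng(0)
W_q = rng.standard_normal((16, 16))
W_v = rng.standard_normal((16, 16))

# Asymmetric choice from the abstract: principal for W^Q, minor for W^V.
B_q, A_q, W_q_res = svd_slice_init(W_q, rank=4, which="principal")
B_v, A_v, W_v_res = svd_slice_init(W_v, rank=4, which="minor")
```

In this sketch the trainable factors for $W^Q$ start in the directions that carry most of the matrix's energy, while those for $W^V$ start in its lowest-energy directions, leaving the dominant part of $W^V$ in the frozen residual.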
Related papers
- High-Rank Structured Modulation for Parameter-Efficient Fine-Tuning [57.85676271833619]
Low-rank Adaptation (LoRA) uses a low-rank update method to simulate full parameter fine-tuning. We present SMoA, a high-rank Structured MOdulation Adapter that uses fewer trainable parameters while maintaining a higher rank.
arXiv Detail & Related papers (2026-01-12T13:06:17Z)
- Lighter-X: An Efficient and Plug-and-play Strategy for Graph-based Recommendation through Decoupled Propagation [49.865020394064096]
We propose Lighter-X, an efficient and modular framework that can be seamlessly integrated with existing GNN-based recommender architectures. Our approach substantially reduces both parameter size and computational complexity while preserving the theoretical guarantees and empirical performance of the base models. Experiments demonstrate that Lighter-X achieves comparable performance to baseline models with significantly fewer parameters.
arXiv Detail & Related papers (2025-10-11T08:33:08Z)
- MASA: Rethinking the Representational Bottleneck in LoRA with Multi-A Shared Adaptation [28.079735905482096]
Low-Rank Adaptation (LoRA) has emerged as a dominant method in.
arXiv Detail & Related papers (2025-10-07T15:06:46Z)
- Ravan: Multi-Head Low-Rank Adaptation for Federated Fine-Tuning [16.99490636203893]
We present Ravan, an adaptive multi-head LoRA method that balances parameter efficiency and model expressivity. Experiments on vision and language benchmarks show that Ravan improves test accuracy by 2-8% over prior parameter-efficient baselines.
arXiv Detail & Related papers (2025-06-05T20:28:02Z)
- GeLoRA: Geometric Adaptive Ranks For Efficient LoRA Fine-tuning [2.7446241148152253]
Fine-tuning large language models (LLMs) is computationally intensive because it requires updating all parameters. Low-Rank Adaptation (LoRA) improves efficiency by modifying only a subset of weights but introduces a trade-off between expressivity and computational cost. We propose Geometric Low-Rank Adaptation (GeLoRA), a novel framework that computes the intrinsic dimensionality of hidden state representations to adaptively select LoRA ranks.
arXiv Detail & Related papers (2024-12-12T13:04:54Z)
- ALoRE: Efficient Visual Adaptation via Aggregating Low Rank Experts [71.91042186338163]
ALoRE is a novel PETL method that reuses the hypercomplex parameterized space constructed by Kronecker product to Aggregate Low Rank Experts. Thanks to the artful design, ALoRE maintains negligible extra parameters and can be effortlessly merged into the frozen backbone.
arXiv Detail & Related papers (2024-12-11T12:31:30Z)
- Less is More: Extreme Gradient Boost Rank-1 Adaption for Efficient Finetuning of LLMs [75.11449420928139]
Fine-tuning Large Language Models (LLMs) has become a crucial technique for adapting pre-trained models to downstream tasks.
Low-Rank Adaptation (LoRA) has emerged as a promising solution, but there exists a gap between the practical performance of low-rank adaptations and its theoretical optimum.
We propose eXtreme Gradient Boosting LoRA, a novel framework that bridges this gap by leveraging the power of ensemble learning.
arXiv Detail & Related papers (2024-10-25T17:07:13Z)
- LoRTA: Low Rank Tensor Adaptation of Large Language Models [70.32218116940393]
Low Rank Adaptation (LoRA) is a popular Parameter-Efficient Fine-Tuning (PEFT) method. We propose a higher-order Candecomp/Parafac (CP) decomposition, enabling a more compact and flexible representation. Our method can achieve a reduction in the number of parameters while maintaining comparable performance.
arXiv Detail & Related papers (2024-10-05T06:59:50Z)
- Step-by-Step Unmasking for Parameter-Efficient Fine-tuning of Large Language Models [18.877891285367216]
We introduce $\text{ID}^3$, a novel selective PEFT method that calculates parameter importance continually. We analytically show that $\text{ID}^3$ reduces the number of gradient updates by a factor of two, enhancing computational efficiency.
arXiv Detail & Related papers (2024-08-26T17:58:53Z)
- DoRA: Enhancing Parameter-Efficient Fine-Tuning with Dynamic Rank Distribution [28.589498108609202]
Low-Rank Adaptation (LoRA) relies on a bypass framework that ignores the differential parameter budget requirements across weight matrices.
DoRA decomposes high-rank LoRA layers into structured single-rank components, allowing for dynamic pruning of parameter budget.
Experimental results demonstrate that DoRA can achieve competitive performance compared with LoRA and full model fine-tuning.
arXiv Detail & Related papers (2024-05-27T17:02:27Z)
- Asymmetry in Low-Rank Adapters of Foundation Models [47.310550805920585]
This paper characterizes and leverages unexpected asymmetry in the importance of low-rank adapter matrices.
We show that fine-tuning $B$ is inherently more effective than fine-tuning $A$, and that a random untrained $A$ should perform nearly as well as a fine-tuned one.
arXiv Detail & Related papers (2024-02-26T18:59:12Z)
- AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning [143.23123791557245]
Fine-tuning large pre-trained language models on downstream tasks has become an important paradigm in NLP.
We propose AdaLoRA, which adaptively allocates the parameter budget among weight matrices according to their importance score.
We conduct extensive experiments with several pre-trained models on natural language processing, question answering, and natural language generation to validate the effectiveness of AdaLoRA.
arXiv Detail & Related papers (2023-03-18T22:36:25Z)