Accurate and Efficient Low-Rank Model Merging in Core Space
- URL: http://arxiv.org/abs/2509.17786v3
- Date: Mon, 20 Oct 2025 10:33:14 GMT
- Title: Accurate and Efficient Low-Rank Model Merging in Core Space
- Authors: Aniello Panariello, Daniel Marczak, Simone Magistri, Angelo Porrello, Bartłomiej Twardowski, Andrew D. Bagdanov, Simone Calderara, Joost van de Weijer
- Abstract summary: Core Space merging framework enables the merging of LoRA-adapted models within a common alignment basis. We show that Core Space significantly improves existing merging techniques and achieves state-of-the-art results on both vision and language tasks.
- Score: 39.05680982515462
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we address the challenges associated with merging low-rank adaptations of large neural networks. With the rise of parameter-efficient adaptation techniques, such as Low-Rank Adaptation (LoRA), model fine-tuning has become more accessible. While fine-tuning models with LoRA is highly efficient, existing merging methods often sacrifice this efficiency by merging fully-sized weight matrices. We propose the Core Space merging framework, which enables the merging of LoRA-adapted models within a common alignment basis, thereby preserving the efficiency of low-rank adaptation while substantially improving accuracy across tasks. We further provide a formal proof that projection into Core Space ensures no loss of information and provide a complexity analysis showing the efficiency gains. Extensive empirical results demonstrate that Core Space significantly improves existing merging techniques and achieves state-of-the-art results on both vision and language tasks while utilizing a fraction of the computational resources. Codebase is available at https://github.com/apanariello4/core-space-merging.
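The idea of merging low-rank updates inside a shared basis, rather than on full-size weight matrices, can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the linked codebase for that); it is a minimal illustration under stated assumptions: each task update has the LoRA form B_t A_t, the shared bases are taken from an SVD of the stacked factors, and plain averaging stands in for the merge rule. The function names `orthonormal_basis` and `merge_in_shared_basis` are hypothetical.

```python
import numpy as np

def orthonormal_basis(mat, tol=1e-10):
    """Return an orthonormal basis for the column space of `mat` via SVD."""
    u, s, _ = np.linalg.svd(mat, full_matrices=False)
    rank = int(np.sum(s > tol * s.max()))
    return u[:, :rank]

def merge_in_shared_basis(Bs, As):
    """Average low-rank updates B_t @ A_t inside a shared low-dimensional basis.

    Assumption: simple averaging is used as the merge rule for illustration;
    any other merging technique could be applied to the small matrices instead.
    """
    # Shared bases: column space of all B_t and row space of all A_t.
    U = orthonormal_basis(np.hstack(Bs))      # shape (m, k_B), k_B <= T * r
    V = orthonormal_basis(np.vstack(As).T)    # shape (n, k_A), k_A <= T * r
    # One small matrix per task; the full m x n update is never formed here.
    cores = [(U.T @ B) @ (A @ V) for B, A in zip(Bs, As)]
    merged_core = np.mean(cores, axis=0)
    return U, merged_core, V, cores

# Toy setup (hypothetical sizes): T tasks, rank-r adapters on an m x n layer.
rng = np.random.default_rng(0)
m, n, r, T = 64, 48, 4, 3
Bs = [rng.standard_normal((m, r)) for _ in range(T)]
As = [rng.standard_normal((r, n)) for _ in range(T)]

U, merged_core, V, cores = merge_in_shared_basis(Bs, As)

# Projecting into the shared basis loses nothing: each task update is
# recovered from its small matrix up to floating-point error.
max_err = max(np.abs(U @ C @ V.T - B @ A).max() for C, B, A in zip(cores, Bs, As))

# Map the merged small matrix back to a full-size weight update.
merged_delta = U @ merged_core @ V.T
```

In this toy setting the merge operates on matrices of size at most (T·r) × (T·r), here 12 × 12, instead of the 64 × 48 layer, which is where the efficiency gain in the sketch comes from.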
Related papers
- Merging Beyond: Streaming LLM Updates via Activation-Guided Rotations [55.047454145941366]
Streaming Merging is an innovative model updating paradigm that conceptualizes merging as an iterative optimization process. ARM is a strategy designed to approximate gradient descent dynamics. ARM requires only early SFT checkpoints and, through iterative merging, surpasses the fully converged SFT model.
arXiv Detail & Related papers (2026-02-03T08:15:57Z) - Deep Hierarchical Learning with Nested Subspace Networks [53.71337604556311]
We propose Nested Subspace Networks (NSNs) for large neural networks. NSNs enable a single model to be dynamically and granularly adjusted across a continuous spectrum of compute budgets. We show that NSNs can be surgically applied to pre-trained LLMs and unlock a smooth and predictable compute-performance frontier.
arXiv Detail & Related papers (2025-09-22T15:13:14Z) - AFLoRA: Adaptive Federated Fine-Tuning of Large Language Models with Resource-Aware Low-Rank Adaption [3.805501490912696]
Federated fine-tuning has emerged as a promising approach to adapt foundation models to downstream tasks using decentralized data. We propose AFLoRA, an adaptive and lightweight federated fine-tuning framework for Large Language Models.
arXiv Detail & Related papers (2025-05-30T16:35:32Z) - Decom-Renorm-Merge: Model Merging on the Right Space Improves Multitasking [17.095655627061934]
We present Decom-Renorm-Merge (DRM), a simple yet effective approach that leverages Singular Value Decomposition to decompose and coordinate weight matrices into an aligned joint space. Our experimental results show that DRM outperforms several state-of-the-art merging techniques across full finetuning and low-rank adaptation settings.
arXiv Detail & Related papers (2025-05-29T05:37:53Z) - Reinforced Model Merging [53.84354455400038]
We present an innovative framework termed Reinforced Model Merging (RMM), which encompasses an environment and agent tailored for merging tasks. By utilizing data subsets during the evaluation process, we addressed the bottleneck in the reward feedback phase, thereby accelerating RMM by up to 100 times.
arXiv Detail & Related papers (2025-03-27T08:52:41Z) - LightGNN: Simple Graph Neural Network for Recommendation [14.514770044236375]
Graph neural networks (GNNs) have demonstrated superior performance in collaborative recommendation. Existing GNN paradigms face challenges in scalability and robustness when handling large-scale, noisy, and real-world datasets. We present LightGNN, a lightweight and distillation-based GNN pruning framework.
arXiv Detail & Related papers (2025-01-06T18:59:55Z) - Less is More: Extreme Gradient Boost Rank-1 Adaption for Efficient Finetuning of LLMs [75.11449420928139]
Fine-tuning Large Language Models (LLMs) has become a crucial technique for adapting pre-trained models to downstream tasks.
Low-Rank Adaptation (LoRA) has emerged as a promising solution, but there exists a gap between the practical performance of low-rank adaptations and its theoretical optimum.
We propose eXtreme Gradient Boosting LoRA, a novel framework that bridges this gap by leveraging the power of ensemble learning.
arXiv Detail & Related papers (2024-10-25T17:07:13Z) - Balancing LoRA Performance and Efficiency with Simple Shard Sharing [8.827921242078883]
Optimal Shard Sharing Integration in LoRA, a novel PEFT approach that addresses this trade-off through a simple shard-sharing mechanism. Fossils significantly outperforms standard LoRA and its prominent variants in both model performance metrics and computational efficiency.
arXiv Detail & Related papers (2024-09-19T10:26:42Z) - Soft Merging: A Flexible and Robust Soft Model Merging Approach for Enhanced Neural Network Performance [6.599368083393398]
Stochastic Gradient Descent (SGD) is often limited to converging to local optima, and using these local optima to improve model performance is difficult.
The soft merging method minimizes the influence of local optima models that yield undesirable results.
Experiments underscore the effectiveness of the merged networks.
arXiv Detail & Related papers (2023-09-21T17:07:31Z) - Improved Distribution Matching for Dataset Condensation [91.55972945798531]
We propose a novel dataset condensation method based on distribution matching.
Our simple yet effective method outperforms most previous optimization-oriented methods with much fewer computational resources.
arXiv Detail & Related papers (2023-07-19T04:07:33Z) - Momentum Contrastive Autoencoder: Using Contrastive Learning for Latent Space Distribution Matching in WAE [51.09507030387935]
Wasserstein autoencoder (WAE) shows that matching two distributions is equivalent to minimizing a simple autoencoder (AE) loss under the constraint that the latent space of this AE matches a pre-specified prior distribution.
We propose to use the contrastive learning framework that has been shown to be effective for self-supervised representation learning, as a means to resolve this problem.
We show that using the contrastive learning framework to optimize the WAE loss achieves faster convergence and more stable optimization compared with existing popular algorithms for WAE.
arXiv Detail & Related papers (2021-10-19T22:55:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.