CoMoL: Efficient Mixture of LoRA Experts via Dynamic Core Space Merging
- URL: http://arxiv.org/abs/2603.00573v1
- Date: Sat, 28 Feb 2026 09:40:11 GMT
- Title: CoMoL: Efficient Mixture of LoRA Experts via Dynamic Core Space Merging
- Authors: Jie Cao, Zhenxuan Fan, Zhuonan Wang, Tianwei Lin, Ziyuan Zhao, Rolan Yan, Wenqiao Zhang, Feifei Shao, Hongwei Wang, Jun Xiao, Siliang Tang
- Abstract summary: Core Space Mixture of LoRA (CoMoL) is a novel MoE-LoRA framework that incorporates expert diversity, parameter efficiency, and fine-grained adaptation. CoMoL consistently outperforms existing methods across multiple tasks.
- Score: 49.87105462292961
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large language models (LLMs) achieve remarkable performance on diverse downstream and domain-specific tasks via parameter-efficient fine-tuning (PEFT). However, existing PEFT methods, particularly MoE-LoRA architectures, suffer from limited parameter efficiency and coarse-grained adaptation due to the proliferation of LoRA experts and instance-level routing. To address these issues, we propose Core Space Mixture of LoRA (CoMoL), a novel MoE-LoRA framework that incorporates expert diversity, parameter efficiency, and fine-grained adaptation. Specifically, CoMoL introduces two key components: core space experts and core space routing. Core space experts store each expert in a compact core matrix, preserving diversity while controlling parameter growth. Core space routing dynamically selects and activates the appropriate core experts for each token, enabling fine-grained, input-adaptive routing. Activated core experts are then merged via a soft-merging strategy into a single core expert, which is combined with a shared LoRA to form a specialized LoRA module. In addition, the routing network is projected into the same low-rank space as the LoRA matrices, further reducing parameter overhead without compromising expressiveness. Extensive experiments demonstrate that CoMoL retains the adaptability of MoE-LoRA architectures while achieving parameter efficiency comparable to standard LoRA, consistently outperforming existing methods across multiple tasks.
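To make the pipeline described in the abstract concrete, here is a minimal PyTorch sketch of how core space experts, per-token core space routing, and soft merging might fit together. Everything below (module name, shapes, initialization, top-k routing) is an assumption based only on the abstract, not the authors' implementation; the shared LoRA is approximated by the common down/up projections `A` and `B` wrapped around the per-token merged core.

```python
# Illustrative CoMoL-style layer: experts are compact rank x rank "core" matrices
# in the shared low-rank space, routed per token and soft-merged into one core.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoreSpaceMoELoRA(nn.Module):
    def __init__(self, d_model, rank=8, n_experts=4, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Shared low-rank down/up projections, as in standard LoRA.
        self.A = nn.Linear(d_model, rank, bias=False)
        self.B = nn.Linear(rank, d_model, bias=False)
        nn.init.zeros_(self.B.weight)            # LoRA convention: start as a no-op
        # Each expert is a compact core matrix, so adding experts costs rank^2
        # parameters instead of d_model * rank.
        self.cores = nn.Parameter(torch.randn(n_experts, rank, rank) * 0.01)
        # The router also lives in the low-rank space: it scores experts from the
        # rank-dimensional activation rather than the full hidden state.
        self.router = nn.Linear(rank, n_experts, bias=False)

    def forward(self, x):                        # x: (batch, seq, d_model)
        h = self.A(x)                            # (batch, seq, rank)
        logits = self.router(h)                  # per-token expert scores
        top_val, top_idx = logits.topk(self.top_k, dim=-1)
        gates = F.softmax(top_val, dim=-1)       # (batch, seq, top_k)
        # Soft-merge the selected cores into a single per-token core matrix.
        selected = self.cores[top_idx]                            # (batch, seq, top_k, rank, rank)
        merged = (gates[..., None, None] * selected).sum(dim=-3)  # (batch, seq, rank, rank)
        h = torch.einsum('bsr,bsrq->bsq', h, merged)              # apply merged core in core space
        return self.B(h)                         # low-rank update added to the frozen layer's output
```

In this reading, parameters grow with n_experts * rank^2 rather than with full per-expert LoRA pairs, which is one plausible way the abstract's parameter-efficiency claim could be realized.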
Related papers
- DR-LoRA: Dynamic Rank LoRA for Mixture-of-Experts Adaptation [26.24723718425076]
Mixture-of-Experts (MoE) has become a prominent paradigm for scaling Large Language Models (LLMs). We propose a Dynamic Rank LoRA framework named DR-LoRA, which dynamically grows expert LoRA ranks during fine-tuning based on task-specific demands. Experiments on multiple benchmarks demonstrate that DR-LoRA consistently outperforms standard LoRA and static allocation strategies under the same parameter budget.
arXiv Detail & Related papers (2026-01-08T10:58:51Z)
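The DR-LoRA entry above hinges on growing expert LoRA ranks during fine-tuning. The sketch below shows one way rank growth can be implemented without disturbing the function learned so far; the growth criterion and the initialization of new directions are placeholders, not the paper's allocation policy.

```python
# Hypothetical growable LoRA adapter: new rank dimensions are appended with
# zero-initialized B columns, so growth does not change the current output.
import torch
import torch.nn as nn

class GrowableLoRA(nn.Module):
    def __init__(self, d_in, d_out, rank=4):
        super().__init__()
        self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)   # down-projection
        self.B = nn.Parameter(torch.zeros(d_out, rank))         # up-projection, zero-init

    def forward(self, x):                                       # x: (..., d_in)
        return x @ self.A.t() @ self.B.t()                      # low-rank update

    @torch.no_grad()
    def grow(self, extra_rank):
        d_in, d_out = self.A.shape[1], self.B.shape[0]
        new_A = torch.randn(extra_rank, d_in, device=self.A.device) * 0.01
        new_B = torch.zeros(d_out, extra_rank, device=self.B.device)
        self.A = nn.Parameter(torch.cat([self.A.data, new_A], dim=0))
        self.B = nn.Parameter(torch.cat([self.B.data, new_B], dim=1))

# Hypothetical usage: grow the expert whose task signals it needs more capacity
# (the demand signal itself is outside this sketch). After growing, the new
# parameters must be handed to the optimizer.
expert = GrowableLoRA(d_in=768, d_out=768, rank=4)
expert.grow(extra_rank=4)    # rank 4 -> 8
```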
- FediLoRA: Heterogeneous LoRA for Federated Multimodal Fine-tuning under Missing Modalities [9.507134068207597]
FediLoRA is a framework for federated multimodal fine-tuning under heterogeneous LoRA ranks and missing modalities. It achieves superior performance over competitive baselines in both global and personalized settings.
arXiv Detail & Related papers (2025-09-01T10:40:13Z)
- LoRA-Mixer: Coordinate Modular LoRA Experts Through Serial Attention Routing [17.171872354057694]
We propose the LoRA-Mixer, a modular and lightweight MoE framework that integrates LoRA experts. Our core innovation lies in replacing the projection matrices of the attention module's input/output linear layers with task-specific LoRA experts. LoRA-Mixer achieves significant improvements on datasets such as GSM8K, HumanEval, and MedQA.
arXiv Detail & Related papers (2025-06-17T14:58:54Z)
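The LoRA-Mixer entry above attaches task-specific LoRA experts to the attention module's projection layers. The following sketch shows a frozen projection augmented with routed LoRA experts; the mean-pooled softmax router is an assumption for brevity and does not reproduce the paper's serial attention routing.

```python
# A frozen attention projection wrapped with several LoRA experts, mixed by a
# simple per-sequence router (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedProjection(nn.Module):
    def __init__(self, base_proj: nn.Linear, n_experts=3, rank=8):
        super().__init__()
        self.base = base_proj                    # e.g. a pretrained W_q or W_o, kept frozen
        for p in self.base.parameters():
            p.requires_grad_(False)
        d_in, d_out = base_proj.in_features, base_proj.out_features
        self.A = nn.Parameter(torch.randn(n_experts, rank, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(n_experts, d_out, rank))
        self.router = nn.Linear(d_in, n_experts)

    def forward(self, x):                        # x: (batch, seq, d_in)
        gates = F.softmax(self.router(x.mean(dim=1)), dim=-1)   # (batch, n_experts)
        low = torch.einsum('erd,bsd->bser', self.A, x)          # per-expert down-projection
        upd = torch.einsum('eor,bser->bseo', self.B, low)       # per-expert low-rank update
        delta = torch.einsum('be,bseo->bso', gates, upd)        # gate-weighted mixture
        return self.base(x) + delta
```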
- A Stronger Mixture of Low-Rank Experts for Fine-Tuning Foundation Models [22.457766373989365]
Low-Rank Adapters (LoRAs) have been substantially adopted across various fields, including instruction tuning and domain adaptation. To address the limited expressive capacity of LoRA, the Mixture-of-Experts (MoE) architecture has been introduced to incorporate multiple LoRA adapters. We propose a new training strategy for MoE-LoRA that stabilizes and boosts its feature learning procedure via multi-space projections.
arXiv Detail & Related papers (2025-02-20T05:58:53Z)
- In-Context Meta LoRA Generation [61.690065588534296]
Low-rank Adaptation (LoRA) has demonstrated remarkable capabilities for task-specific fine-tuning. We propose In-Context Meta LoRA (ICM-LoRA), a novel approach that efficiently achieves task-specific customization of large language models. ICM-LoRA enables more accurate LoRA parameter reconstruction than current parameter reconstruction methods.
arXiv Detail & Related papers (2025-01-29T13:12:01Z)
- Each Rank Could be an Expert: Single-Ranked Mixture of Experts LoRA for Multi-Task Learning [53.053604713064544]
Low-Rank Adaptation (LoRA) is widely used for adapting large language models (LLMs) to specific domains due to its efficiency and modularity. Recent works adopt Mixture of Experts (MoE) by treating each LoRA module as an expert, thereby mitigating task interference through multiple specialized LoRA modules. While effective, these methods often isolate knowledge within individual tasks, failing to fully exploit the shared knowledge across related tasks. We propose Single-ranked Mixture of Experts LoRA (SMoRA), which embeds MoE into LoRA by treating each rank as an independent expert.
arXiv Detail & Related papers (2025-01-25T06:56:39Z)
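SMoRA, summarized above, pushes the expert granularity down to individual ranks. A compact sketch of that idea follows: a single LoRA pair whose rank dimensions are gated per token. The dense sigmoid gate is an illustrative choice; the paper may instead select ranks sparsely.

```python
# Rank-as-expert LoRA: the router gates each rank dimension per token instead of
# switching between whole LoRA modules.
import torch
import torch.nn as nn

class RankwiseMoELoRA(nn.Module):
    def __init__(self, d_in, d_out, rank=16):
        super().__init__()
        self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_out, rank))
        self.router = nn.Linear(d_in, rank)       # one gate per rank-expert

    def forward(self, x):                          # x: (batch, seq, d_in)
        gate = torch.sigmoid(self.router(x))       # (batch, seq, rank)
        h = x @ self.A.t()                          # (batch, seq, rank)
        return (gate * h) @ self.B.t()              # gated ranks recombined by B
```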
- Retrieval-Augmented Mixture of LoRA Experts for Uploadable Machine Learning [57.36978335727009]
Low-Rank Adaptation (LoRA) offers an efficient way to fine-tune large language models (LLMs).
In this paper, we propose a framework that adaptively retrieves and composes multiple LoRAs based on input prompts.
arXiv Detail & Related papers (2024-06-24T05:24:41Z)
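The retrieval-augmented entry above (and LoraRetriever further down) share a retrieve-then-compose pattern: embed the input, pull the most relevant LoRAs from a pool, and mix their updates. The sketch below illustrates that pattern only; the cosine retrieval, the descriptor keys, and the weighting are assumptions, not either paper's pipeline.

```python
# Illustrative retrieve-then-compose step over a pool of LoRA adapters.
import torch
import torch.nn.functional as F

def compose_loras(prompt_emb, lora_keys, lora_As, lora_Bs, x, top_k=2):
    """prompt_emb: (d_emb,); lora_keys: (n, d_emb) descriptors of pooled LoRAs;
    lora_As: (n, r, d_in); lora_Bs: (n, d_out, r); x: (batch, seq, d_in)."""
    sims = F.cosine_similarity(prompt_emb.unsqueeze(0), lora_keys, dim=-1)  # (n,)
    weights, idx = sims.topk(top_k)                 # keep the most similar adapters
    weights = F.softmax(weights, dim=-1)
    delta = torch.zeros(*x.shape[:-1], lora_Bs.shape[1], device=x.device)
    for w, i in zip(weights, idx):
        delta = delta + w * (x @ lora_As[i].t() @ lora_Bs[i].t())
    return delta   # added to the frozen base layer's output
```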
- Mixture of LoRA Experts [87.50120181861362]
This paper introduces the Mixture of LoRA Experts (MoLE) approach, which harnesses hierarchical control and unfettered branch selection.
The MoLE approach achieves superior LoRA fusion performance in comparison to direct arithmetic merging.
arXiv Detail & Related papers (2024-04-21T11:59:53Z)
- LoraRetriever: Input-Aware LoRA Retrieval and Composition for Mixed Tasks in the Wild [76.67343971195267]
Low-Rank Adaptation (LoRA) provides an efficient solution for fine-tuning large language models (LLMs).
LoraRetriever is a retrieve-then-compose framework that adaptively retrieves and composes multiple LoRAs according to the input prompts.
Experimental results indicate that LoraRetriever consistently outperforms the baselines.
arXiv Detail & Related papers (2024-02-15T15:02:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.