DR-LoRA: Dynamic Rank LoRA for Mixture-of-Experts Adaptation
- URL: http://arxiv.org/abs/2601.04823v3
- Date: Wed, 14 Jan 2026 08:04:55 GMT
- Title: DR-LoRA: Dynamic Rank LoRA for Mixture-of-Experts Adaptation
- Authors: Guanzhi Deng, Bo Li, Ronghao Chen, Huacan Wang, Lijie Wen, Linqi Song,
- Abstract summary: Mixture-of-Experts (MoE) has become a prominent paradigm for scaling Large Language Models (LLMs)<n>We propose a Dynamic Rank LoRA framework named DR-LoRA, which dynamically grows expert LoRA ranks during fine-tuning based on task-specific demands.<n>Experiments on multiple benchmarks demonstrate that DR-LoRA consistently outperforms standard LoRA and static allocation strategies under the same parameter budget.
- Score: 26.24723718425076
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Mixture-of-Experts (MoE) has become a prominent paradigm for scaling Large Language Models (LLMs). Parameter-efficient fine-tuning (PEFT), such as LoRA, is widely adopted to adapt pretrained MoE LLMs to downstream tasks. However, existing approaches assign identical LoRA ranks to all experts, overlooking the intrinsic functional specialization within MoE LLMs. This uniform allocation leads to resource mismatch, task-relevant experts are under-provisioned while less relevant ones receive redundant parameters. We propose a Dynamic Rank LoRA framework named DR-LoRA, which dynamically grows expert LoRA ranks during fine-tuning based on task-specific demands. DR-LoRA employs an Expert Saliency Scoring mechanism that integrates expert routing frequency and LoRA rank importance to quantify each expert's demand for additional capacity. Experts with higher saliency scores are prioritized for rank expansion, enabling the automatic formation of a heterogeneous rank distribution tailored to the target task. Experiments on multiple benchmarks demonstrate that DR-LoRA consistently outperforms standard LoRA and static allocation strategies under the same parameter budget, achieving superior task performance with more efficient parameter utilization.
Related papers
- HiLoRA: Adaptive Hierarchical LoRA Routing for Training-Free Domain Generalization [39.23407996213986]
Low-Rank Adaptation (LoRA) has emerged as a widely used technique for adapting large language models to new domains.<n>Existing methods often rely on explicit task labels or additional training, which are impractical for deployment.<n>We propose textttHiLoRA, a training-free framework that performs adaptive hierarchical routing over LoRA pools.
arXiv Detail & Related papers (2025-10-14T08:19:13Z) - Semantic-guided LoRA Parameters Generation [22.648880814012184]
Low-Rank Adaptation (LoRA) has demonstrated strong generalization capabilities across a variety of tasks for efficiently fine-tuning AI models.<n>SG-LoRA is the first of its kind framework to efficiently produce user-specific LoRA without additional training on user tasks or access to user-specific data.<n>SG-LoRA enables the real-time construction of LoRA models aligned with individual intents by distilling knowledge from prominent LoRA experts.
arXiv Detail & Related papers (2025-09-05T14:43:41Z) - BeamLoRA: Beam-Constraint Low-Rank Adaptation [51.52097743781401]
Low-Rank Adaptation (LoRA) has been widely adopted as one of the most effective parameter-efficient fine-tuning methods.<n>We propose BeamLoRA, which conceptualizes each LoRA module as a beam where each rank naturally corresponds to a potential sub-solution.
arXiv Detail & Related papers (2025-02-19T10:33:22Z) - In-Context Meta LoRA Generation [61.690065588534296]
Low-rank Adaptation (LoRA) has demonstrated remarkable capabilities for task specific fine-tuning.<n>We propose In-Context Meta LoRA (ICM-LoRA), a novel approach that efficiently achieves task-specific customization of large language models.<n>ICM-LoRA enables more accurate LoRA parameter reconstruction than current parameter reconstruction methods.
arXiv Detail & Related papers (2025-01-29T13:12:01Z) - Each Rank Could be an Expert: Single-Ranked Mixture of Experts LoRA for Multi-Task Learning [53.053604713064544]
Low-Rank Adaptation (LoRA) is widely used for adapting large language models (LLMs) to specific domains due to its efficiency and modularity.<n>Recent works adopt Mixture of Experts (MoE) by treating each LoRA module as an expert, thereby mitigating task interference through multiple specialized LoRA modules.<n>While effective, these methods often isolate knowledge within individual tasks, failing to fully exploit the shared knowledge across related tasks.<n>We propose Single-ranked Mixture of Experts LoRA (textbfSMoRA), which embeds MoE into LoRA by textittreating each rank as an
arXiv Detail & Related papers (2025-01-25T06:56:39Z) - SD-LoRA: Scalable Decoupled Low-Rank Adaptation for Class Incremental Learning [73.93639228235622]
Continual Learning with foundation models has emerged as a promising paradigm to exploit abundant knowledge acquired during pre-training for tackling sequential tasks.<n>Existing prompt-based and Low-Rank Adaptation-based (LoRA-based) methods often require expanding a prompt/LoRA pool or retaining samples of previous tasks.<n>We propose Scalable Decoupled LoRA (SD-LoRA) for class incremental learning, which continually separates the learning of the magnitude and direction of LoRA components without rehearsal.
arXiv Detail & Related papers (2025-01-22T20:00:41Z) - MALoRA: Mixture of Asymmetric Low-Rank Adaptation for Enhanced Multi-Task Learning [29.957620178740186]
In multi-task scenarios, challenges such as training imbalance and the seesaw effect frequently emerge.
We propose Mixture of Asymmetric Low-Rank Adaptaion (MALoRA) as a flexible fine-tuning framework.
MALoRA reduces the number of trainable parameters by 30% to 48%, increases training speed by 1.2x, and matches the computational efficiency of single-task LoRA models.
arXiv Detail & Related papers (2024-10-30T07:53:52Z) - Retrieval-Augmented Mixture of LoRA Experts for Uploadable Machine Learning [57.36978335727009]
Low-Rank Adaptation (LoRA) offers an efficient way to fine-tune large language models (LLMs)
In this paper, we propose a framework that adaptively retrieves and composes multiple LoRAs based on input prompts.
arXiv Detail & Related papers (2024-06-24T05:24:41Z) - LoraRetriever: Input-Aware LoRA Retrieval and Composition for Mixed
Tasks in the Wild [76.67343971195267]
Low-Rank Adaptation (LoRA) provides an efficient solution for fine-tuning large language models (LLM)
LoraRetriever is a retrieve-then-compose framework that adaptively retrieves and composes multiple LoRAs according to the input prompts.
Experimental results indicate that LoraRetriever consistently outperforms the baselines.
arXiv Detail & Related papers (2024-02-15T15:02:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.