Adaptive Shared Experts with LoRA-Based Mixture of Experts for Multi-Task Learning
- URL: http://arxiv.org/abs/2510.00570v1
- Date: Wed, 01 Oct 2025 06:49:19 GMT
- Title: Adaptive Shared Experts with LoRA-Based Mixture of Experts for Multi-Task Learning
- Authors: Minghao Yang, Ren Togo, Guang Li, Takahiro Ogawa, Miki Haseyama
- Abstract summary: Mixture-of-Experts (MoE) has emerged as a powerful framework for multi-task learning (MTL). Existing MoE-MTL methods often rely on single-task pretrained backbones and suffer from redundant adaptation and inefficient knowledge sharing. We propose adaptive shared experts (ASE) within a low-rank adaptation (LoRA) based MoE, where shared experts are assigned router-computed gating weights jointly normalized with sparse experts.
- Score: 49.90176890917986
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mixture-of-Experts (MoE) has emerged as a powerful framework for multi-task learning (MTL). However, existing MoE-MTL methods often rely on single-task pretrained backbones and suffer from redundant adaptation and inefficient knowledge sharing during the transition from single-task to multi-task learning (STL to MTL). To address these limitations, we propose adaptive shared experts (ASE) within a low-rank adaptation (LoRA) based MoE, where shared experts are assigned router-computed gating weights jointly normalized with sparse experts. This design facilitates the STL-to-MTL transition and enhances expert specialization and cooperation. Furthermore, we incorporate fine-grained experts by increasing the number of LoRA experts while proportionally reducing their rank, enabling more effective knowledge sharing under a comparable parameter budget. Extensive experiments on the PASCAL-Context benchmark, under unified training settings, demonstrate that ASE consistently improves performance across diverse configurations and validates the effectiveness of fine-grained designs for MTL.
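The sketch below illustrates, under stated assumptions, how a LoRA-based MoE layer with adaptive shared experts might be wired: a single router scores shared and sparse experts together, and one softmax normalizes their gates jointly, matching the abstract's description. It is not the authors' implementation; the class names, expert counts, ranks, and top-k value are illustrative choices.

```python
# Minimal sketch (assumption, not the paper's released code) of a LoRA-based
# MoE layer with adaptive shared experts: shared-expert gates are produced by
# the router and normalized jointly with the sparse experts' gates.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LoRAExpert(nn.Module):
    """One low-rank expert: x -> up(down(x)), a rank-r residual update."""

    def __init__(self, dim: int, rank: int):
        super().__init__()
        self.down = nn.Linear(dim, rank, bias=False)
        self.up = nn.Linear(rank, dim, bias=False)
        nn.init.zeros_(self.up.weight)  # start as a zero delta, as in standard LoRA

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.up(self.down(x))


class AdaptiveSharedExpertLoRAMoE(nn.Module):
    def __init__(self, dim: int, n_sparse: int = 8, n_shared: int = 2,
                 rank: int = 4, top_k: int = 2):
        super().__init__()
        self.sparse = nn.ModuleList(LoRAExpert(dim, rank) for _ in range(n_sparse))
        self.shared = nn.ModuleList(LoRAExpert(dim, rank) for _ in range(n_shared))
        # A single router scores sparse and shared experts in one pass.
        self.router = nn.Linear(dim, n_sparse + n_shared, bias=False)
        self.n_sparse, self.top_k = n_sparse, top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, dim)
        logits = self.router(x)
        sparse_logits, shared_logits = logits.split(
            [self.n_sparse, len(self.shared)], dim=-1)

        # Keep only the top-k sparse experts per token; mask the rest out.
        topk_val, topk_idx = sparse_logits.topk(self.top_k, dim=-1)
        masked = torch.full_like(sparse_logits, float("-inf"))
        masked.scatter_(-1, topk_idx, topk_val)

        # Joint normalization: shared-expert gates are router-computed and
        # softmaxed together with the surviving sparse gates.
        gates = F.softmax(torch.cat([masked, shared_logits], dim=-1), dim=-1)

        delta = torch.zeros_like(x)
        for i, expert in enumerate(list(self.sparse) + list(self.shared)):
            w = gates[:, i:i + 1]
            if (w > 0).any():  # skip experts no token routed to
                delta = delta + w * expert(x)
        return x + delta  # residual low-rank update
```

Each expert in this parameterization costs roughly 2 * dim * rank parameters, so the fine-grained variant described in the abstract (more experts with proportionally smaller rank, e.g. 16 experts of rank 2 instead of 8 of rank 4) keeps the adapter budget comparable while giving the router finer units to combine.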
Related papers
- SAME: Stabilized Mixture-of-Experts for Multimodal Continual Instruction Tuning [83.66308307152808]
We propose StAbilized Mixture-of-Experts (SAME) for Multimodal Continual Instruction Tuning (MCIT). SAME stabilizes expert selection by decomposing routing dynamics into subspaces and updating only task-relevant directions. It also introduces adaptive expert activation to freeze selected experts during training, reducing redundant and cross-task interference.
arXiv Detail & Related papers (2026-02-02T11:47:06Z) - Multi-Task Dense Prediction Fine-Tuning with Mixture of Fine-Grained Experts [22.936728143586443]
Multi-task learning (MTL) for dense prediction has shown promising results but still faces challenges in balancing shared representations with task-specific specialization. We introduce a novel Fine-Grained Mixture of Experts architecture that explores MoE-based MTL models through a combination of three key innovations and fine-tuning.
arXiv Detail & Related papers (2025-07-25T08:59:30Z) - Multimodal Mixture of Low-Rank Experts for Sentiment Analysis and Emotion Recognition [16.14787920254091]
We present a novel Multimodal Mixture of Low-Rank Experts (MMoLRE) method for multimodal sentiment analysis (MSA) and multimodal emotion recognition (MER). MMoLRE utilizes shared and task-specific experts to distinctly model common and unique task characteristics, thereby avoiding parameter conflicts. Experiments on the CMU-MOSI and CMU-MOSEI benchmarks demonstrate that MMoLRE achieves state-of-the-art performance on the MSA task and competitive results on the MER task.
arXiv Detail & Related papers (2025-05-20T09:46:56Z) - Collaborative Multi-LoRA Experts with Achievement-based Multi-Tasks Loss for Unified Multimodal Information Extraction [28.800518091590117]
Multimodal Information Extraction (MIE) has gained attention for extracting structured information from multimedia sources. Traditional methods tackle MIE tasks separately, missing opportunities to share knowledge across tasks. We propose collaborative multi-LoRA experts with achievement-based multi-task loss for MIE tasks.
arXiv Detail & Related papers (2025-05-08T03:16:32Z) - MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning [74.43869839954168]
We propose MTL-LoRA, which retains the advantages of low-rank adaptation while significantly enhancing MTL capabilities. MTL-LoRA augments LoRA by incorporating additional task-adaptive parameters that differentiate task-specific information and capture shared knowledge. This approach enables pre-trained models to jointly adapt to different target domains with a limited number of trainable parameters.
arXiv Detail & Related papers (2024-10-12T08:32:26Z) - TeamLoRA: Boosting Low-Rank Adaptation with Expert Collaboration and Competition [61.91764883512776]
We introduce an innovative PEFT method, TeamLoRA, consisting of a collaboration and competition module for experts.
By doing so, TeamLoRA connects the experts as a "Team" with internal collaboration and competition, enabling a faster and more accurate PEFT paradigm for multi-task learning.
arXiv Detail & Related papers (2024-08-19T09:58:53Z) - Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts [74.40198929049959]
Large multi-modal models (LMMs) exhibit remarkable performance across numerous tasks.
However, generalist LMMs often suffer from performance degradation when tuned over a large collection of tasks.
We propose Omni-SMoLA, an architecture that uses the Soft MoE approach to mix many multimodal low-rank experts.
arXiv Detail & Related papers (2023-12-01T23:04:27Z) - Multi-Task Learning as a Bargaining Game [63.49888996291245]
In multi-task learning (MTL), a joint model is trained to simultaneously make predictions for several tasks.
Since the gradients of these different tasks may conflict, training a joint model for MTL often yields lower performance than its corresponding single-task counterparts.
We propose viewing the gradients combination step as a bargaining game, where tasks negotiate to reach an agreement on a joint direction of parameter update.
arXiv Detail & Related papers (2022-02-02T13:21:53Z)