Related papers: DeLo: Dual Decomposed Low-Rank Experts Collaboration for Continual Missing Modality Learning

DeLo: Dual Decomposed Low-Rank Experts Collaboration for Continual Missing Modality Learning

URL: http://arxiv.org/abs/2603.01632v1
Date: Mon, 02 Mar 2026 09:07:28 GMT
Title: DeLo: Dual Decomposed Low-Rank Experts Collaboration for Continual Missing Modality Learning
Authors: Xiwei Liu, Yulong Li, Feilong Tang, Imran Razzak
Abstract summary: DeLo is the first framework to leverage a novel dual-decomposed low-rank expert architecture for CMML. Our method significantly outperforms state-of-the-art approaches. This highlights the value of a principled, architecturally-aware LoRA design for real-world multimodal challenges.
Score: 33.51000015118141
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: Adapting Large Multimodal Models (LMMs) to real-world scenarios poses the dual challenges of learning from sequential data streams while handling frequent modality incompleteness, a task known as Continual Missing Modality Learning (CMML). However, existing works on CMML have predominantly relied on prompt tuning, a technique that struggles with this task due to cross-task interference between its learnable prompts in their shared embedding space. A naive application of Low-Rank Adaptation (LoRA) with a modality-shared module will also suffer modality interference from competing gradients. To this end, we propose DeLo, the first framework to leverage a novel dual-decomposed low-rank expert architecture for CMML. Specifically, this architecture resolves modality interference through decomposed LoRA experts, dynamically composing the LoRA update matrix from rank-one factors drawn from disentangled modality-specific factor pools. Embedded within a task-partitioned framework that structurally prevents catastrophic forgetting, this expert system is supported by two key mechanisms: a Cross-Modal Guided Routing strategy to handle incomplete data and a Task-Key Memory for efficient, task-agnostic inference. Extensive experiments on established CMML benchmarks demonstrate that our method significantly outperforms state-of-the-art approaches. This highlights the value of a principled, architecturally-aware LoRA design for real-world multimodal challenges.
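
Illustrative sketch (not from the paper): the abstract describes composing a LoRA update matrix from rank-one factors held in modality-specific pools. The NumPy snippet below shows one way such a composition could look. Everything in it is an assumption made for illustration: the pool layout (one pair of factor matrices per modality), the top-k softmax weighting, and the names `top_k_weights` and `compose_update` are hypothetical. It does not model the paper's Cross-Modal Guided Routing or Task-Key Memory.

```python
import numpy as np


def top_k_weights(scores, k):
    """Softmax over the k largest scores; all other entries get weight 0."""
    scores = np.asarray(scores, dtype=float)
    idx = np.argsort(scores)[-k:]            # indices of the k largest scores
    weights = np.zeros_like(scores)
    exp = np.exp(scores[idx] - scores[idx].max())
    weights[idx] = exp / exp.sum()
    return weights


def compose_update(pools, scores, k=2):
    """Sum weighted rank-one factors b_i a_i^T over the available modalities.

    pools:  dict modality -> (A, B), A of shape (n, d_in), B of shape (n, d_out)
    scores: dict modality -> routing scores of shape (n,); a missing modality
            is simply absent from this dict (hypothetical convention)
    Returns a (d_out, d_in) update matrix, or None if no modality is present.
    """
    delta = None
    for modality, s in scores.items():
        A, B = pools[modality]
        w = top_k_weights(s, k)
        # sum_i w_i * outer(B[i], A[i]) -> shape (d_out, d_in)
        term = np.einsum("i,io,ij->oj", w, B, A)
        delta = term if delta is None else delta + term
    return delta


# Hypothetical sizes: 4 factors per pool, d_in = 6, d_out = 5.
rng = np.random.default_rng(0)
pools = {
    "image": (rng.normal(size=(4, 6)), rng.normal(size=(4, 5))),
    "text": (rng.normal(size=(4, 6)), rng.normal(size=(4, 5))),
}
# The text modality is missing for this sample, so only image factors are used.
scores = {"image": rng.normal(size=4)}
delta = compose_update(pools, scores, k=2)
```

With k factors selected from a single modality, the composed update has rank at most k, which is what keeps this kind of adapter low-rank.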

Related papers

Decomposing and Composing: Towards Efficient Vision-Language Continual Learning via Rank-1 Expert Pool in a Single LoRA [50.97792275353563]
We introduce a novel framework that restructures a single Low-Rank Adaptation (LoRA) module as a decomposable Rank-1 Expert Pool. Our method learns to dynamically compose a sparse, task-specific update by selecting from this expert pool, guided by the semantics of the [Guided] token.
arXiv Detail & Related papers (2026-01-30T10:54:51Z)
From Sparse Decisions to Dense Reasoning: A Multi-attribute Trajectory Paradigm for Multimodal Moderation [59.27094165576015]
We propose a novel learning paradigm (UniMod) that transitions from sparse decision-making to dense reasoning traces. By constructing structured trajectories encompassing evidence grounding, modality assessment, risk mapping, policy decision, and response generation, we reformulate monolithic decision tasks into a multi-dimensional boundary learning process. We introduce specialized optimization strategies to decouple task-specific parameters and rebalance training dynamics, effectively resolving interference between diverse objectives in multi-task learning.
arXiv Detail & Related papers (2026-01-28T09:29:40Z)
Rethinking Efficient Mixture-of-Experts for Remote Sensing Modality-Missing Classification [33.302856478333524]
Multimodal classification in remote sensing often suffers from missing modalities caused by environmental interference, sensor failures, or atmospheric effects. Existing two-stage adaptation methods are computationally expensive and assume complete multimodal data during training, limiting their generalization to real-world incompleteness. We propose a Missing-aware Mixture-of-Loras framework that reformulates modality missing as a multi-task learning problem.
arXiv Detail & Related papers (2025-11-14T16:31:37Z)
Diagnose, Localize, Align: A Full-Stack Framework for Reliable LLM Multi-Agent Systems under Instruction Conflicts [75.20929587906228]
Large Language Model (LLM)-powered multi-agent systems (MAS) have rapidly advanced collaborative reasoning, tool use, and role-specialized coordination in complex tasks. However, reliability-critical deployment remains hindered by a systemic failure mode: hierarchical compliance under instruction conflicts.
arXiv Detail & Related papers (2025-09-27T08:43:34Z)
Empowering Large Language Model for Sequential Recommendation via Multimodal Embeddings and Semantic IDs [28.752042722391934]
Sequential recommendation (SR) aims to capture users' dynamic interests and sequential patterns based on their historical interactions. MME-SID integrates multimodal embeddings and quantized embeddings to mitigate embedding collapse. Extensive experiments on three public datasets validate the superior performance of MME-SID.
arXiv Detail & Related papers (2025-09-02T07:02:29Z)
LoRA in LoRA: Towards Parameter-Efficient Architecture Expansion for Continual Visual Instruction Tuning [12.165720711684758]
We introduce LiLoRA, a highly efficient architecture expansion method tailored for CVIT in MLLMs. LiLoRA shares the LoRA matrix A across tasks to reduce redundancy, applies an additional low-rank decomposition to matrix B to minimize task-specific parameters, and incorporates a cosine-regularized stability loss to preserve consistency over time. Experiments show that LiLoRA consistently achieves superior performance in sequential task learning while significantly improving parameter efficiency compared to existing approaches.
arXiv Detail & Related papers (2025-08-08T10:32:38Z)
Dynamic Mixture of Curriculum LoRA Experts for Continual Multimodal Instruction Tuning [45.019751165506946]
Continual multimodal instruction tuning is crucial for adapting Multimodal Large Language Models (MLLMs) to evolving tasks. We propose a novel Dynamic Mixture of Curriculum LoRA Experts (D-MoLE) method, which automatically evolves MLLM's architecture with controlled parameter budgets to continually adapt to new tasks. Specifically, we propose a dynamic layer-wise expert allocator, which automatically allocates LoRA experts across layers to resolve architecture conflicts. Then, we propose a gradient-based inter-modal continual curriculum, which adjusts the update ratio of each module in MLLM based on the difficulty of each
arXiv Detail & Related papers (2025-06-13T11:03:46Z)
ThanoRA: Task Heterogeneity-Aware Multi-Task Low-Rank Adaptation [96.86211867758652]
Low-Rank Adaptation (LoRA) is widely adopted for downstream fine-tuning of foundation models. We propose ThanoRA, a Task Heterogeneity-Aware Multi-Task Low-Rank Adaptation framework.
arXiv Detail & Related papers (2025-05-24T11:01:45Z)
Activation-Guided Consensus Merging for Large Language Models [25.68958388022476]
We present Activation-Guided Consensus Merging (ACM), a plug-and-play merging framework that determines layer-specific merging coefficients. Experiments on Long-to-Short (L2S) and general merging tasks demonstrate that ACM consistently outperforms all baseline methods.
arXiv Detail & Related papers (2025-05-20T07:04:01Z)
AsymLoRA: Harmonizing Data Conflicts and Commonalities in MLLMs [5.018961516699825]
AsymLoRA is a parameter-efficient tuning framework that unifies knowledge modularization and cross-modal coordination. AsymLoRA consistently surpasses both vanilla LoRA, which captures only commonalities, and LoRA-MoE, which focuses solely on conflicts.
arXiv Detail & Related papers (2025-02-27T12:21:02Z)
Boost, Disentangle, and Customize: A Robust System2-to-System1 Pipeline for Code Generation [58.799397354312596]
Large language models (LLMs) have demonstrated remarkable capabilities in various domains, particularly in System 1 tasks. Research on System2-to-System1 methods has surged recently, exploring System 2 reasoning knowledge via inference-time computation. In this paper, we focus on code generation, which is a representative System 2 task, and identify two primary challenges.
arXiv Detail & Related papers (2025-02-18T03:20:50Z)
Multimodal Instruction Tuning with Conditional Mixture of LoRA [51.58020580970644]
This paper introduces a novel approach that integrates multimodal instruction tuning with Low-Rank Adaption (LoRA). It innovates upon LoRA by dynamically constructing low-rank adaptation matrices tailored to the unique demands of each input instance. Experimental results on various multimodal evaluation datasets indicate that MixLoRA outperforms the conventional LoRA with the same or even higher ranks.
arXiv Detail & Related papers (2024-02-24T20:15:31Z)

This list is automatically generated from the titles and abstracts of the papers on this site.