Dynamic Routing Between Experts: A Data-Efficient Approach to Continual Learning in Vision-Language Models
- URL: http://arxiv.org/abs/2511.01831v2
- Date: Tue, 04 Nov 2025 03:19:41 GMT
- Title: Dynamic Routing Between Experts: A Data-Efficient Approach to Continual Learning in Vision-Language Models
- Authors: Jay Mohta, Kenan Emir Ak, Dimitrios Dimitriadis, Yan Xu, Mingwei Shen
- Abstract summary: Vision-Language Models (VLMs) suffer from catastrophic forgetting when sequentially fine-tuned on new tasks. We introduce a routing-based approach that enables the integration of new tasks while preserving the foundational knowledge acquired during pretraining.
- Score: 10.431923437214719
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Vision-Language Models (VLMs) suffer from catastrophic forgetting when sequentially fine-tuned on new tasks, degrading performance on previously learned foundational and task-specific capabilities. While multi-task learning can mitigate forgetting, it requires simultaneous access to all datasets and imposes computational overhead that scales linearly with the number of tasks. In this work, we introduce a routing-based approach that enables the integration of new tasks while preserving the foundational knowledge acquired during pretraining. We evaluate our method using InternVL-2 models (2B and 8B parameters) and demonstrate that routing preserves the model's foundational capabilities by maintaining performance on general-purpose benchmarks such as ChartQA, MMBench, and DocVQA, while simultaneously improving accuracy on specialized tasks. Importantly, our approach achieves this without requiring concurrent access to data from all tasks, avoiding the significant computational and data overhead associated with traditional multi-task learning. We further conduct extensive ablation studies to evaluate the scalability and robustness of routing-based learning, showing that the approach is resilient to a growing number of tasks and performs particularly well when new tasks are semantically related. Finally, we show that the routing mechanism enables superior cross-modal transfer between language and vision capabilities, allowing knowledge learned in one modality to enhance performance in another, a capability not achieved by existing continual learning methods.
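The abstract describes routing between a frozen pretrained expert and task-specific experts so that new tasks can be added without overwriting foundational weights. The sketch below is a minimal, hypothetical illustration of that general idea (not the authors' actual architecture or the InternVL-2 integration): a toy layer holds a frozen base expert, appends a trainable copy per new task, and a per-input router mixes expert outputs. All names (`RoutedLayer`, `add_task_expert`) are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class RoutedLayer:
    """Toy routing layer: one frozen 'base' expert standing in for the
    pretrained weights, plus task-specific experts added over time.
    A linear router produces one gate per expert for each input."""

    def __init__(self, dim):
        self.dim = dim
        # Expert 0 plays the role of the frozen pretrained expert.
        self.experts = [rng.standard_normal((dim, dim)) * 0.1]
        # Router: one logit column per expert.
        self.router = rng.standard_normal((dim, 1)) * 0.1

    def add_task_expert(self):
        # New expert initialized from the base; only this copy would be
        # fine-tuned on the new task, so the base expert is never updated.
        self.experts.append(self.experts[0].copy())
        new_logits = rng.standard_normal((self.dim, 1)) * 0.1
        self.router = np.hstack([self.router, new_logits])

    def forward(self, x):
        gates = softmax(x @ self.router)                        # (batch, n_experts)
        outs = np.stack([x @ W for W in self.experts], axis=1)  # (batch, n_experts, dim)
        return (gates[..., None] * outs).sum(axis=1)            # (batch, dim)

layer = RoutedLayer(dim=8)
layer.add_task_expert()                      # continual step: one new task
y = layer.forward(rng.standard_normal((4, 8)))
```

Because the gate is computed per input, inputs resembling pretraining data can be routed to the frozen base expert while task-like inputs use the new expert, which is the intuition behind preserving general-purpose benchmark performance while improving specialized accuracy.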
Related papers
- Generalisation in Multitask Fitted Q-Iteration and Offline Q-learning [0.0]
We study offline multitask reinforcement learning in settings where multiple tasks share a low-rank representation of their action-value functions. We analyze a multitask variant of fitted Q-iteration that jointly learns a shared representation and task-specific value functions. Our results clarify the role of shared representations in multitask offline Q-learning and provide theoretical insight into when and how multitask structure can improve generalization.
arXiv Detail & Related papers (2025-12-23T10:20:11Z) - Continually Evolving Skill Knowledge in Vision Language Action Model [23.63528439700931]
Development of general robot intelligence in open environments requires continual skill learning. We propose Stellar VLA, a knowledge-driven continual learning framework with two variants: T-Stellar, modeling a task-centric knowledge space, and TS-Stellar, capturing hierarchical task-skill structure. Experiments on the LIBERO benchmark and real-world tasks show an average improvement of over 50 percentage points in final success rates relative to baselines.
arXiv Detail & Related papers (2025-11-22T15:00:08Z) - LLaVA-c: Continual Improved Visual Instruction Tuning [41.83222301318741]
Multimodal models like LLaVA-1.5 achieve state-of-the-art visual understanding through visual instruction tuning on multitask datasets. We show that task-by-task continual learning can achieve results that match or surpass multitask joint learning.
arXiv Detail & Related papers (2025-06-10T10:27:52Z) - Exploiting Task Relationships for Continual Learning Using Transferability-Aware Task Embeddings [8.814732457885022]
Continual learning (CL) has been a critical topic in contemporary deep neural network applications. We propose a transferability-aware task embedding, termed H-embedding, and construct a hypernet framework under its guidance.
arXiv Detail & Related papers (2025-02-17T09:52:19Z) - Multi-Stage Knowledge Integration of Vision-Language Models for Continual Learning [79.46570165281084]
We propose a Multi-Stage Knowledge Integration network (MulKI) to emulate the human learning process in distillation methods.
MulKI achieves this through four stages, including Eliciting Ideas, Adding New Ideas, Distinguishing Ideas, and Making Connections.
Our method demonstrates significant improvements in maintaining zero-shot capabilities while supporting continual learning across diverse downstream tasks.
arXiv Detail & Related papers (2024-11-11T07:36:19Z) - Task-Attentive Transformer Architecture for Continual Learning of Vision-and-Language Tasks Using Knowledge Distillation [18.345183818638475]
Continual learning (CL) can serve as a remedy through enabling knowledge-transfer across sequentially arriving tasks.
We develop a transformer-based CL architecture for learning bimodal vision-and-language tasks.
Our approach scales to a large number of tasks because it requires little memory and time overhead.
arXiv Detail & Related papers (2023-03-25T10:16:53Z) - Relational Experience Replay: Continual Learning by Adaptively Tuning Task-wise Relationship [54.73817402934303]
We propose Relational Experience Replay (RER), a bi-level learning framework that adaptively tunes task-wise relationships to achieve a better stability-plasticity tradeoff.
RER consistently improves the performance of all baselines and surpasses current state-of-the-art methods.
arXiv Detail & Related papers (2021-12-31T12:05:22Z) - Variational Multi-Task Learning with Gumbel-Softmax Priors [105.22406384964144]
Multi-task learning aims to explore task relatedness to improve individual tasks.
We propose variational multi-task learning (VMTL), a general probabilistic inference framework for learning multiple related tasks.
arXiv Detail & Related papers (2021-11-09T18:49:45Z) - Self-Attention Meta-Learner for Continual Learning [5.979373021392084]
Self-Attention Meta-Learner (SAM) learns prior knowledge for continual learning that permits learning a sequence of tasks.
SAM incorporates an attention mechanism that learns to select the particular relevant representation for each future task.
We evaluate the proposed method on the Split CIFAR-10/100 and Split MNIST benchmarks under task-agnostic inference.
arXiv Detail & Related papers (2021-01-28T17:35:04Z) - Parrot: Data-Driven Behavioral Priors for Reinforcement Learning [79.32403825036792]
We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials.
We show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
arXiv Detail & Related papers (2020-11-19T18:47:40Z) - Measuring and Harnessing Transference in Multi-Task Learning [58.48659733262734]
Multi-task learning can leverage information learned by one task to benefit the training of other tasks.
We analyze the dynamics of information transfer, or transference, across tasks throughout training.
arXiv Detail & Related papers (2020-10-29T08:25:43Z) - Bilevel Continual Learning [76.50127663309604]
We present a novel framework of continual learning named "Bilevel Continual Learning" (BCL).
Our experiments on continual learning benchmarks demonstrate the efficacy of the proposed BCL compared to many state-of-the-art methods.
arXiv Detail & Related papers (2020-07-30T16:00:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.