Modular Memory is the Key to Continual Learning Agents
- URL: http://arxiv.org/abs/2603.01761v1
- Date: Mon, 02 Mar 2026 11:40:05 GMT
- Title: Modular Memory is the Key to Continual Learning Agents
- Authors: Vaggelis Dorovatas, Malte Schwerin, Andrew D. Bagdanov, Lucas Caccia, Antonio Carta, Laurent Charlin, Barbara Hammer, Tyler L. Hayes, Timm Hess, Christopher Kanan, Dhireesha Kudithipudi, Xialei Liu, Vincenzo Lomonaco, Jorge Mendez-Mendez, Darshan Patil, Ameya Prabhu, Elisa Ricci, Tinne Tuytelaars, Gido M. van de Ven, Liyuan Wang, Joost van de Weijer, Jonghyun Choi, Martin Mundt, Rahaf Aljundi
- Abstract summary: We argue that combining the strengths of In-Weight Learning (IWL) and the newly emerged capabilities of In-Context Learning (ICL) through the design of modular memory is the missing piece for continual adaptation at scale. We outline a conceptual framework for modular memory-centric architectures that leverage ICL for rapid adaptation and knowledge accumulation, and IWL for stable updates to model capabilities.
- Score: 100.09688599754465
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Foundation models have transformed machine learning through large-scale pretraining and increased test-time compute. Despite surpassing human performance in several domains, these models remain fundamentally limited in continuous operation, experience accumulation, and personalization, capabilities that are central to adaptive intelligence. While continual learning research has long targeted these goals, its historical focus on in-weight learning (IWL), i.e., updating a single model's parameters to absorb new knowledge, has rendered catastrophic forgetting a persistent challenge. Our position is that combining the strengths of In-Weight Learning (IWL) and the newly emerged capabilities of In-Context Learning (ICL) through the design of modular memory is the missing piece for continual adaptation at scale. We outline a conceptual framework for modular memory-centric architectures that leverage ICL for rapid adaptation and knowledge accumulation, and IWL for stable updates to model capabilities, charting a practical roadmap toward continually learning agents.
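The abstract's division of labor can be made concrete: memory modules accumulate experience that a frozen model consumes in-context (ICL), while a slow consolidation pass occasionally moves well-used memories into the weights (IWL). Below is a minimal, hypothetical sketch of that loop; the class names (`MemoryModule`, `ModularMemoryAgent`) and the usage-count `consolidate` heuristic are illustrative assumptions, not the paper's design.

```python
# Hypothetical sketch: a continual agent with modular memory.
# ICL path: retrieve entries from per-topic memory modules into the context.
# IWL path: periodically "consolidate" heavily used memories into the model
# (here a stub, standing in for a parameter update such as fine-tuning).

from collections import Counter

class MemoryModule:
    """One module stores experiences for a single topic/skill."""
    def __init__(self, topic):
        self.topic = topic
        self.entries = []          # accumulated experiences (ICL substrate)
        self.hits = Counter()      # usage statistics to drive consolidation

    def add(self, text):
        self.entries.append(text)

    def retrieve(self, query, k=2):
        # Toy lexical-overlap scoring; a real system would use embeddings.
        def score(e):
            return len(set(query.lower().split()) & set(e.lower().split()))
        ranked = sorted(self.entries, key=score, reverse=True)[:k]
        for e in ranked:
            self.hits[e] += 1
        return ranked

class ModularMemoryAgent:
    def __init__(self, base_model):
        self.base_model = base_model      # frozen weights (IWL target)
        self.modules = {}                 # topic -> MemoryModule

    def observe(self, topic, text):
        # Rapid, forgetting-free accumulation: just write to a module.
        self.modules.setdefault(topic, MemoryModule(topic)).add(text)

    def answer(self, topic, query):
        # ICL: condition the frozen model on retrieved memories.
        ctx = self.modules[topic].retrieve(query) if topic in self.modules else []
        return self.base_model(query, context=ctx)

    def consolidate(self, threshold=3):
        # IWL (stub): memories retrieved often enough are distilled into
        # the weights and removed from the fast store.
        for m in self.modules.values():
            stable = [e for e in m.entries if m.hits[e] >= threshold]
            for e in stable:
                print(f"[IWL] fine-tune base model on: {e!r}")  # placeholder
                m.entries.remove(e)

agent = ModularMemoryAgent(lambda q, context: f"answer({q}) given {context}")
agent.observe("cooking", "risotto needs constant stirring")
print(agent.answer("cooking", "how do I cook risotto"))
```

The point of the split is that writes to modules are cheap and forgetting-free, while weight updates happen rarely and only on vetted, frequently retrieved material.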
Related papers
- SALAAD: Sparse And Low-Rank Adaptation via ADMM for Large Language Model Inference [38.037874715181964]
We propose SALAAD, a plug-and-play framework that induces sparse and low-rank structures during training. Experiments across model scales show that SALAAD substantially reduces memory consumption during deployment. A single training run yields a continuous spectrum of model capacities, enabling smooth and elastic deployment across diverse memory budgets. (A generic sparse-plus-low-rank ADMM sketch appears after this list.)
arXiv Detail & Related papers (2026-02-01T00:00:11Z)
- Rethinking the Role of Dynamic Sparse Training for Scalable Deep Reinforcement Learning [58.533203990515034]
Scaling neural networks has driven breakthrough advances in machine learning, yet this paradigm fails in deep reinforcement learning (DRL). We show that dynamic sparse training strategies provide module-specific benefits that complement the primary scalability foundation established by architectural improvements. Finally, we distill these insights into Module-Specific Training (MST), a practical framework that exploits the benefits of architectural improvements and demonstrates substantial scalability gains across diverse RL algorithms without algorithmic modifications.
arXiv Detail & Related papers (2025-10-14T03:03:08Z)
- Continual Learning for Generative AI: From LLMs to MLLMs and Beyond [56.29231194002407]
We present a comprehensive survey of continual learning methods for mainstream generative AI models. We categorize these approaches into three paradigms: architecture-based, regularization-based, and replay-based. We analyze continual learning setups for different generative models, including training objectives, benchmarks, and core backbones.
arXiv Detail & Related papers (2025-06-16T02:27:25Z)
- Partitioned Memory Storage Inspired Few-Shot Class-Incremental learning [2.9845592719739127]
Few-Shot Class-Incremental Learning (FSCIL) focuses on continual learning of new categories with limited samples without forgetting old knowledge. Our paper develops a method that learns independent models for each session, which inherently prevents catastrophic forgetting. Our method provides a fresh viewpoint for FSCIL and demonstrates state-of-the-art performance on the CIFAR-100 and mini-ImageNet datasets.
arXiv Detail & Related papers (2025-04-29T14:11:06Z)
- CalFuse: Multi-Modal Continual Learning via Feature Calibration and Parameter Fusion [17.68751409041168]
Class-Continual Learning (CCL) incrementally incorporates new class knowledge without revisiting historical data. Recent advances in Vision-Language Models (VLMs) such as CLIP demonstrate significant potential for CCL by leveraging pre-trained multi-modal knowledge. We propose CalFuse, a framework that synergizes feature Calibration and parameter Fusion to enable effective multi-modal knowledge integration.
arXiv Detail & Related papers (2025-03-24T13:44:12Z)
- Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning [64.93848182403116]
Current deep-learning memory models struggle in reinforcement learning environments that are partially observable and involve long-term dependencies.
We introduce the Stable Hadamard Memory, a novel memory model for reinforcement learning agents (a generic Hadamard-style update is sketched after this list).
Our approach significantly outperforms state-of-the-art memory-based methods on challenging partially observable benchmarks.
arXiv Detail & Related papers (2024-10-14T03:50:17Z)
- Temporal-Difference Variational Continual Learning [77.92320830700797]
We propose new learning objectives that integrate the regularization effects of multiple previous posterior estimates (a generic objective of this flavor is sketched after this list). Our approach effectively mitigates catastrophic forgetting, outperforming strong variational CL methods.
arXiv Detail & Related papers (2024-10-10T10:58:41Z)
- Learn it or Leave it: Module Composition and Pruning for Continual Learning [48.07144492109635]
MoCL-P is a lightweight continual learning method that balances knowledge integration and computational overhead.
Our evaluation shows that MoCL-P achieves state-of-the-art performance and improves parameter efficiency by up to three times.
arXiv Detail & Related papers (2024-06-26T19:18:28Z)
- Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters [65.15700861265432]
We present a parameter-efficient continual learning framework to alleviate long-term forgetting in incremental learning with vision-language models.
Our approach involves the dynamic expansion of a pre-trained CLIP model through the integration of Mixture-of-Experts (MoE) adapters (a minimal MoE-adapter layer is sketched after this list).
To preserve the zero-shot recognition capability of vision-language models, we introduce a Distribution Discriminative Auto-Selector.
arXiv Detail & Related papers (2024-03-18T08:00:23Z)
- Learning an evolved mixture model for task-free continual learning [11.540150938141034]
We address Task-Free Continual Learning (TFCL), in which a model is trained on non-stationary data streams with no explicit task information.
We introduce two simple dropout mechanisms to selectively remove stored examples in order to avoid memory overload.
arXiv Detail & Related papers (2022-07-11T16:01:27Z)
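On SALAAD (first entry above): the blurb names the ingredients, sparse-plus-low-rank structure and ADMM, but not the algorithm, so here is a generic ADMM decomposition of a weight matrix into sparse and low-rank parts under the usual penalties. The objective, hyperparameters, and function names are illustrative assumptions, not SALAAD's actual method.

```python
# Hypothetical sketch (not SALAAD's actual algorithm): decompose a weight
# matrix W into a low-rank part L plus a sparse part S via ADMM, solving
#   min ||L||_* + lam * ||S||_1   s.t.   L + S = W
import numpy as np

def soft_threshold(X, tau):
    """Proximal operator of the l1 norm (promotes sparsity)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svd_threshold(X, tau):
    """Proximal operator of the nuclear norm (promotes low rank)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def sparse_plus_lowrank(W, lam=0.1, mu=1.0, iters=100):
    L = np.zeros_like(W)
    S = np.zeros_like(W)
    Y = np.zeros_like(W)          # scaled dual variable
    for _ in range(iters):
        L = svd_threshold(W - S + Y / mu, 1.0 / mu)   # low-rank update
        S = soft_threshold(W - L + Y / mu, lam / mu)  # sparse update
        Y = Y + mu * (W - L - S)                      # dual ascent
    return L, S

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 8)) @ rng.standard_normal((8, 64))  # low rank
W[rng.random(W.shape) < 0.05] += 5.0                             # sparse spikes
L, S = sparse_plus_lowrank(W)
print("rank(L) ≈", np.linalg.matrix_rank(L), " nnz(S) =", int((np.abs(S) > 1e-6).sum()))
```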
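On Stable Hadamard Memory: the name points at element-wise (Hadamard) calibration of a memory matrix. A minimal sketch of that style of update follows; the gate and write parameterizations are purely illustrative, not the paper's.

```python
# Generic Hadamard-style memory update (illustrative, not the paper's exact
# parameterization): the memory is calibrated element-wise by a gate derived
# from the current input, then a new key-value association is written in.
import numpy as np

rng = np.random.default_rng(1)
d_in, d_mem = 4, 6
W_gate = rng.standard_normal((d_in, d_mem))   # produces calibration gates
W_key = rng.standard_normal((d_in, d_mem))
W_val = rng.standard_normal((d_in, d_mem))

def step(M, x):
    gate = 1.0 / (1.0 + np.exp(-(x @ W_gate)))   # in (0, 1): what to retain
    key = np.tanh(x @ W_key)
    val = np.tanh(x @ W_val)
    # Hadamard calibration of the memory rows, then an outer-product write.
    return M * gate[:, None] + np.outer(key, val)

M = np.zeros((d_mem, d_mem))
for t in range(10):
    M = step(M, rng.standard_normal(d_in))
print("memory norm after 10 steps:", np.linalg.norm(M))
```

Because the calibration is multiplicative and bounded in (0, 1), each write decays old content gracefully instead of overwriting it, which is what makes this family of updates attractive for long-horizon credit assignment.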
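On Temporal-Difference Variational Continual Learning: the blurb says the objective integrates regularization from multiple previous posterior estimates. A generic multi-anchor variational CL objective of that flavor is written below; the mixing weights and the specific form are assumptions, not necessarily the paper's formulation.

```latex
% Standard VCL regularizes the task-t posterior against only the previous one:
%   L_t(q_t) = E_{q_t}[log p(D_t | theta)] - KL(q_t || q_{t-1}).
% A multi-anchor variant (illustrative, not necessarily the paper's exact
% objective) spreads the regularization over several earlier posteriors:
\mathcal{L}_t(q_t) \;=\;
\mathbb{E}_{\theta \sim q_t}\!\left[\log p(\mathcal{D}_t \mid \theta)\right]
\;-\; \sum_{j=1}^{J} \lambda_j \,
\mathrm{KL}\!\left(q_t(\theta) \,\Vert\, q_{t-j}(\theta)\right),
\qquad \sum_{j=1}^{J} \lambda_j = 1 .
```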
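On the Mixture-of-Experts adapters: a minimal sketch of one MoE-adapter layer over a frozen feature, with a softmax router and the ability to add an expert per task. Dimensions, initialization, and names are illustrative; this is not the paper's implementation of CLIP expansion or its Distribution Discriminative Auto-Selector.

```python
# Minimal Mixture-of-Experts adapter layer (illustrative): a frozen feature
# vector is refined by a weighted sum of small bottleneck adapters, with the
# weights produced by a learned router. New tasks can add new experts.
import numpy as np

class AdapterExpert:
    def __init__(self, d, r, rng):
        self.down = rng.standard_normal((d, r)) * 0.02  # d -> bottleneck r
        self.up = rng.standard_normal((r, d)) * 0.02    # r -> d

    def __call__(self, h):
        return np.maximum(h @ self.down, 0.0) @ self.up  # ReLU bottleneck

class MoEAdapterLayer:
    def __init__(self, d, r=8, n_experts=2, seed=0):
        self.rng = np.random.default_rng(seed)
        self.d, self.r = d, r
        self.experts = [AdapterExpert(d, r, self.rng) for _ in range(n_experts)]
        self.router = self.rng.standard_normal((d, n_experts)) * 0.02

    def add_expert(self):
        # Continual expansion: a new task gets a fresh expert + router column.
        self.experts.append(AdapterExpert(self.d, self.r, self.rng))
        new_col = self.rng.standard_normal((self.d, 1)) * 0.02
        self.router = np.concatenate([self.router, new_col], axis=1)

    def __call__(self, h):
        logits = h @ self.router
        w = np.exp(logits - logits.max())
        w = w / w.sum()                              # softmax routing weights
        delta = sum(wi * e(h) for wi, e in zip(w, self.experts))
        return h + delta                             # residual adapter output

layer = MoEAdapterLayer(d=16)
h = np.random.default_rng(42).standard_normal(16)   # frozen backbone feature
layer.add_expert()                                   # expand for a new task
print(layer(h).shape)
```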
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.