Learn it or Leave it: Module Composition and Pruning for Continual Learning
- URL: http://arxiv.org/abs/2406.18708v1
- Date: Wed, 26 Jun 2024 19:18:28 GMT
- Title: Learn it or Leave it: Module Composition and Pruning for Continual Learning
- Authors: Mingyang Wang, Heike Adel, Lukas Lange, Jannik Strötgen, Hinrich Schütze
- Abstract summary: MoCL-P is a lightweight continual learning method that balances knowledge integration and computational overhead.
Our evaluation shows that MoCL-P achieves state-of-the-art performance and improves parameter efficiency by up to three times.
- Score: 48.07144492109635
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In real-world environments, continual learning is essential for machine learning models, as they need to acquire new knowledge incrementally without forgetting what they have already learned. While pretrained language models have shown impressive capabilities on various static tasks, applying them to continual learning poses significant challenges, including avoiding catastrophic forgetting, facilitating knowledge transfer, and maintaining parameter efficiency. In this paper, we introduce MoCL-P, a novel lightweight continual learning method that addresses these challenges simultaneously. Unlike traditional approaches that continuously expand parameters for newly arriving tasks, MoCL-P integrates task representation-guided module composition with adaptive pruning, effectively balancing knowledge integration and computational overhead. Our evaluation across three continual learning benchmarks with up to 176 tasks shows that MoCL-P achieves state-of-the-art performance and improves parameter efficiency by up to three times, demonstrating its potential for practical applications where resource requirements are constrained.
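The composition-with-pruning idea described in the abstract can be sketched roughly as follows. This is an illustrative toy sketch, not the paper's actual implementation: the function name, cosine-similarity matching, softmax weighting, and pruning threshold are all assumptions made for the example.

```python
import numpy as np

def compose_and_prune(task_repr, module_keys, modules, prune_threshold=0.1):
    """Compose stored task modules weighted by similarity between their
    keys and the new task's representation, then prune modules whose
    composition weight falls below a threshold (all names illustrative)."""
    # Cosine similarity between the new task representation and each module key.
    sims = np.array([
        np.dot(task_repr, k) / (np.linalg.norm(task_repr) * np.linalg.norm(k))
        for k in module_keys
    ])
    # Softmax turns similarities into composition weights.
    weights = np.exp(sims) / np.exp(sims).sum()
    # Adaptive pruning: drop modules that contribute little to this task.
    keep = weights >= prune_threshold
    kept_weights = weights[keep] / weights[keep].sum()  # renormalize survivors
    kept_modules = [modules[i] for i in np.where(keep)[0]]
    composed = sum(w * m for w, m in zip(kept_weights, kept_modules))
    return composed, keep

# Example: three stored modules (e.g., small parameter vectors), one new task.
rng = np.random.default_rng(0)
modules = [rng.standard_normal(4) for _ in range(3)]
keys = [rng.standard_normal(8) for _ in range(3)]
new_task_repr = rng.standard_normal(8)
composed, kept = compose_and_prune(new_task_repr, keys, modules)
print(composed.shape, int(kept.sum()))
```

Because the weights are a softmax over a fixed number of modules, at least one weight always exceeds a small threshold, so the composition never empties out; pruning only removes modules that contribute negligibly to the current task.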
Related papers
- M2Distill: Multi-Modal Distillation for Lifelong Imitation Learning [9.15567555909617]
M2Distill is a multi-modal distillation-based method for lifelong imitation learning.
We regulate the shifts in latent representations across different modalities from previous to current steps.
We ensure that the learned policy retains its ability to perform previously learned tasks while seamlessly integrating new skills.
arXiv Detail & Related papers (2024-09-30T01:43:06Z)
- TaSL: Task Skill Localization and Consolidation for Language Model Continual Learning [41.28933724210434]
Language model continual learning (CL) has recently attracted significant interest for its ability to adapt large language models (LLMs) to dynamic real-world scenarios without retraining.
Existing approaches commonly utilize multiple parameter-efficient fine-tuning (PEFT) blocks to acquire task-specific knowledge, yet these methods are inefficient and fail to leverage potential knowledge transfer across tasks.
We introduce a novel CL framework for language models, named Task Skill Localization and Consolidation (TaSL), which boosts knowledge transfer without depending on memory replay.
arXiv Detail & Related papers (2024-08-09T17:44:45Z)
- Learning to Learn without Forgetting using Attention [5.6739565497512405]
Continual learning (CL) refers to the ability to continually learn over time by accommodating new knowledge while retaining previously learned experience.
Current machine learning methods are highly prone to overwrite previously learned patterns and thus forget past experience.
Since hand-crafting effective update mechanisms is difficult, we propose meta-learning a transformer-based attention mechanism to enhance CL.
arXiv Detail & Related papers (2024-08-06T14:25:23Z)
- Scalable Language Model with Generalized Continual Learning [58.700439919096155]
Joint Adaptive Re-Parameterization (JARe) is integrated with Dynamic Task-related Knowledge Retrieval (DTKR) to enable adaptive adjustment of language models based on specific downstream tasks.
Our method demonstrates state-of-the-art performance on diverse backbones and benchmarks, achieving effective continual learning in both full-set and few-shot scenarios with minimal forgetting.
arXiv Detail & Related papers (2024-04-11T04:22:15Z)
- Rehearsal-Free Modular and Compositional Continual Learning for Language Models [48.07144492109635]
Continual learning aims at incrementally acquiring new knowledge while not forgetting existing knowledge.
We propose MoCL, a rehearsal-free Modular and Compositional Continual Learning framework.
arXiv Detail & Related papers (2024-03-31T20:28:44Z)
- Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters [65.15700861265432]
We present a parameter-efficient continual learning framework to alleviate long-term forgetting in incremental learning with vision-language models.
Our approach involves the dynamic expansion of a pre-trained CLIP model, through the integration of Mixture-of-Experts (MoE) adapters.
To preserve the zero-shot recognition capability of vision-language models, we introduce a Distribution Discriminative Auto-Selector.
arXiv Detail & Related papers (2024-03-18T08:00:23Z)
- SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language Models [71.78800549517298]
Continual learning (CL) ability is vital for deploying large language models (LLMs) in the dynamic world.
Existing methods devise the learning module to acquire task-specific knowledge with parameter-efficient tuning (PET) block and the selection module to pick out the corresponding one for the testing input.
We propose a novel Shared Attention Framework (SAPT) to align the PET learning and selection via the Shared Attentive Learning & Selection module.
arXiv Detail & Related papers (2024-01-16T11:45:03Z)
- Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce BAdam, a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
arXiv Detail & Related papers (2023-09-15T17:10:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.