Metalearning Continual Learning Algorithms
- URL: http://arxiv.org/abs/2312.00276v3
- Date: Mon, 17 Feb 2025 18:06:07 GMT
- Title: Metalearning Continual Learning Algorithms
- Authors: Kazuki Irie, Róbert Csordás, Jürgen Schmidhuber
- Abstract summary: We propose Automated Continual Learning (ACL) to train self-referential neural networks to metalearn their own in-context continual (meta)learning algorithms. ACL encodes continual learning (CL) desiderata -- good performance on both old and new tasks -- into its metalearning objectives. Our experiments demonstrate that ACL effectively resolves "in-context catastrophic forgetting," a problem that naive in-context learning algorithms suffer from.
- Score: 42.710124929514066
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: General-purpose learning systems should improve themselves in an open-ended fashion in ever-changing environments. Conventional learning algorithms for neural networks, however, suffer from catastrophic forgetting (CF), i.e., previously acquired skills are forgotten when a new task is learned. Instead of hand-crafting new algorithms for avoiding CF, we propose Automated Continual Learning (ACL) to train self-referential neural networks to metalearn their own in-context continual (meta)learning algorithms. ACL encodes continual learning (CL) desiderata -- good performance on both old and new tasks -- into its metalearning objectives. Our experiments demonstrate that ACL effectively resolves "in-context catastrophic forgetting," a problem that naive in-context learning algorithms suffer from; ACL-learned algorithms outperform both hand-crafted learning algorithms and popular meta-continual learning methods on the Split-MNIST benchmark in the replay-free setting, and enable continual learning of diverse tasks consisting of multiple standard image classification datasets. We also discuss the current limitations of in-context CL by comparing ACL with state-of-the-art CL methods that leverage pre-trained models. Overall, we bring several novel perspectives into the long-standing problem of CL.
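The abstract describes ACL's key ingredient -- a metalearning objective that scores an in-context learner on both old and new tasks after it has consumed a whole task stream -- but not the implementation. Below is a minimal, hedged sketch of that objective under explicit assumptions: a small GRU stands in for the paper's self-referential network, the task sampler is synthetic rather than Split-MNIST, and all names, shapes, and hyperparameters are illustrative rather than taken from the paper.

```python
# Minimal sketch of an ACL-style meta-objective (replay-free, in-context CL).
# Assumptions, not the paper's setup: a GRU replaces the self-referential
# network, tasks are synthetic, and all sizes/hyperparameters are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

N_CLASSES, DIM, CTX, N_QUERY = 10, 32, 20, 5  # hypothetical sizes

class InContextLearner(nn.Module):
    """Sequence model that learns each task purely from its context stream."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Linear(DIM + N_CLASSES, 64)   # token = (x, one-hot y)
        self.core = nn.GRU(64, 64, batch_first=True)
        self.head = nn.Linear(64, N_CLASSES)

    def forward(self, ctx_x, ctx_y, query_x):
        # Consume the labeled demonstrations of all tasks seen so far.
        ctx = torch.cat([ctx_x, F.one_hot(ctx_y, N_CLASSES).float()], dim=-1)
        _, h = self.core(self.embed(ctx))
        # Queries carry no label; classify them from the resulting state.
        pad = torch.zeros(*query_x.shape[:-1], N_CLASSES)
        out, _ = self.core(self.embed(torch.cat([query_x, pad], dim=-1)), h)
        return self.head(out)

def sample_task():
    """Hypothetical task generator; ACL metatrains on real data (e.g. Split-MNIST)."""
    x = torch.randn(1, CTX + N_QUERY, DIM)
    y = torch.randint(0, N_CLASSES, (1, CTX + N_QUERY))
    return (x[:, :CTX], y[:, :CTX]), (x[:, CTX:], y[:, CTX:])  # (demos, queries)

model = InContextLearner()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(100):
    (da_x, da_y), (qa_x, qa_y) = sample_task()            # "old" task A
    (db_x, db_y), (qb_x, qb_y) = sample_task()            # "new" task B
    ctx_x = torch.cat([da_x, db_x], dim=1)                # stream: A, then B
    ctx_y = torch.cat([da_y, db_y], dim=1)
    # CL desiderata in one objective: after seeing B, still solve A (and B).
    loss = F.cross_entropy(model(ctx_x, ctx_y, qa_x).transpose(1, 2), qa_y) \
         + F.cross_entropy(model(ctx_x, ctx_y, qb_x).transpose(1, 2), qb_y)
    opt.zero_grad(); loss.backward(); opt.step()
```

Because task A's queries are scored only after task B's examples have entered the context, gradient descent on this meta-objective directly penalizes in-context forgetting, which is the CL desideratum the abstract describes.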
Related papers
- Active Learning for Continual Learning: Keeping the Past Alive in the Present [17.693559751968742]
We propose AccuACL, Accumulated informativeness-based Active Continual Learning.
We show that AccuACL significantly outperforms AL baselines across various CL algorithms.
arXiv Detail & Related papers (2025-01-24T06:46:58Z)
- Continual Task Learning through Adaptive Policy Self-Composition [54.95680427960524]
CompoFormer is a structure-based continual transformer model that adaptively composes previous policies via a meta-policy network.
Our experiments reveal that CompoFormer outperforms conventional continual learning (CL) methods, particularly in longer task sequences.
arXiv Detail & Related papers (2024-11-18T08:20:21Z)
- A Unified Framework for Neural Computation and Learning Over Time [56.44910327178975]
Hamiltonian Learning is a novel unified framework for learning with neural networks "over time".
It is based on differential equations that: (i) can be integrated without the need for external software solvers; (ii) generalize the well-established notion of gradient-based learning in feed-forward and recurrent networks; (iii) open up novel perspectives.
arXiv Detail & Related papers (2024-09-18T14:57:13Z)
- CLEO: Continual Learning of Evolving Ontologies [12.18795037817058]
Continual learning (CL) aims to instill the lifelong learning of humans in intelligent systems.
General learning processes are not just limited to acquiring new information, but also include the refinement of existing information.
CLEO is motivated by the need for intelligent systems to adapt to real-world changes over time.
arXiv Detail & Related papers (2024-07-11T11:32:33Z)
- Forward-Backward Knowledge Distillation for Continual Clustering [14.234785944941672]
Unsupervised Continual Learning (UCL) is a burgeoning field in machine learning, focusing on enabling neural networks to sequentially learn tasks without explicit label information.
Catastrophic Forgetting (CF) poses a significant challenge in continual learning, especially in UCL, where labeled information is not accessible.
We introduce the concept of Unsupervised Continual Clustering (UCC), demonstrating enhanced performance and memory efficiency in clustering across various tasks.
arXiv Detail & Related papers (2024-05-29T16:13:54Z)
- A Unified and General Framework for Continual Learning [58.72671755989431]
Continual Learning (CL) focuses on learning from dynamic and changing data distributions while retaining previously acquired knowledge.
Various methods have been developed to address the challenge of catastrophic forgetting, including regularization-based, Bayesian-based, and memory-replay-based techniques.
This research aims to bridge this gap by introducing a comprehensive and overarching framework that encompasses and reconciles these existing methodologies.
arXiv Detail & Related papers (2024-03-20T02:21:44Z)
- On the Effectiveness of Equivariant Regularization for Robust Online Continual Learning [17.995662644298974]
Continual Learning (CL) approaches seek to bridge this gap by facilitating the transfer of knowledge to both previous tasks and future ones.
Recent research has shown that self-supervision can produce versatile models that can generalize well to diverse downstream tasks.
We propose Continual Learning via Equivariant Regularization (CLER), an OCL approach that leverages equivariant tasks for self-supervision.
arXiv Detail & Related papers (2023-05-05T16:10:31Z)
- Mitigating Forgetting in Online Continual Learning via Contrasting Semantically Distinct Augmentations [22.289830907729705]
Online continual learning (OCL) aims to enable model learning from a non-stationary data stream to continuously acquire new knowledge as well as retain previously learnt knowledge.
The main challenge comes from the "catastrophic forgetting" issue -- the inability to remember learnt knowledge while acquiring new knowledge.
arXiv Detail & Related papers (2022-11-10T05:29:43Z)
- Online Continual Learning with Contrastive Vision Transformer [67.72251876181497]
This paper proposes a framework, Contrastive Vision Transformer (CVT), to achieve a better stability-plasticity trade-off for online CL.
Specifically, we design a new external attention mechanism for online CL that implicitly captures previous tasks' information.
Based on the learnable focuses, we design a focal contrastive loss to rebalance contrastive learning between new and past classes and consolidate previously learned representations.
arXiv Detail & Related papers (2022-07-24T08:51:02Z)
- Hebbian Continual Representation Learning [9.54473759331265]
Continual Learning aims to bring machine learning into a more realistic scenario.
We investigate whether biologically inspired Hebbian learning is useful for tackling continual challenges.
arXiv Detail & Related papers (2022-06-28T09:21:03Z)
- Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to only activate and select sparse neurons for learning current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z)
- Evolving Reinforcement Learning Algorithms [186.62294652057062]
We propose a method for meta-learning reinforcement learning algorithms.
The learned algorithms are domain-agnostic and can generalize to new environments not seen during training.
We highlight two learned algorithms which obtain good generalization performance over other classical control tasks, gridworld type tasks, and Atari games.
arXiv Detail & Related papers (2021-01-08T18:55:07Z)
- Incremental Embedding Learning via Zero-Shot Translation [65.94349068508863]
Current state-of-the-art incremental learning methods tackle the catastrophic forgetting problem in traditional classification networks.
We propose a novel class-incremental method for embedding networks, named the zero-shot translation class-incremental method (ZSTCI).
In addition, ZSTCI can easily be combined with existing regularization-based incremental learning methods to further improve performance of embedding networks.
arXiv Detail & Related papers (2020-12-31T08:21:37Z)
- Incremental Learning from Low-labelled Stream Data in Open-Set Video Face Recognition [0.0]
We propose a novel incremental learning approach which combines a deep feature encoder with Open-Set Dynamic Ensembles of SVM.
Our method can use unsupervised operational data to enhance recognition.
Results show an F1-score increase of up to 15% with respect to non-adaptive state-of-the-art methods.
arXiv Detail & Related papers (2020-12-17T13:28:13Z)
- Collaborative Method for Incremental Learning on Classification and Generation [32.07222897378187]
We introduce a novel algorithm, Incremental Class Learning with Attribute Sharing (ICLAS), for incremental class learning with deep neural networks.
One of its components, incGAN, can generate images with greater variety than the training data.
Under the challenging condition of data deficiency, ICLAS incrementally trains the classification and generation networks.
arXiv Detail & Related papers (2020-10-29T06:34:53Z)
- Neuromodulated Neural Architectures with Local Error Signals for Memory-Constrained Online Continual Learning [4.2903672492917755]
We develop a biologically inspired, lightweight neural network architecture that incorporates local learning and neuromodulation.
We demonstrate the efficacy of our approach in both single-task and continual learning settings.
arXiv Detail & Related papers (2020-07-16T07:41:23Z)
- Continual Learning in Recurrent Neural Networks [67.05499844830231]
We evaluate the effectiveness of continual learning methods for processing sequential data with recurrent neural networks (RNNs).
We shed light on the particularities that arise when applying weight-importance methods, such as elastic weight consolidation, to RNNs.
We show that the performance of weight-importance methods is not directly affected by the length of the processed sequences, but rather by high working memory requirements.
arXiv Detail & Related papers (2020-06-22T10:05:12Z)
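The entry above refers to weight-importance methods such as elastic weight consolidation (EWC). As a generic, hedged illustration of that family (not any surveyed paper's exact recipe), the sketch below estimates a per-parameter importance from squared gradients on the previous task (an empirical diagonal Fisher) and then penalizes movement of important parameters while training on the next task; the toy feed-forward model, synthetic data, and the lambda value are assumptions made purely for illustration.

```python
# Generic sketch of a weight-importance (EWC-style) penalty, as referenced in
# the entry above. Toy model, synthetic data, and lambda are assumptions for
# illustration; they are not taken from any of the surveyed papers.
import torch
import torch.nn as nn
import torch.nn.functional as F

def diagonal_fisher(model, batches):
    """Per-parameter importance: mean squared gradient (empirical diagonal Fisher)."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for x, y in batches:
        model.zero_grad()
        F.cross_entropy(model(x), y).backward()
        for n, p in model.named_parameters():
            fisher[n] += p.grad.detach() ** 2 / len(batches)
    return fisher

def ewc_penalty(model, fisher, anchor, lam=100.0):
    """Quadratic penalty anchoring parameters that mattered for previous tasks."""
    loss = sum((fisher[n] * (p - anchor[n]) ** 2).sum()
               for n, p in model.named_parameters())
    return 0.5 * lam * loss

# Usage sketch: after finishing task t, snapshot parameters and importances,
# then add the penalty while training on task t+1 (no replay buffer needed).
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
old_task = [(torch.randn(8, 32), torch.randint(0, 10, (8,))) for _ in range(4)]
new_task = [(torch.randn(8, 32), torch.randint(0, 10, (8,))) for _ in range(4)]

fisher = diagonal_fisher(model, old_task)
anchor = {n: p.detach().clone() for n, p in model.named_parameters()}
opt = torch.optim.SGD(model.parameters(), lr=0.1)
for x, y in new_task:
    loss = F.cross_entropy(model(x), y) + ewc_penalty(model, fisher, anchor)
    opt.zero_grad(); loss.backward(); opt.step()
```

The RNN study above applies this family of penalties to recurrent models; the quadratic penalty structure carries over unchanged, only the backbone differs.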
This list is automatically generated from the titles and abstracts of the papers on this site.