Self-Attention Meta-Learner for Continual Learning
- URL: http://arxiv.org/abs/2101.12136v1
- Date: Thu, 28 Jan 2021 17:35:04 GMT
- Title: Self-Attention Meta-Learner for Continual Learning
- Authors: Ghada Sokar, Decebal Constantin Mocanu, Mykola Pechenizkiy
- Abstract summary: Self-Attention Meta-Learner (SAM) learns prior knowledge for continual learning that permits learning a sequence of tasks.
SAM incorporates an attention mechanism that learns to select the representations relevant to each future task.
We evaluate the proposed method on the Split CIFAR-10/100 and Split MNIST benchmarks in the task-agnostic inference setting.
- Score: 5.979373021392084
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continual learning aims to provide intelligent agents capable of learning
multiple tasks sequentially with neural networks. One of its main challenges,
catastrophic forgetting, is caused by the neural network's suboptimal ability
to learn under non-stationary distributions. In most current approaches, the
agent starts from randomly initialized parameters and is optimized to master
the current task regardless of how useful the learned representation is for
future tasks. Moreover, each future task uses all of the previously learned
knowledge, although parts of this knowledge might not be helpful for its
learning. These issues cause interference among tasks, especially when the
data of previous tasks is not accessible. In this paper, we propose a new
method, named Self-Attention Meta-Learner (SAM), which learns prior knowledge
for continual learning that permits learning a sequence of tasks while
avoiding catastrophic forgetting. SAM incorporates an attention mechanism that
learns to select the representations relevant to each future task. Each task
builds a specific representation branch on top of the selected knowledge,
avoiding interference between tasks. We evaluate the proposed method on the
Split CIFAR-10/100 and Split MNIST benchmarks in the task-agnostic inference
setting. We empirically show that we can achieve better performance than
several state-of-the-art continual learning methods by building on top of the
selected representations learned by SAM. We also show the role of the
meta-attention mechanism in boosting informative features corresponding to the
input data and in identifying the correct target in task-agnostic inference.
Finally, we demonstrate that popular existing continual learning methods gain
a performance boost when they adopt SAM as a starting point.
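A minimal sketch of the idea in PyTorch (illustrative only, not the authors' code): a shared backbone carries the meta-learned prior, a sigmoid-gated attention module stands in for SAM's self-attention to select the relevant features, and each task adds its own branch on top of the selected representation. All module names and sizes are assumptions.

```python
# Minimal sketch of the SAM idea (illustrative, not the authors' code).
import torch
import torch.nn as nn

class SelfAttentionBlock(nn.Module):
    """Learns to boost the features that are relevant to the current input."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(),
            nn.Linear(hidden, dim), nn.Sigmoid(),
        )

    def forward(self, h):
        return h * self.gate(h)  # element-wise selection of representations

class SAMNet(nn.Module):
    def __init__(self, in_dim=784, feat_dim=256, num_tasks=5, classes_per_task=2):
        super().__init__()
        # Shared backbone holding the meta-learned prior knowledge.
        self.backbone = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())
        self.attention = SelfAttentionBlock(feat_dim)
        # Each task builds its own branch on top of the selected knowledge.
        self.heads = nn.ModuleList(
            nn.Linear(feat_dim, classes_per_task) for _ in range(num_tasks)
        )

    def forward(self, x, task_id):
        h = self.attention(self.backbone(x))  # select relevant representation
        return self.heads[task_id](h)         # task-specific branch

net = SAMNet()
print(net(torch.randn(8, 784), task_id=0).shape)  # torch.Size([8, 2])
```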
Related papers
- Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce BAdam, a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
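For illustration, here is the generic form of prior-based penalty this family of methods builds on: an EWC-style quadratic term pulling parameters toward values learned on earlier tasks. BAdam's actual Bayesian moment-based update differs; this sketch and its names are assumptions.

```python
# Generic prior-based penalty constraining parameter growth (EWC-style;
# BAdam's actual Bayesian moment-based update differs -- see the paper).
import torch
import torch.nn as nn

def prior_penalty(model, prior_means, precisions, strength=100.0):
    """Quadratic penalty pulling each parameter toward its prior mean,
    weighted by how important it was for previously learned tasks."""
    loss = torch.zeros(())
    for name, p in model.named_parameters():
        loss = loss + (precisions[name] * (p - prior_means[name]) ** 2).sum()
    return strength * loss

model = nn.Linear(10, 2)
means = {n: p.detach().clone() for n, p in model.named_parameters()}
precs = {n: torch.ones_like(p) for n, p in model.named_parameters()}
task_loss = nn.functional.mse_loss(model(torch.randn(4, 10)), torch.randn(4, 2))
total_loss = task_loss + prior_penalty(model, means, precs)
```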
arXiv Detail & Related papers (2023-09-15T17:10:51Z)
- Multitask Learning with No Regret: from Improved Confidence Bounds to Active Learning [79.07658065326592]
Quantifying uncertainty in the estimated tasks is of pivotal importance for many downstream applications, such as online or active learning.
We provide novel multitask confidence intervals in the challenging setting when neither the similarity between tasks nor the tasks' features are available to the learner.
We propose a novel online learning algorithm that achieves the improved regret without knowing the task-similarity parameter in advance.
arXiv Detail & Related papers (2023-08-03T13:08:09Z)
- Online Continual Learning via the Knowledge Invariant and Spread-out Properties [4.109784267309124]
A key challenge in continual learning is catastrophic forgetting.
We propose a new method, named Online Continual Learning via the Knowledge Invariant and Spread-out Properties (OCLKISP).
We empirically evaluate our proposed method on four popular benchmarks for continual learning: Split CIFAR-100, Split SVHN, Split CUB-200 and Split Tiny-ImageNet.
arXiv Detail & Related papers (2023-02-02T04:03:38Z)
- Hierarchically Structured Task-Agnostic Continual Learning [0.0]
We take a task-agnostic view of continual learning and develop a hierarchical information-theoretic optimality principle.
We propose a neural network layer, called the Mixture-of-Variational-Experts layer, that alleviates forgetting by creating a set of information processing paths.
Our approach can operate in a task-agnostic way, i.e., it does not require task-specific knowledge, unlike many existing continual learning algorithms.
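A simplified sketch of the routing idea: a gate blends a set of expert processing paths per input, so new data can recruit different paths instead of overwriting the ones old tasks rely on. The paper's variational, information-theoretic gating is replaced here by a plain softmax gate; all names are illustrative.

```python
# Simplified mixture-of-experts layer with a softmax gate (the paper's
# variational, information-theoretic gating is omitted for brevity).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixtureOfExpertsLayer(nn.Module):
    def __init__(self, dim, num_experts=4):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))
        self.gate = nn.Linear(dim, num_experts)

    def forward(self, x):
        weights = F.softmax(self.gate(x), dim=-1)                  # (batch, experts)
        outs = torch.stack([e(x) for e in self.experts], dim=-1)   # (batch, dim, experts)
        # Blend the expert paths per input according to the gate.
        return torch.einsum("bde,be->bd", outs, weights)

layer = MixtureOfExpertsLayer(32)
print(layer(torch.randn(8, 32)).shape)  # torch.Size([8, 32])
```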
arXiv Detail & Related papers (2022-11-14T19:53:15Z)
- TAME: Task Agnostic Continual Learning using Multiple Experts [9.89894348908034]
This paper focuses on a so-called task-agnostic setting where the task identities are not known and the learning machine needs to infer them from the observations.
Our algorithm, which we call TAME (Task-Agnostic continual learning using Multiple Experts), automatically detects the shift in data distributions and switches between task expert networks in an online manner.
Our experimental results show the efficacy of our approach on benchmark continual learning data sets, outperforming previous task-agnostic methods and even techniques that are given task identities at both training and test time.
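A toy sketch of the online switching idea: flag a distribution shift when the current loss spikes well above its running average, then switch to (or spawn) another expert. The detection statistic here is an assumption for illustration; TAME's actual detector differs.

```python
# Toy task-shift detector: flag a shift when the current loss spikes well
# above its running average (TAME's actual statistic differs -- see paper).
class ExpertSwitcher:
    def __init__(self, threshold=2.0, momentum=0.99):
        self.avg = None
        self.threshold = threshold
        self.momentum = momentum

    def shift_detected(self, loss):
        if self.avg is None:
            self.avg = loss
            return False
        shifted = loss > self.threshold * self.avg
        self.avg = self.momentum * self.avg + (1 - self.momentum) * loss
        return shifted

switcher = ExpertSwitcher()
for step_loss in [0.9, 0.8, 0.7, 2.5]:       # the spike marks a new task
    if switcher.shift_detected(step_loss):
        print("shift detected: switch to (or spawn) another expert network")
```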
arXiv Detail & Related papers (2022-10-08T01:23:30Z)
- Autonomous Open-Ended Learning of Tasks with Non-Stationary Interdependencies [64.0476282000118]
Intrinsic motivations have been shown to generate a task-agnostic signal for properly allocating training time amongst goals.
While the majority of works in the field of intrinsically motivated open-ended learning focus on scenarios where goals are independent of each other, only a few have studied the autonomous acquisition of interdependent tasks.
In particular, we first deepen the analysis of a previous system, showing the importance of incorporating information about the relationships between tasks at a higher level of the architecture.
Then we introduce H-GRAIL, a new system that extends the previous one by adding a new learning layer to store the autonomously acquired sequences of tasks.
arXiv Detail & Related papers (2022-05-16T10:43:01Z)
- Center Loss Regularization for Continual Learning [0.0]
In general, neural networks lack the ability to learn different tasks sequentially without forgetting.
Our approach remembers old tasks by projecting the representations of new tasks close to those of old tasks.
We demonstrate that our approach is scalable, effective, and gives competitive performance compared to state-of-the-art continual learning methods.
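The standard center loss makes the projection idea concrete: each sample's representation is pulled toward its class center, so keeping old-class centers fixed while learning a new task encourages new representations to stay compatible with old ones. A minimal sketch, with assumed tensor shapes:

```python
# Standard center loss: pull each representation toward its class center.
import torch

def center_loss(features, labels, centers):
    return ((features - centers[labels]) ** 2).sum(dim=1).mean()

feats = torch.randn(8, 64)                  # penultimate-layer representations
labels = torch.randint(0, 10, (8,))
centers = torch.randn(10, 64)               # one (learned) center per class
loss = center_loss(feats, labels, centers)  # added to the classification loss
```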
arXiv Detail & Related papers (2021-10-21T17:46:44Z)
- Continual Learning via Bit-Level Information Preserving [88.32450740325005]
We study the continual learning process through the lens of information theory.
We propose Bit-Level Information Preserving (BLIP), which preserves the information gain on model parameters by updating them at the bit level.
BLIP achieves close to zero forgetting while only requiring constant memory overheads throughout continual learning.
arXiv Detail & Related papers (2021-05-10T15:09:01Z)
- Learning Invariant Representation for Continual Learning [5.979373021392084]
A key challenge in continual learning is catastrophically forgetting previously learned tasks when the agent faces a new one.
We propose a new pseudo-rehearsal-based method, named Learning Invariant Representation for Continual Learning (IRCL).
Disentangling the shared invariant representation helps to learn continually a sequence of tasks, while being more robust to forgetting and having better knowledge transfer.
arXiv Detail & Related papers (2021-01-15T15:12:51Z)
- Parrot: Data-Driven Behavioral Priors for Reinforcement Learning [79.32403825036792]
We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials.
We show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
arXiv Detail & Related papers (2020-11-19T18:47:40Z)
- Adversarial Continual Learning [99.56738010842301]
We propose a hybrid continual learning framework that learns a disjoint representation for task-invariant and task-specific features.
Our model combines architecture growth to prevent forgetting of task-specific skills and an experience replay approach to preserve shared skills.
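One common way to obtain the task-invariant part of such a disjoint representation is a gradient-reversal discriminator, as in domain-adversarial training. A minimal sketch under that assumption; ACL's full framework additionally grows task-specific modules and uses experience replay.

```python
# Gradient reversal for a task-invariant branch (domain-adversarial style;
# ACL additionally grows task-specific modules and uses replay).
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad):
        # The discriminator trains normally; the encoder receives a negated
        # gradient, pushing it toward features the discriminator cannot use
        # to tell tasks apart, i.e. task-invariant features.
        return -grad

encoder = nn.Linear(784, 128)        # shared, task-invariant feature extractor
discriminator = nn.Linear(128, 5)    # tries to predict the task identity

x = torch.randn(8, 784)
task_logits = discriminator(GradReverse.apply(encoder(x)))
loss = nn.functional.cross_entropy(task_logits, torch.randint(0, 5, (8,)))
loss.backward()                      # encoder gradients arrive reversed
```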
arXiv Detail & Related papers (2020-03-21T02:08:17Z)