Self-Attention Meta-Learner for Continual Learning
- URL: http://arxiv.org/abs/2101.12136v1
- Date: Thu, 28 Jan 2021 17:35:04 GMT
- Title: Self-Attention Meta-Learner for Continual Learning
- Authors: Ghada Sokar, Decebal Constantin Mocanu, Mykola Pechenizkiy
- Abstract summary: Self-Attention Meta-Learner (SAM) learns prior knowledge for continual learning that permits learning a sequence of tasks.
SAM incorporates an attention mechanism that learns to select the representations relevant to each future task.
We evaluate the proposed method on the Split CIFAR-10/100 and Split MNIST benchmarks in the task-agnostic inference setting.
- Score: 5.979373021392084
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continual learning aims to provide intelligent agents capable of learning
multiple tasks sequentially with neural networks. One of its main challenges,
catastrophic forgetting, is caused by the neural network's suboptimal ability
to learn under non-stationary distributions. In most current approaches, the
agent starts from randomly initialized parameters and is optimized to master
the current task regardless of how useful the learned representation is for
future tasks. Moreover, each future task uses all of the previously learned
knowledge, although parts of this knowledge might not be helpful for its
learning. These issues cause interference among tasks, especially when the
data of previous tasks is not accessible. In this paper, we propose a new
method, named Self-Attention Meta-Learner (SAM), which learns prior knowledge
for continual learning that permits learning a sequence of tasks while
avoiding catastrophic forgetting. SAM incorporates an attention mechanism that
learns to select the representations relevant to each future task. Each task
builds a specific representation branch on top of the selected knowledge,
avoiding interference between tasks. We evaluate the proposed method on the
Split CIFAR-10/100 and Split MNIST benchmarks in the task-agnostic inference
setting. We empirically show that we can achieve better performance than
several state-of-the-art continual learning methods by building on top of the
selected representations learned by SAM. We also show the role of the
meta-attention mechanism in boosting informative features corresponding to the
input data and in identifying the correct target in task-agnostic inference.
Finally, we demonstrate that popular existing continual learning methods gain
a performance boost when they adopt SAM as a starting point.
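A minimal sketch of the idea in PyTorch (illustrative only, not the authors' code): a shared backbone carries the meta-learned prior, a sigmoid-gated attention module stands in for SAM's self-attention to select the relevant features, and each task adds its own branch on top of the selected representation. All module names and sizes are assumptions.

```python
# Minimal sketch of the SAM idea (illustrative, not the authors' code).
import torch
import torch.nn as nn

class SelfAttentionBlock(nn.Module):
    """Learns to boost the features that are relevant to the current input."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(),
            nn.Linear(hidden, dim), nn.Sigmoid(),
        )

    def forward(self, h):
        return h * self.gate(h)  # element-wise selection of representations

class SAMNet(nn.Module):
    def __init__(self, in_dim=784, feat_dim=256, num_tasks=5, classes_per_task=2):
        super().__init__()
        # Shared backbone holding the meta-learned prior knowledge.
        self.backbone = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())
        self.attention = SelfAttentionBlock(feat_dim)
        # Each task builds its own branch on top of the selected knowledge.
        self.heads = nn.ModuleList(
            nn.Linear(feat_dim, classes_per_task) for _ in range(num_tasks)
        )

    def forward(self, x, task_id):
        h = self.attention(self.backbone(x))  # select relevant representation
        return self.heads[task_id](h)         # task-specific branch

net = SAMNet()
print(net(torch.randn(8, 784), task_id=0).shape)  # torch.Size([8, 2])
```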
Related papers
- Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce BAdam, a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
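For illustration, here is the generic form of prior-based penalty this family of methods builds on: an EWC-style quadratic term pulling parameters toward values learned on earlier tasks. BAdam's actual Bayesian moment-based update differs; this sketch and its names are assumptions.

```python
# Generic prior-based penalty constraining parameter growth (EWC-style;
# BAdam's actual Bayesian moment-based update differs -- see the paper).
import torch
import torch.nn as nn

def prior_penalty(model, prior_means, precisions, strength=100.0):
    """Quadratic penalty pulling each parameter toward its prior mean,
    weighted by how important it was for previously learned tasks."""
    loss = torch.zeros(())
    for name, p in model.named_parameters():
        loss = loss + (precisions[name] * (p - prior_means[name]) ** 2).sum()
    return strength * loss

model = nn.Linear(10, 2)
means = {n: p.detach().clone() for n, p in model.named_parameters()}
precs = {n: torch.ones_like(p) for n, p in model.named_parameters()}
task_loss = nn.functional.mse_loss(model(torch.randn(4, 10)), torch.randn(4, 2))
total_loss = task_loss + prior_penalty(model, means, precs)
```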
arXiv Detail & Related papers (2023-09-15T17:10:51Z)
- Multitask Learning with No Regret: from Improved Confidence Bounds to Active Learning [79.07658065326592]
Quantifying uncertainty in the estimated tasks is of pivotal importance for many downstream applications, such as online or active learning.
We provide novel multitask confidence intervals in the challenging setting when neither the similarity between tasks nor the tasks' features are available to the learner.
We propose a novel online learning algorithm that achieves the improved regret without knowing the task-similarity parameter in advance.
arXiv Detail & Related papers (2023-08-03T13:08:09Z)
- Online Continual Learning via the Knowledge Invariant and Spread-out Properties [4.109784267309124]
A key challenge in continual learning is catastrophic forgetting.
We propose a new method, named Online Continual Learning via the Knowledge Invariant and Spread-out Properties (OCLKISP).
We empirically evaluate our proposed method on four popular benchmarks for continual learning: Split CIFAR-100, Split SVHN, Split CUB-200 and Split Tiny-ImageNet.
arXiv Detail & Related papers (2023-02-02T04:03:38Z)
- Hierarchically Structured Task-Agnostic Continual Learning [0.0]
We take a task-agnostic view of continual learning and develop a hierarchical information-theoretic optimality principle.
We propose a neural network layer, called the Mixture-of-Variational-Experts layer, that alleviates forgetting by creating a set of information processing paths.
Our approach can operate in a task-agnostic way, i.e., it does not require task-specific knowledge, unlike many existing continual learning algorithms.
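A simplified sketch of the routing idea: a gate blends a set of expert processing paths per input, so new data can recruit different paths instead of overwriting the ones old tasks rely on. The paper's variational, information-theoretic gating is replaced here by a plain softmax gate; all names are illustrative.

```python
# Simplified mixture-of-experts layer with a softmax gate (the paper's
# variational, information-theoretic gating is omitted for brevity).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixtureOfExpertsLayer(nn.Module):
    def __init__(self, dim, num_experts=4):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))
        self.gate = nn.Linear(dim, num_experts)

    def forward(self, x):
        weights = F.softmax(self.gate(x), dim=-1)                  # (batch, experts)
        outs = torch.stack([e(x) for e in self.experts], dim=-1)   # (batch, dim, experts)
        # Blend the expert paths per input according to the gate.
        return torch.einsum("bde,be->bd", outs, weights)

layer = MixtureOfExpertsLayer(32)
print(layer(torch.randn(8, 32)).shape)  # torch.Size([8, 32])
```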
arXiv Detail & Related papers (2022-11-14T19:53:15Z)
- TAME: Task Agnostic Continual Learning using Multiple Experts [9.89894348908034]
This paper focuses on a so-called task-agnostic setting where the task identities are not known and the learning machine needs to infer them from the observations.
Our algorithm, which we call TAME (Task-Agnostic continual learning using Multiple Experts), automatically detects the shift in data distributions and switches between task expert networks in an online manner.
Our experimental results show the efficacy of our approach on benchmark continual learning data sets, outperforming previous task-agnostic methods and even techniques that are given task identities at both training and test time.
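A toy sketch of the online switching idea: flag a distribution shift when the current loss spikes well above its running average, then switch to (or spawn) another expert. The detection statistic here is an assumption for illustration; TAME's actual detector differs.

```python
# Toy task-shift detector: flag a shift when the current loss spikes well
# above its running average (TAME's actual statistic differs -- see paper).
class ExpertSwitcher:
    def __init__(self, threshold=2.0, momentum=0.99):
        self.avg = None
        self.threshold = threshold
        self.momentum = momentum

    def shift_detected(self, loss):
        if self.avg is None:
            self.avg = loss
            return False
        shifted = loss > self.threshold * self.avg
        self.avg = self.momentum * self.avg + (1 - self.momentum) * loss
        return shifted

switcher = ExpertSwitcher()
for step_loss in [0.9, 0.8, 0.7, 2.5]:       # the spike marks a new task
    if switcher.shift_detected(step_loss):
        print("shift detected: switch to (or spawn) another expert network")
```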
arXiv Detail & Related papers (2022-10-08T01:23:30Z)
- Autonomous Open-Ended Learning of Tasks with Non-Stationary Interdependencies [64.0476282000118]
Intrinsic motivations have been shown to generate a task-agnostic signal for properly allocating training time amongst goals.
While the majority of works in the field of intrinsically motivated open-ended learning focus on scenarios where goals are independent of each other, only a few have studied the autonomous acquisition of interdependent tasks.
In particular, we first deepen the analysis of a previous system, showing the importance of incorporating information about the relationships between tasks at a higher level of the architecture.
Then we introduce H-GRAIL, a new system that extends the previous one by adding a new learning layer to store the autonomously acquired sequences of tasks.
arXiv Detail & Related papers (2022-05-16T10:43:01Z)
- Center Loss Regularization for Continual Learning [0.0]
In general, neural networks lack the ability to learn different tasks sequentially without forgetting.
Our approach remembers old tasks by projecting the representations of new tasks close to those of old tasks.
We demonstrate that our approach is scalable, effective, and gives competitive performance compared to state-of-the-art continual learning methods.
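The standard center loss makes the projection idea concrete: each sample's representation is pulled toward its class center, so keeping old-class centers fixed while learning a new task encourages new representations to stay compatible with old ones. A minimal sketch, with assumed tensor shapes:

```python
# Standard center loss: pull each representation toward its class center.
import torch

def center_loss(features, labels, centers):
    return ((features - centers[labels]) ** 2).sum(dim=1).mean()

feats = torch.randn(8, 64)                  # penultimate-layer representations
labels = torch.randint(0, 10, (8,))
centers = torch.randn(10, 64)               # one (learned) center per class
loss = center_loss(feats, labels, centers)  # added to the classification loss
```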
arXiv Detail & Related papers (2021-10-21T17:46:44Z)
- Continual Learning via Bit-Level Information Preserving [88.32450740325005]
We study the continual learning process through the lens of information theory.
We propose Bit-Level Information Preserving (BLIP), which preserves the information gain on model parameters by updating them at the bit level.
BLIP achieves close to zero forgetting while only requiring constant memory overheads throughout continual learning.
arXiv Detail & Related papers (2021-05-10T15:09:01Z)
- Learning Invariant Representation for Continual Learning [5.979373021392084]
A key challenge in continual learning is catastrophically forgetting previously learned tasks when the agent faces a new one.
We propose a new pseudo-rehearsal-based method, named Learning Invariant Representation for Continual Learning (IRCL).
Disentangling the shared invariant representation helps to learn continually a sequence of tasks, while being more robust to forgetting and having better knowledge transfer.
arXiv Detail & Related papers (2021-01-15T15:12:51Z)
- Parrot: Data-Driven Behavioral Priors for Reinforcement Learning [79.32403825036792]
We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials.
We show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
arXiv Detail & Related papers (2020-11-19T18:47:40Z)
- Adversarial Continual Learning [99.56738010842301]
We propose a hybrid continual learning framework that learns a disjoint representation for task-invariant and task-specific features.
Our model combines architecture growth to prevent forgetting of task-specific skills and an experience replay approach to preserve shared skills.
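One common way to obtain the task-invariant part of such a disjoint representation is a gradient-reversal discriminator, as in domain-adversarial training. A minimal sketch under that assumption; ACL's full framework additionally grows task-specific modules and uses experience replay.

```python
# Gradient reversal for a task-invariant branch (domain-adversarial style;
# ACL additionally grows task-specific modules and uses replay).
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad):
        # The discriminator trains normally; the encoder receives a negated
        # gradient, pushing it toward features the discriminator cannot use
        # to tell tasks apart, i.e. task-invariant features.
        return -grad

encoder = nn.Linear(784, 128)        # shared, task-invariant feature extractor
discriminator = nn.Linear(128, 5)    # tries to predict the task identity

x = torch.randn(8, 784)
task_logits = discriminator(GradReverse.apply(encoder(x)))
loss = nn.functional.cross_entropy(task_logits, torch.randint(0, 5, (8,)))
loss.backward()                      # encoder gradients arrive reversed
```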
arXiv Detail & Related papers (2020-03-21T02:08:17Z)