Continual Learning by Modeling Intra-Class Variation
- URL: http://arxiv.org/abs/2210.05398v1
- Date: Tue, 11 Oct 2022 12:17:43 GMT
- Title: Continual Learning by Modeling Intra-Class Variation
- Authors: Longhui Yu, Tianyang Hu, Lanqing Hong, Zhen Liu, Adrian Weller,
Weiyang Liu
- Abstract summary: It has been observed that neural networks perform poorly when the data or tasks are presented sequentially.
Unlike humans, neural networks suffer greatly from catastrophic forgetting, making it impossible to perform life-long learning.
We examine memory-based continual learning and identify that large variation in the representation space is crucial for avoiding catastrophic forgetting.
- Score: 33.30614232534283
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It has been observed that neural networks perform poorly when the data or
tasks are presented sequentially. Unlike humans, neural networks suffer greatly
from catastrophic forgetting, making it impossible to perform life-long
learning. To address this issue, memory-based continual learning has been
actively studied and stands out as one of the best-performing methods. We
examine memory-based continual learning and identify that large variation in
the representation space is crucial for avoiding catastrophic forgetting.
Motivated by this, we propose to diversify representations by using two types
of perturbations: model-agnostic variation (i.e., the variation is generated
without the knowledge of the learned neural network) and model-based variation
(i.e., the variation is conditioned on the learned neural network). We
demonstrate that enlarging representational variation serves as a general
principle to improve continual learning. Finally, we perform empirical studies
which demonstrate that our method, as a simple plug-and-play component, can
consistently improve a number of memory-based continual learning methods by a
large margin.
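The abstract's two perturbation types can be illustrated with a minimal sketch. This is not the paper's actual implementation; it assumes NumPy, a linear softmax classifier as the "learned model", and hypothetical function names (`model_agnostic_perturb`, `model_based_perturb`). Model-agnostic variation is generated without reference to the network (here, isotropic Gaussian noise on stored features); model-based variation depends on the learned parameters (here, an FGSM-style step along the loss gradient with respect to the features).

```python
import numpy as np

rng = np.random.default_rng(0)

def model_agnostic_perturb(features, sigma=0.1):
    """Model-agnostic variation: Gaussian noise added to stored
    representations, independent of the learned network."""
    return features + sigma * rng.standard_normal(features.shape)

def model_based_perturb(features, weights, labels, eps=0.05):
    """Model-based variation (illustrative): perturb features along the
    cross-entropy gradient of a linear softmax classifier, so the
    perturbation direction is conditioned on the learned weights."""
    logits = features @ weights                        # (n, num_classes)
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    onehot = np.eye(weights.shape[1])[labels]
    grad = (probs - onehot) @ weights.T                # d(loss)/d(features)
    return features + eps * np.sign(grad)              # FGSM-style step
```

In a memory-based continual learner, such perturbations would be applied to replayed exemplars before the rehearsal loss is computed, enlarging the variation of the replayed representations.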
Related papers
- Neuromimetic metaplasticity for adaptive continual learning [2.1749194587826026]
We propose a metaplasticity model inspired by human working memory to achieve catastrophic forgetting-free continual learning.
A key aspect of our approach involves implementing distinct types of synapses from stable to flexible, and randomly intermixing them to train synaptic connections with different degrees of flexibility.
The model achieved a balanced tradeoff between memory capacity and performance without requiring additional training or structural modifications.
arXiv Detail & Related papers (2024-07-09T12:21:35Z)
- A multifidelity approach to continual learning for physical systems [1.4218223473363278]
We introduce a novel continual learning method based on multifidelity deep neural networks.
This method learns the correlation between the output of previously trained models and the desired output of the model on the current training dataset.
arXiv Detail & Related papers (2023-04-08T03:07:43Z)
- Error Sensitivity Modulation based Experience Replay: Mitigating Abrupt Representation Drift in Continual Learning [13.041607703862724]
We propose ESMER, which employs a principled mechanism to modulate error sensitivity in a dual-memory rehearsal-based system.
ESMER effectively reduces forgetting and abrupt drift in representations at the task boundary by gradually adapting to the new task while consolidating knowledge.
Remarkably, it also enables the model to learn under high levels of label noise, which is ubiquitous in real-world data streams.
arXiv Detail & Related papers (2023-02-14T16:35:54Z)
- Anti-Retroactive Interference for Lifelong Learning [65.50683752919089]
We design a paradigm for lifelong learning based on meta-learning and the associative mechanism of the brain.
It tackles the problem from two aspects: extracting knowledge and memorizing knowledge.
Theoretical analysis shows that the proposed learning paradigm can make the models of different tasks converge to the same optimum.
arXiv Detail & Related papers (2022-08-27T09:27:36Z)
- Continual Learning with Bayesian Model based on a Fixed Pre-trained Feature Extractor [55.9023096444383]
Current deep learning models are characterised by catastrophic forgetting of old knowledge when learning new classes.
Inspired by the process of learning new knowledge in human brains, we propose a Bayesian generative model for continual learning.
arXiv Detail & Related papers (2022-04-28T08:41:51Z)
- Probing Representation Forgetting in Supervised and Unsupervised Continual Learning [14.462797749666992]
Catastrophic forgetting is associated with an abrupt loss of knowledge previously learned by a model.
We show that representation forgetting can lead to new insights on the effect of model capacity and loss function used in continual learning.
arXiv Detail & Related papers (2022-03-24T23:06:08Z)
- Learning where to learn: Gradient sparsity in meta and continual learning [4.845285139609619]
We show that meta-learning can be improved by letting the learning algorithm decide which weights to change.
We find that patterned sparsity emerges from this process, with the pattern of sparsity varying on a problem-by-problem basis.
Our results shed light on an ongoing debate on whether meta-learning can discover adaptable features and suggest that learning by sparse gradient descent is a powerful inductive bias for meta-learning systems.
arXiv Detail & Related papers (2021-10-27T12:54:36Z)
- Dynamic Neural Diversification: Path to Computationally Sustainable Neural Networks [68.8204255655161]
Small neural networks with a constrained number of trainable parameters can be suitable resource-efficient candidates for many simple tasks.
We explore the diversity of the neurons within the hidden layer during the learning process.
We analyze how the diversity of the neurons affects predictions of the model.
arXiv Detail & Related papers (2021-09-20T15:12:16Z)
- Reducing Representation Drift in Online Continual Learning [87.71558506591937]
We study the online continual learning paradigm, where agents must learn from a changing distribution with constrained memory and compute.
In this work we instead focus on the change in representations of previously observed data due to the introduction of previously unobserved class samples in the incoming data stream.
arXiv Detail & Related papers (2021-04-11T15:19:30Z)
- Understanding the Role of Training Regimes in Continual Learning [51.32945003239048]
Catastrophic forgetting affects the training of neural networks, limiting their ability to learn multiple tasks sequentially.
We study the effect of dropout, learning rate decay, and batch size, on forming training regimes that widen the tasks' local minima.
arXiv Detail & Related papers (2020-06-12T06:00:27Z)
- The large learning rate phase of deep learning: the catapult mechanism [50.23041928811575]
We present a class of neural networks with solvable training dynamics.
We find good agreement between our model's predictions and training dynamics in realistic deep learning settings.
We believe our results shed light on characteristics of models trained at different learning rates.
arXiv Detail & Related papers (2020-03-04T17:52:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.