Gradient Episodic Memory with a Soft Constraint for Continual Learning
- URL: http://arxiv.org/abs/2011.07801v1
- Date: Mon, 16 Nov 2020 09:06:09 GMT
- Title: Gradient Episodic Memory with a Soft Constraint for Continual Learning
- Authors: Guannan Hu, Wu Zhang, Hu Ding, Wenhao Zhu
- Abstract summary: Catastrophic forgetting is the severe drop in performance on previous tasks that occurs when the model is learning a novel task.
We propose an average gradient episodic memory (A-GEM) with a soft constraint $\epsilon \in [0, 1]$, which is a balance factor between learning new knowledge and preserving learned knowledge.
$\epsilon$-SOFT-GEM outperforms A-GEM and several continual learning benchmarks in a single training epoch.
- Score: 9.52644009921388
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Catastrophic forgetting in continual learning is a common destructive
phenomenon in gradient-based neural networks that learn sequential tasks, and
it is much different from forgetting in humans, who can learn and accumulate
knowledge throughout their whole lives. Catastrophic forgetting is a fatal
shortcoming: a large decrease in performance on previous tasks when the model
is learning a novel task. To alleviate this problem, the model should have the
capacity to learn new knowledge and preserve learned knowledge. We propose an
average gradient episodic memory (A-GEM) with a soft constraint $\epsilon \in
[0, 1]$, which is a balance factor between learning new knowledge and
preserving learned knowledge; our method is called gradient episodic memory
with a soft constraint $\epsilon$ ($\epsilon$-SOFT-GEM). $\epsilon$-SOFT-GEM
outperforms A-GEM and several continual learning benchmarks in a single
training epoch; additionally, it has state-of-the-art average accuracy and
efficiency for computation and memory, like A-GEM, and provides a better
trade-off between the stability of preserving learned knowledge and the
plasticity of learning new knowledge.
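The abstract does not spell out the exact update rule, so the following is a minimal NumPy sketch under one plausible reading: the A-GEM projection is applied as published, and the balance factor $\epsilon \in [0, 1]$ linearly blends the raw new-task gradient with the projected, memory-respecting direction. The function names and the blending rule in `soft_gem_update` are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def a_gem_project(g, g_ref):
    """A-GEM projection: if the current-task gradient g conflicts with the
    average episodic-memory gradient g_ref (negative inner product), project g
    so that the loss on the memory does not increase to first order."""
    dot = float(np.dot(g, g_ref))
    if dot >= 0.0:
        return g  # no conflict: keep the new-task gradient unchanged
    return g - (dot / float(np.dot(g_ref, g_ref))) * g_ref

def soft_gem_update(g, g_ref, eps):
    """Hypothetical epsilon-SOFT-GEM step (a sketch, not the paper's exact rule):
    blend the raw gradient with the A-GEM projected gradient, with eps in [0, 1]
    trading plasticity (eps -> 1 keeps g) against stability (eps -> 0 keeps the
    memory-respecting direction)."""
    return eps * g + (1.0 - eps) * a_gem_project(g, g_ref)

# Toy usage with a 3-parameter "model": the two gradients conflict here,
# so the projection is active and eps controls how much of it is applied.
g = np.array([1.0, -2.0, 0.5])      # gradient on the current task batch
g_ref = np.array([-0.5, 1.0, 1.0])  # average gradient on the episodic memory
update = soft_gem_update(g, g_ref, eps=0.5)
```

In this sketch, $\epsilon = 0$ recovers plain A-GEM (maximal preservation of learned knowledge) and $\epsilon = 1$ ignores the memory gradient entirely (maximal plasticity), which matches the abstract's description of $\epsilon$ as a balance factor.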
Related papers
- Fine-Grained Gradient Restriction: A Simple Approach for Mitigating Catastrophic Forgetting [41.891312602770746]
Gradient Episodic Memory (GEM) achieves balance by utilizing a subset of past training samples to restrict the update direction of the model parameters.
We show that memory strength is effective mainly because it improves GEM's generalization ability and therefore leads to a more favorable trade-off (the GEM projection is written out after this entry).
arXiv Detail & Related papers (2024-10-01T17:03:56Z)
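For context on the projection that both this entry and the main paper build on, the GEM update can be written as a small constrained quadratic program over the episodic-memory gradients, and A-GEM relaxes it to a single average-gradient constraint with a closed-form solution. This is the standard formulation from the GEM and A-GEM papers, reproduced here for reference:

```latex
% GEM: project the current gradient g at task t onto the set of directions
% that do not increase the loss on the memory of any previous task k < t.
\min_{\tilde g}\ \tfrac{1}{2}\lVert g - \tilde g \rVert_2^2
\quad \text{s.t.}\quad \langle \tilde g,\ g_k \rangle \ge 0 \quad \forall k < t

% A-GEM: a single constraint against the average memory gradient g_ref,
% solved in closed form whenever the constraint is violated.
\tilde g =
\begin{cases}
g - \dfrac{g^{\top} g_{\mathrm{ref}}}{g_{\mathrm{ref}}^{\top} g_{\mathrm{ref}}}\, g_{\mathrm{ref}}, & g^{\top} g_{\mathrm{ref}} < 0 \\[4pt]
g, & \text{otherwise}
\end{cases}
```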
- Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
arXiv Detail & Related papers (2023-09-15T17:10:51Z)
- Adaptively Integrated Knowledge Distillation and Prediction Uncertainty for Continual Learning [71.43841235954453]
Current deep learning models often suffer from catastrophic forgetting of old knowledge when continually learning new knowledge.
Existing strategies to alleviate this issue often fix the trade-off between keeping old knowledge (stability) and learning new knowledge (plasticity).
arXiv Detail & Related papers (2023-01-18T05:36:06Z)
- Anti-Retroactive Interference for Lifelong Learning [65.50683752919089]
We design a paradigm for lifelong learning based on meta-learning and associative mechanism of the brain.
It tackles the problem from two aspects: extracting knowledge and memorizing knowledge.
A theoretical analysis shows that the proposed learning paradigm can make the models of different tasks converge to the same optimum.
arXiv Detail & Related papers (2022-08-27T09:27:36Z)
- Continual Learning with Bayesian Model based on a Fixed Pre-trained Feature Extractor [55.9023096444383]
Current deep learning models are characterised by catastrophic forgetting of old knowledge when learning new classes.
Inspired by the process of learning new knowledge in human brains, we propose a Bayesian generative model for continual learning.
arXiv Detail & Related papers (2022-04-28T08:41:51Z)
- Continual learning of quantum state classification with gradient episodic memory [0.20646127669654826]
A phenomenon called catastrophic forgetting emerges when a machine learning model is trained across multiple tasks.
Some continual learning strategies have been proposed to address the catastrophic forgetting problem.
In this work, we incorporate the gradient episodic memory method to train a variational quantum classifier.
arXiv Detail & Related papers (2022-03-26T09:28:26Z)
- Learning Fast, Learning Slow: A General Continual Learning Method based on Complementary Learning System [13.041607703862724]
We propose CLS-ER, a novel dual memory experience replay (ER) method.
New knowledge is acquired while aligning the decision boundaries with the semantic memories.
Our approach achieves state-of-the-art performance on standard benchmarks.
arXiv Detail & Related papers (2022-01-29T15:15:23Z)
- Reducing Catastrophic Forgetting in Self Organizing Maps with Internally-Induced Generative Replay [67.50637511633212]
A lifelong learning agent is able to continually learn from potentially infinite streams of pattern sensory data.
One major historic difficulty in building agents that adapt is that neural systems struggle to retain previously-acquired knowledge when learning from new samples.
This problem is known as catastrophic forgetting (interference) and remains an unsolved problem in the domain of machine learning to this day.
arXiv Detail & Related papers (2021-12-09T07:11:14Z)
- Schematic Memory Persistence and Transience for Efficient and Robust Continual Learning [8.030924531643532]
Continual learning is considered a promising step towards next-generation Artificial Intelligence (AI).
It is still quite primitive, with existing works focusing primarily on avoiding (catastrophic) forgetting.
We propose a novel framework for continual learning with external memory that builds on recent advances in neuroscience.
arXiv Detail & Related papers (2021-05-05T14:32:47Z)
- Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting [135.0863818867184]
Artificial neural variability (ANV) helps artificial neural networks learn some advantages from "natural" neural networks.
ANV acts as an implicit regularizer of the mutual information between the training data and the learned model.
It can effectively relieve overfitting, label noise memorization, and catastrophic forgetting at negligible costs.
arXiv Detail & Related papers (2020-11-12T06:06:33Z)
- Self-Supervised Learning Aided Class-Incremental Lifelong Learning [17.151579393716958]
We study the issue of catastrophic forgetting in class-incremental learning (Class-IL).
In the training procedure of Class-IL, since the model has no knowledge of subsequent tasks, it extracts only the features necessary for the tasks learned so far, and this information is insufficient for joint classification.
We propose to combine self-supervised learning, which can provide effective representations without requiring labels, with Class-IL to partly get around this problem.
arXiv Detail & Related papers (2020-06-10T15:15:27Z)