Enabling Continual Learning with Differentiable Hebbian Plasticity
- URL: http://arxiv.org/abs/2006.16558v1
- Date: Tue, 30 Jun 2020 06:42:19 GMT
- Title: Enabling Continual Learning with Differentiable Hebbian Plasticity
- Authors: Vithursan Thangarasa, Thomas Miconi, Graham W. Taylor
- Abstract summary: Continual learning is the problem of sequentially learning new tasks or knowledge while protecting previously acquired knowledge.
Catastrophic forgetting poses a grand challenge for neural networks performing such a learning process.
We propose a Differentiable Hebbian Consolidation model composed of a Differentiable Hebbian Plasticity (DHP) Softmax layer.
- Score: 18.12749708143404
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continual learning is the problem of sequentially learning new tasks or
knowledge while protecting previously acquired knowledge. However, catastrophic
forgetting poses a grand challenge for neural networks performing such a learning
process. Thus, neural networks that are deployed in the real world often
struggle in scenarios where the data distribution is non-stationary (concept
drift), imbalanced, or not always fully available, i.e., rare edge cases. We
propose a Differentiable Hebbian Consolidation model which is composed of a
Differentiable Hebbian Plasticity (DHP) Softmax layer that adds a rapid
learning plastic component (compressed episodic memory) to the fixed (slow
changing) parameters of the softmax output layer; enabling learned
representations to be retained for a longer timescale. We demonstrate the
flexibility of our method by integrating well-known task-specific synaptic
consolidation methods to penalize changes in the slow weights that are
important for each target task. We evaluate our approach on the Permuted MNIST,
Split MNIST and Vision Datasets Mixture benchmarks, and introduce an imbalanced
variant of Permuted MNIST -- a dataset that combines the challenges of class
imbalance and concept drift. Our proposed model requires no additional
hyperparameters and outperforms comparable baselines by reducing forgetting.
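As a rough illustration of the DHP Softmax idea described in the abstract, the sketch below computes the softmax pre-activations from the slow weights plus a plasticity-gated fast Hebbian trace, following the general differentiable-plasticity recipe. It is a minimal sketch, not the paper's implementation: the class name, the fixed decay rate eta, and the batch-averaged outer-product trace update are assumptions made here for brevity, and the paper's exact Hebbian update and its coupling with task-specific consolidation differ in detail.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DHPSoftmax(nn.Module):
    """Softmax output layer with slow weights plus a fast Hebbian trace.

    Pre-activations use (W + alpha * Hebb) h: W are the usual slow weights,
    Hebb is a fast Hebbian trace acting as a compressed episodic memory, and
    alpha is a per-connection plasticity coefficient trained by SGD.
    """

    def __init__(self, in_features: int, num_classes: int, eta: float = 0.1):
        super().__init__()
        self.slow = nn.Linear(in_features, num_classes)                           # slow weights and bias
        self.alpha = nn.Parameter(0.01 * torch.randn(num_classes, in_features))   # plasticity gates
        self.eta = eta                                                            # trace decay rate (assumed fixed)
        self.register_buffer("hebb", torch.zeros(num_classes, in_features))       # fast Hebbian trace

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # Effective weights = slow component + plasticity-gated fast component.
        w_eff = self.slow.weight + self.alpha * self.hebb
        logits = F.linear(h, w_eff, self.slow.bias)

        # Hebbian trace update: decaying running average of the outer product
        # of post-synaptic (softmax) and pre-synaptic activity, averaged over
        # the batch. Kept outside autograd here for simplicity; in the paper
        # the trace update itself remains differentiable.
        with torch.no_grad():
            post = F.softmax(logits, dim=-1)            # (batch, num_classes)
            outer = post.t() @ h / h.shape[0]           # (num_classes, in_features)
            self.hebb.mul_(1.0 - self.eta).add_(self.eta * outer)
        return logits
```

A task-specific synaptic consolidation penalty (e.g., a quadratic term on changes to the important slow weights) would then be added to the training loss, mirroring how the paper integrates existing consolidation methods.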
Related papers
- SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative METL strategy called SHERL for resource-limited scenarios.
In the early route, intermediate outputs are consolidated via an anti-redundancy operation.
In the late route, utilizing minimal late pre-trained layers could alleviate the peak memory overhead.
arXiv Detail & Related papers (2024-07-10T10:22:35Z) - Neuromimetic metaplasticity for adaptive continual learning [2.1749194587826026]
We propose a metaplasticity model inspired by human working memory to achieve catastrophic forgetting-free continual learning.
A key aspect of our approach is implementing distinct types of synapses, ranging from stable to flexible, and randomly intermixing them to train synaptic connections with different degrees of flexibility (a toy sketch of this mixing appears after this list).
The model achieved a balanced tradeoff between memory capacity and performance without requiring additional training or structural modifications.
arXiv Detail & Related papers (2024-07-09T12:21:35Z) - Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce a novel prior-based method, Bayesian Adaptive Moment Regularization (BAdam), that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
arXiv Detail & Related papers (2023-09-15T17:10:51Z) - Complementary Learning Subnetworks for Parameter-Efficient
Class-Incremental Learning [40.13416912075668]
We propose a rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks.
Our method achieves competitive results against state-of-the-art methods, especially in terms of accuracy gain, memory cost, training efficiency, and task-order robustness.
arXiv Detail & Related papers (2023-06-21T01:43:25Z) - Switching Autoregressive Low-rank Tensor Models [12.461139675114818]
We propose switching autoregressive low-rank tensor (SALT) models.
SALT parameterizes the tensor of an ARHMM with a low-rank factorization to control the number of parameters (a minimal sketch of such a factorization appears after this list).
We prove theoretical connections and discuss practical ones between SALT, linear dynamical systems, and switching linear dynamical systems (SLDSs).
arXiv Detail & Related papers (2023-06-05T22:25:28Z) - FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z) - Learning Bayesian Sparse Networks with Full Experience Replay for
Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to only activate and select sparse neurons for learning current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z) - Reducing Catastrophic Forgetting in Self Organizing Maps with
Internally-Induced Generative Replay [67.50637511633212]
A lifelong learning agent is able to continually learn from potentially infinite streams of pattern sensory data.
One major historic difficulty in building agents that adapt is that neural systems struggle to retain previously-acquired knowledge when learning from new samples.
This problem is known as catastrophic forgetting (interference) and remains an unsolved problem in the domain of machine learning to this day.
arXiv Detail & Related papers (2021-12-09T07:11:14Z) - Understanding Self-supervised Learning with Dual Deep Networks [74.92916579635336]
We propose a novel framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks.
We prove that in each SGD update of SimCLR with various loss functions, the weights at each layer are updated by a covariance operator.
To further study what role the covariance operator plays and which features are learned in such a process, we model data generation and augmentation processes through a hierarchical latent tree model (HLTM).
arXiv Detail & Related papers (2020-10-01T17:51:49Z) - Neuromodulated Neural Architectures with Local Error Signals for
Memory-Constrained Online Continual Learning [4.2903672492917755]
We develop a biologically inspired, lightweight neural network architecture that incorporates local learning and neuromodulation.
We demonstrate the efficacy of our approach in both single-task and continual learning settings.
arXiv Detail & Related papers (2020-07-16T07:41:23Z)
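The neuromimetic metaplasticity entry above intermixes stable and flexible synapses. The toy sketch below assumes flexibility can be reduced to a fixed per-weight learning-rate scale drawn once at initialization; the function names, the 50/50 mix, and the 0.01 stable scale are illustrative choices, not the paper's actual metaplasticity rule.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def flexibility_mask(shape, p_flexible=0.5, stable_scale=0.01):
    """Randomly tag each synapse as flexible (scale 1.0) or stable (small scale)."""
    flexible = torch.rand(shape) < p_flexible
    return torch.where(flexible, torch.ones(shape), torch.full(shape, stable_scale))

layer = nn.Linear(784, 10)
flex = flexibility_mask(layer.weight.shape)   # drawn once and kept fixed across tasks

def sgd_step(lr=0.1):
    """Manual SGD step in which each weight's update is scaled by its flexibility."""
    with torch.no_grad():
        layer.weight -= lr * flex * layer.weight.grad
        layer.bias -= lr * layer.bias.grad
    layer.weight.grad, layer.bias.grad = None, None

# One toy update on random data: stable synapses barely move, flexible ones adapt.
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
F.cross_entropy(layer(x), y).backward()
sgd_step()
```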
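The SALT entry above controls parameter count by factorizing an ARHMM's autoregressive tensor. Below is a minimal sketch of one such parameterization, a plain rank-r factorization of the stacked lag matrices for a single discrete state; SALT's actual factorization and its switching machinery over HMM states are more involved, and the class name here is illustrative.

```python
import torch
import torch.nn as nn

class LowRankAR(nn.Module):
    """Autoregressive prediction x_t ≈ A [x_{t-1}; ...; x_{t-L}] + b with A = U V.

    A full AR matrix over L lags has dim x (dim * L) entries; the rank-r
    factorization U (dim x r) times V (r x dim * L) needs only r * (dim + dim * L).
    """

    def __init__(self, dim: int, lags: int, rank: int):
        super().__init__()
        self.U = nn.Parameter(0.01 * torch.randn(dim, rank))
        self.V = nn.Parameter(0.01 * torch.randn(rank, dim * lags))
        self.b = nn.Parameter(torch.zeros(dim))

    def forward(self, history: torch.Tensor) -> torch.Tensor:
        # history: (batch, lags, dim) holding the window of past states.
        flat = history.reshape(history.shape[0], -1)      # (batch, lags * dim)
        return flat @ (self.U @ self.V).t() + self.b      # predicted x_t, shape (batch, dim)
```

In the full switching model, each discrete hidden state would carry its own factorized dynamics, with the HMM determining which one generates each time step.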
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.