NISPA: Neuro-Inspired Stability-Plasticity Adaptation for Continual Learning in Sparse Networks
- URL: http://arxiv.org/abs/2206.09117v1
- Date: Sat, 18 Jun 2022 04:56:49 GMT
- Title: NISPA: Neuro-Inspired Stability-Plasticity Adaptation for Continual Learning in Sparse Networks
- Authors: Mustafa Burak Gurbuz and Constantine Dovrolis
- Abstract summary: NISPA architecture forms stable paths to preserve learned knowledge from older tasks.
NISPA significantly outperforms state-of-the-art continual learning baselines.
- Score: 6.205922305859479
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The goal of continual learning (CL) is to learn different tasks over time.
The main desiderata associated with CL are to maintain performance on older
tasks, to leverage knowledge from those tasks to improve learning of future tasks, and to
introduce minimal overhead in the training process (for instance, not
requiring a growing model or retraining). We propose the Neuro-Inspired
Stability-Plasticity Adaptation (NISPA) architecture that addresses these
desiderata through a sparse neural network with fixed density. NISPA forms
stable paths to preserve learned knowledge from older tasks. Also, NISPA uses
connection rewiring to create new plastic paths that reuse existing knowledge
on novel tasks. Our extensive evaluation on EMNIST, FashionMNIST, CIFAR10, and
CIFAR100 datasets shows that NISPA significantly outperforms representative
state-of-the-art continual learning baselines, and it uses up to ten times
fewer learnable parameters compared to baselines. We also make the case that
sparsity is an essential ingredient for continual learning. The NISPA code is
available at https://github.com/BurakGurbuz97/NISPA.
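As a rough, hedged illustration of the stability-plasticity mechanism described above, the Python sketch below shows a fixed-density sparse layer in which units flagged as stable have the gradients of their incoming weights zeroed, while a fraction of the connections on plastic units is dropped and regrown between tasks. The class name, the density and rewiring fractions, and the random drop/grow policy are illustrative assumptions, not the authors' actual unit-selection or rewiring rules.

```python
import torch
import torch.nn.functional as F
from torch import nn


class SparseRewiringLayer(nn.Module):
    """Toy fixed-density sparse layer: a binary mask keeps the density constant,
    units flagged as stable keep their incoming weights frozen, and a fraction of
    the connections on plastic units is rewired between tasks (illustrative only)."""

    def __init__(self, in_features: int, out_features: int, density: float = 0.2):
        super().__init__()
        self.weight = nn.Parameter(0.01 * torch.randn(out_features, in_features))
        # Fixed-density random connectivity mask (1 = connection exists).
        self.register_buffer("mask", (torch.rand(out_features, in_features) < density).float())
        # Units marked stable preserve knowledge from earlier tasks.
        self.register_buffer("stable_units", torch.zeros(out_features, dtype=torch.bool))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.linear(x, self.weight * self.mask)

    def freeze_stable_gradients(self) -> None:
        """Zero the gradients of all incoming weights of stable units (call after backward())."""
        if self.weight.grad is not None:
            self.weight.grad[self.stable_units] = 0.0

    @torch.no_grad()
    def rewire(self, fraction: float = 0.1) -> None:
        """Drop a fraction of the active connections on plastic units and grow the
        same number of new ones, so the overall density stays fixed."""
        plastic_rows = (~self.stable_units).unsqueeze(1)
        active = (self.mask > 0) & plastic_rows
        inactive = (self.mask == 0) & plastic_rows
        n = min(int(fraction * int(active.sum())), int(inactive.sum()))
        if n == 0:
            return
        drop = torch.nonzero(active)[torch.randperm(int(active.sum()))[:n]]
        grow = torch.nonzero(inactive)[torch.randperm(int(inactive.sum()))[:n]]
        self.mask[drop[:, 0], drop[:, 1]] = 0.0
        self.mask[grow[:, 0], grow[:, 1]] = 1.0
        # New connections start from zero and act as plastic paths for the next task.
        self.weight[grow[:, 0], grow[:, 1]] = 0.0
```

Dropping and growing the same number of connections is what keeps the overall density fixed, mirroring the fixed-density constraint stated in the abstract; deciding which units become stable is where the actual NISPA algorithm does the real work, which this toy version leaves to the caller.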
Related papers
- Beyond Prompt Learning: Continual Adapter for Efficient Rehearsal-Free Continual Learning [22.13331870720021]
We propose an approach that goes beyond prompt learning for the rehearsal-free continual learning (RFCL) task, called Continual Adapter (C-ADA).
C-ADA flexibly extends specific weights in CAL to learn new knowledge for each task and freezes old weights to preserve prior knowledge (a toy sketch of this freeze-and-extend idea follows the entry).
Our approach achieves significantly improved performance and training speed, outperforming the current state-of-the-art (SOTA) method.
arXiv Detail & Related papers (2024-07-14T17:40:40Z)
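A minimal sketch of the freeze-and-extend pattern mentioned above, assuming a generic residual bottleneck adapter rather than the paper's actual CAL design; the module name, adapter width, and residual-sum combination are illustrative assumptions.

```python
import torch
from torch import nn


class ContinualAdapter(nn.Module):
    """Toy adapter block: each new task appends a small trainable bottleneck and
    freezes the adapters learned for earlier tasks (illustrative only)."""

    def __init__(self, dim: int, adapter_dim: int = 16):
        super().__init__()
        self.dim, self.adapter_dim = dim, adapter_dim
        self.down_projs = nn.ModuleList()  # one bottleneck per task
        self.up_projs = nn.ModuleList()

    def add_task(self) -> None:
        # Freeze the weights learned for all previous tasks.
        for layer in list(self.down_projs) + list(self.up_projs):
            for p in layer.parameters():
                p.requires_grad_(False)
        # Extend with fresh trainable weights for the new task.
        self.down_projs.append(nn.Linear(self.dim, self.adapter_dim))
        self.up_projs.append(nn.Linear(self.adapter_dim, self.dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual sum over all task adapters; frozen ones contribute prior knowledge.
        out = x
        for down, up in zip(self.down_projs, self.up_projs):
            out = out + up(torch.relu(down(x)))
        return out


# Hypothetical usage: call add_task() before training on each new task.
adapter = ContinualAdapter(dim=768)
adapter.add_task()
features = adapter(torch.randn(4, 768))
```

Freezing earlier adapters keeps prior-task knowledge intact while only the newly appended weights absorb the new task.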
- Improving Representational Continuity via Continued Pretraining [76.29171039601948]
Linear probing followed by fine-tuning (LP-FT), a technique from the transfer learning community, outperforms naive training and other continual learning methods.
LP-FT also reduces forgetting on a real-world satellite remote sensing dataset (FMoW).
A variant of LP-FT achieves state-of-the-art accuracy on an NLP continual learning benchmark; a minimal two-stage sketch follows the entry.
arXiv Detail & Related papers (2023-02-26T10:39:38Z)
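A hedged sketch of the two-stage LP-FT recipe: linear probing with a frozen backbone, then full fine-tuning at a smaller learning rate. The function name, optimizers, learning rates, and epoch counts are illustrative assumptions, not the paper's exact training setup.

```python
import torch
from torch import nn


def lp_ft(backbone: nn.Module, head: nn.Linear, loader,
          probe_epochs: int = 5, ft_epochs: int = 5) -> None:
    """Two-stage LP-FT: (1) linear probing with a frozen backbone,
    (2) full fine-tuning at a smaller learning rate."""
    criterion = nn.CrossEntropyLoss()

    # Stage 1: linear probe -- only the head is trained.
    for p in backbone.parameters():
        p.requires_grad_(False)
    probe_opt = torch.optim.Adam(head.parameters(), lr=1e-3)
    for _ in range(probe_epochs):
        for x, y in loader:
            loss = criterion(head(backbone(x)), y)
            probe_opt.zero_grad()
            loss.backward()
            probe_opt.step()

    # Stage 2: fine-tune everything, starting from the probed head,
    # with a smaller learning rate to limit feature distortion.
    for p in backbone.parameters():
        p.requires_grad_(True)
    ft_opt = torch.optim.Adam(list(backbone.parameters()) + list(head.parameters()), lr=1e-5)
    for _ in range(ft_epochs):
        for x, y in loader:
            loss = criterion(head(backbone(x)), y)
            ft_opt.zero_grad()
            loss.backward()
            ft_opt.step()
```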
- Training Spiking Neural Networks with Local Tandem Learning [96.32026780517097]
Spiking neural networks (SNNs) are shown to be more biologically plausible and energy efficient than their predecessors.
In this paper, we put forward a generalized learning rule, termed Local Tandem Learning (LTL).
We demonstrate rapid network convergence within five training epochs on the CIFAR-10 dataset while having low computational complexity.
arXiv Detail & Related papers (2022-10-10T10:05:00Z)
- Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to activate and select only a sparse set of neurons for learning the current and past tasks at any stage (a toy top-k activation sketch follows the entry).
arXiv Detail & Related papers (2022-02-21T13:25:03Z)
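A loose illustration of activating only a sparse subset of neurons, using a simple top-k activation mask; this is an assumption-laden stand-in, not the paper's Bayesian sparse formulation or its full-experience-replay mechanism.

```python
import torch
from torch import nn


class TopKSparseLayer(nn.Module):
    """Toy layer that activates only a sparse subset of neurons: after the linear
    transform, keep the k largest activations per sample and zero the rest."""

    def __init__(self, in_features: int, out_features: int, k: int = 32):
        super().__init__()
        self.fc = nn.Linear(in_features, out_features)
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = torch.relu(self.fc(x))
        # Build a binary mask that keeps only the k most active neurons per sample.
        _, topk_idx = h.topk(self.k, dim=1)
        mask = torch.zeros_like(h).scatter_(1, topk_idx, 1.0)
        return h * mask


# Hypothetical usage: only 32 of the 256 hidden neurons stay active per sample.
layer = TopKSparseLayer(in_features=128, out_features=256, k=32)
out = layer(torch.randn(8, 128))
```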
- Center Loss Regularization for Continual Learning [0.0]
In general, neural networks lack the ability to learn different tasks sequentially.
Our approach remembers old tasks by projecting the representations of new tasks close to those of old tasks (a toy center-loss sketch follows the entry).
We demonstrate that our approach is scalable and effective, giving competitive performance compared to state-of-the-art continual learning methods.
arXiv Detail & Related papers (2021-10-21T17:46:44Z)
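A hedged sketch of a center-loss regularizer that pulls features toward per-class centers; keeping old-class centers fixed while training new tasks is one way to keep new representations close to old ones. The class name, loss weight, and freezing strategy are illustrative assumptions rather than the paper's exact formulation.

```python
import torch
from torch import nn


class CenterLossRegularizer(nn.Module):
    """Toy center-loss term: pulls each feature vector toward a learnable center
    for its class; freezing old-class centers during new tasks keeps new
    representations close to the old ones (illustrative only)."""

    def __init__(self, num_classes: int, feat_dim: int, weight: float = 0.1):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.weight = weight

    def forward(self, features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Mean squared distance between each feature and its class center.
        batch_centers = self.centers[labels]  # (B, feat_dim)
        return self.weight * ((features - batch_centers) ** 2).sum(dim=1).mean()


# Hypothetical usage during training on a new task:
# total_loss = cross_entropy_loss + center_reg(features, labels)
# Old-class centers can be kept fixed by excluding them from the optimizer
# or zeroing their gradients after backward().
```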
- Iterative Network Pruning with Uncertainty Regularization for Lifelong Sentiment Classification [25.13885692629219]
Lifelong learning is non-trivial for deep neural networks.
We propose a novel iterative network pruning with uncertainty regularization method for lifelong sentiment classification.
arXiv Detail & Related papers (2021-06-21T15:34:13Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
- Neuroevolutionary Transfer Learning of Deep Recurrent Neural Networks through Network-Aware Adaptation [57.46377517266827]
This work introduces network-aware adaptive structure transfer learning (N-ASTL).
N-ASTL utilizes statistical information related to the source network's topology and weight distribution to inform how new input and output neurons are to be integrated into the existing structure.
Results show improvements over the prior state of the art, including the ability to transfer on challenging real-world datasets where this was not previously possible.
arXiv Detail & Related papers (2020-06-04T06:07:30Z)
- Continual Learning Using Multi-view Task Conditional Neural Networks [6.27221711890162]
Conventional deep learning models have limited capacity for learning multiple tasks sequentially.
We propose Multi-view Task Conditional Neural Networks (Mv-TCNN), which do not require knowing the recurring tasks in advance.
arXiv Detail & Related papers (2020-05-08T01:03:30Z)
- iTAML: An Incremental Task-Agnostic Meta-learning Approach [123.10294801296926]
Humans can continuously learn new knowledge as their experience grows.
Previously learned knowledge in deep neural networks can quickly fade when they are trained on a new task.
We introduce a novel meta-learning approach that seeks to maintain an equilibrium between all encountered tasks.
arXiv Detail & Related papers (2020-03-25T21:42:48Z)
This list is automatically generated from the titles and abstracts of the papers listed on this site.