Continual Prune-and-Select: Class-incremental learning with specialized
subnetworks
- URL: http://arxiv.org/abs/2208.04952v1
- Date: Tue, 9 Aug 2022 10:49:40 GMT
- Title: Continual Prune-and-Select: Class-incremental learning with specialized
subnetworks
- Authors: Aleksandr Dekhovich, David M.J. Tax, Marcel H.F. Sluiter, Miguel A.
Bessa
- Abstract summary: Continual-Prune-and-Select (CP&S) sequentially learns 10 tasks from ImageNet-1000 while keeping accuracy around 94% with negligible forgetting.
This is a first-of-its-kind result in class-incremental learning.
- Score: 66.4795381419701
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The human brain is capable of learning tasks sequentially mostly without
forgetting. However, deep neural networks (DNNs) suffer from catastrophic
forgetting when learning one task after another. We address this challenge in a
class-incremental learning scenario, where the DNN sees test data without
knowing the task from which the data originates. During training,
Continual-Prune-and-Select (CP&S) finds a subnetwork within the DNN that is
responsible for solving a given task. Then, during inference, CP&S selects the
correct subnetwork to make predictions for that task. A new task is learned by
training the DNN's available (previously untrained) connections and pruning
them into a new subnetwork; this subnetwork can also include previously trained
connections belonging to other subnetwork(s), because shared connections are
never updated. This eliminates catastrophic forgetting by creating specialized
regions in the DNN that do not conflict with each other while still allowing
knowledge transfer across them. The CP&S strategy is implemented with different
subnetwork selection strategies, showing superior performance to
state-of-the-art continual learning methods on various datasets (CIFAR-100,
CUB-200-2011, ImageNet-100 and ImageNet-1000). In particular, CP&S sequentially
learns 10 tasks from ImageNet-1000 while keeping accuracy around 94% with
negligible forgetting, a first-of-its-kind result in class-incremental
learning. To the best of the authors' knowledge, this represents an accuracy
improvement of more than 20% over the best alternative method.
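
The abstract describes the mechanism only at a high level. The sketch below is a minimal, illustrative reconstruction of that idea, not the authors' implementation: it assumes PyTorch, a single masked linear layer standing in for a full DNN, one-shot magnitude pruning in place of the paper's pruning procedure, and a maximum-softmax-confidence rule as the subnetwork-selection step. All class and function names here are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaskedLinear(nn.Module):
    """A linear layer restricted to task-specific subnetworks (illustrative sketch)."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(0.01 * torch.randn(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))
        self.task_masks = {}  # task_id -> binary mask selecting that task's subnetwork
        # Connections already claimed by earlier subnetworks; these are never updated.
        self.claimed = torch.zeros(out_features, in_features, dtype=torch.bool)

    def forward(self, x, task_id=None):
        # During training (task_id=None) the forward pass may use all connections;
        # at inference it is restricted to the selected task's subnetwork.
        mask = torch.ones_like(self.weight) if task_id is None else self.task_masks[task_id]
        return F.linear(x, self.weight * mask, self.bias)

    def zero_shared_grads(self):
        # Shared (previously trained) connections receive no gradient updates.
        if self.weight.grad is not None:
            self.weight.grad[self.claimed] = 0.0

    def finish_task(self, task_id, keep_fraction=0.5):
        # Magnitude pruning carves out the new task's subnetwork; it may overlap
        # with earlier subnetworks, which is allowed because those stay frozen.
        with torch.no_grad():
            scores = self.weight.abs().flatten()
            k = max(1, int(keep_fraction * scores.numel()))
            threshold = scores.kthvalue(scores.numel() - k + 1).values
            mask = (self.weight.abs() >= threshold).float()
        self.task_masks[task_id] = mask
        self.claimed |= mask.bool()


def train_task(layer, task_id, batches, lr=0.1):
    opt = torch.optim.SGD([layer.weight, layer.bias], lr=lr)
    for x, y in batches:
        opt.zero_grad()
        loss = F.cross_entropy(layer(x), y)
        loss.backward()
        layer.zero_shared_grads()  # protect earlier subnetworks from updates
        opt.step()
    layer.finish_task(task_id)


def predict(layer, x):
    # Task-agnostic inference: try every subnetwork, keep the most confident one.
    best = max(layer.task_masks, key=lambda t: F.softmax(layer(x, t), dim=-1).max().item())
    return best, layer(x, best)
```

The paper itself compares several subnetwork-selection strategies; the maximum-confidence rule above is only one plausible stand-in, and in a real class-incremental setup each task would also own its own output head rather than sharing a single classifier.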
Related papers
- Negotiated Representations to Prevent Forgetting in Machine Learning
Applications [0.0]
Catastrophic forgetting is a significant challenge in the field of machine learning.
We propose a novel method for preventing catastrophic forgetting in machine learning applications.
arXiv Detail & Related papers (2023-11-30T22:43:50Z) - Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks [69.38572074372392]
We present the first results proving that feature learning occurs during training with a nonlinear model on multiple tasks.
Our key insight is that multi-task pretraining induces a pseudo-contrastive loss that favors representations that align points that typically have the same label across tasks.
arXiv Detail & Related papers (2023-07-13T16:39:08Z) - An Exact Mapping From ReLU Networks to Spiking Neural Networks [3.1701886344065255]
We propose an exact mapping from a network with Rectified Linear Units (ReLUs) to an SNN that fires exactly one spike per neuron.
More generally our work shows that an arbitrary deep ReLU network can be replaced by an energy-efficient single-spike neural network without any loss of performance.
arXiv Detail & Related papers (2022-12-23T18:31:09Z) - Training Spiking Neural Networks with Local Tandem Learning [96.32026780517097]
Spiking neural networks (SNNs) are shown to be more biologically plausible and energy efficient than their predecessors.
In this paper, we put forward a generalized learning rule, termed Local Tandem Learning (LTL).
We demonstrate rapid network convergence within five training epochs on the CIFAR-10 dataset while having low computational complexity.
arXiv Detail & Related papers (2022-10-10T10:05:00Z) - Making a Spiking Net Work: Robust brain-like unsupervised machine
learning [0.0]
Spiking Neural Networks (SNNs) are an alternative to Artificial Neural Networks (ANNs).
SNNs struggle with dynamical stability and cannot match the accuracy of ANNs.
We show how an SNN can overcome many of the shortcomings that have been identified in the literature.
arXiv Detail & Related papers (2022-08-02T02:10:00Z) - Increasing Depth of Neural Networks for Life-long Learning [2.0305676256390934]
We propose a novel method for continual learning based on the increasing depth of neural networks.
This work explores whether extending neural network depth may be beneficial in a life-long learning setting.
arXiv Detail & Related papers (2022-02-22T11:21:41Z) - Enabling Deep Spiking Neural Networks with Hybrid Conversion and Spike
Timing Dependent Backpropagation [10.972663738092063]
Spiking Neural Networks (SNNs) operate with asynchronous discrete events (or spikes).
We present a computationally-efficient training technique for deep SNNs.
We achieve a top-1 accuracy of 65.19% on the ImageNet dataset with an SNN using 250 time steps, which is 10X faster than converted SNNs with similar accuracy.
arXiv Detail & Related papers (2020-05-04T19:30:43Z) - Semantic Drift Compensation for Class-Incremental Learning [48.749630494026086]
Class-incremental learning of deep networks sequentially increases the number of classes to be classified.
We propose a new method to estimate the drift, called semantic drift, of features and compensate for it without the need of any exemplars.
arXiv Detail & Related papers (2020-04-01T13:31:19Z) - iTAML: An Incremental Task-Agnostic Meta-learning Approach [123.10294801296926]
Humans can continuously learn new knowledge as their experience grows.
Previous learning in deep neural networks can quickly fade out when they are trained on a new task.
We introduce a novel meta-learning approach that seeks to maintain an equilibrium between all encountered tasks.
arXiv Detail & Related papers (2020-03-25T21:42:48Z) - Side-Tuning: A Baseline for Network Adaptation via Additive Side
Networks [95.51368472949308]
Adaptation can be useful in cases when training data is scarce, or when one wishes to encode priors in the network.
In this paper, we propose a straightforward alternative: side-tuning.
arXiv Detail & Related papers (2019-12-31T18:52:32Z)