Robust Continual Learning through a Comprehensively Progressive Bayesian
Neural Network
- URL: http://arxiv.org/abs/2202.13369v1
- Date: Sun, 27 Feb 2022 14:19:50 GMT
- Title: Robust Continual Learning through a Comprehensively Progressive Bayesian
Neural Network
- Authors: Guo Yang, Cheryl Sze Yin Wong and Ramasamy Savitha
- Abstract summary: This work proposes a comprehensively progressive Bayesian neural network for robust continual learning of a sequence of tasks.
It starts with the contention that similar tasks should have the same total number of network resources, to ensure fair representation of all tasks.
The weights that are redundant at the end of training each task are also pruned through re-initialization, in order to be efficiently utilized in the subsequent task.
- Score: 1.4695979686066065
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work proposes a comprehensively progressive Bayesian neural network for
robust continual learning of a sequence of tasks. A Bayesian neural network is
progressively pruned and grown such that there are sufficient network resources
to represent a sequence of tasks, while the network does not explode. It
starts with the contention that similar tasks should have the same total
number of network resources, to ensure fair representation of all tasks in a
continual learning scenario. Thus, as the data for a new task streams in,
sufficient neurons are added to the network such that the total number of
neurons in each layer of the network, including the shared representations
with previous tasks and the individual task-related representations, is equal
for all tasks. The
weights that are redundant at the end of training each task are also pruned
through re-initialization, in order to be efficiently utilized in the
subsequent task. Thus, the network grows progressively, but ensures effective
utilization of network resources. We refer to our proposed method as 'Robust
Continual Learning through a Comprehensively Progressive Bayesian Neural
Network (RCL-CPB)' and evaluate the proposed approach on the MNIST data set,
under three different continual learning scenarios. Further to this, we
evaluate the performance of RCL-CPB on a homogeneous sequence of tasks using
split CIFAR100 (20 tasks of 5 classes each), and a heterogeneous sequence of
tasks using MNIST, SVHN and CIFAR10 data sets. The demonstrations and the
performance results show that the proposed strategies for progressive BNN
enable robust continual learning.
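The abstract describes two mechanisms: growing each layer so that every task sees the same total number of neurons (shared plus task-specific), and re-initialising redundant weights after each task so they can be reused. The sketch below illustrates both mechanisms in PyTorch; it is not the authors' implementation. The Bayesian treatment of the weights, the exact growth rule, and the pruning criterion (a plain magnitude threshold standing in for a Bayesian signal-to-noise test) are placeholder assumptions.
```python
# A minimal sketch, assuming a standard (non-Bayesian) linear layer, a fixed
# growth increment, and a weight-magnitude pruning criterion; none of these
# are taken from the paper.
import torch
import torch.nn as nn


class GrowableLinear(nn.Module):
    """A linear layer whose output width can grow as new tasks arrive."""

    def __init__(self, in_features: int, initial_out: int):
        super().__init__()
        self.linear = nn.Linear(in_features, initial_out)

    @property
    def out_features(self) -> int:
        return self.linear.out_features

    def grow_to(self, target_out: int) -> None:
        """Add output neurons, keeping the weights learned so far."""
        if target_out <= self.out_features:
            return
        old = self.linear
        new = nn.Linear(old.in_features, target_out)
        with torch.no_grad():
            new.weight[: old.out_features] = old.weight   # keep shared representations
            new.bias[: old.out_features] = old.bias
        self.linear = new

    def reinitialise_redundant(self, threshold: float = 1e-2) -> None:
        """Re-initialise near-zero weights so later tasks can reuse them."""
        with torch.no_grad():
            redundant = self.linear.weight.abs() < threshold
            fresh = torch.empty_like(self.linear.weight)
            nn.init.kaiming_uniform_(fresh)
            self.linear.weight[redundant] = fresh[redundant]

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x)


# Usage: grow before each new task, recycle redundant weights after it.
layer = GrowableLinear(in_features=784, initial_out=100)
for task_id in range(3):
    if task_id > 0:
        # Placeholder growth rule; the paper sizes this step so that the
        # total of shared and task-specific neurons is equal for every task.
        layer.grow_to(layer.out_features + 20)
    # ... train on task `task_id` here ...
    layer.reinitialise_redundant()
```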
Related papers
- Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks [69.38572074372392]
We present the first results proving that feature learning occurs during training with a nonlinear model on multiple tasks.
Our key insight is that multi-task pretraining induces a pseudo-contrastive loss that favors representations that align points that typically have the same label across tasks.
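The pseudo-contrastive loss here is a theoretical object induced by multi-task pretraining rather than a prescribed training objective. Purely to illustrate the stated intuition of aligning points that share a label across tasks, the snippet below shows a generic supervised-contrastive-style loss over a batch pooled from several tasks; it is a sketch of the intuition, not the construction analysed in the paper.
```python
# A sketch only: a generic supervised-contrastive-style loss over a batch
# pooled from several tasks, illustrating "align points with the same label
# across tasks"; it is not the loss analysed in the paper.
import torch
import torch.nn.functional as F


def cross_task_alignment_loss(z: torch.Tensor, labels: torch.Tensor,
                              temperature: float = 0.1) -> torch.Tensor:
    """z: (N, d) representations pooled across tasks; labels: (N,) class ids."""
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / temperature                      # pairwise similarities
    eye = torch.eye(len(z), dtype=torch.bool, device=z.device)
    positives = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye
    # log-probability of each other point, excluding self-similarity
    log_prob = sim - torch.logsumexp(sim.masked_fill(eye, float("-inf")),
                                     dim=1, keepdim=True)
    pos_count = positives.sum(dim=1).clamp(min=1)
    return -(log_prob * positives).sum(dim=1).div(pos_count).mean()


# Example: eight points drawn from two tasks that share a label space.
z = torch.randn(8, 32)
labels = torch.tensor([0, 1, 0, 1, 2, 2, 0, 1])
loss = cross_task_alignment_loss(z, labels)
```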
arXiv Detail & Related papers (2023-07-13T16:39:08Z)
- Diffused Redundancy in Pre-trained Representations [98.55546694886819]
We take a closer look at how features are encoded in pre-trained representations.
We find that learned representations in a given layer exhibit a degree of diffuse redundancy.
Our findings shed light on the nature of representations learned by pre-trained deep neural networks.
arXiv Detail & Related papers (2023-05-31T21:00:50Z)
- Forget-free Continual Learning with Soft-Winning SubNetworks [67.0373924836107]
We investigate two proposed continual learning methods which sequentially learn and select adaptive binary subnetworks (WSN) and non-binary Soft-Subnetworks (SoftNet) for each task.
WSN and SoftNet jointly learn the regularized model weights and the task-adaptive non-binary masks of the subnetworks associated with each task.
In Task Incremental Learning (TIL), the binary masks spawned per winning ticket are encoded into one N-bit binary digit mask, then compressed using Huffman coding for a sub-linear increase in network capacity with respect to the number of tasks.
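A minimal sketch of the mask-packing idea mentioned above, under the assumption that each task's binary weight mask is stored as one bit of a single integer mask per weight; the Huffman-coding step and the non-binary soft masks are omitted, and this is not the WSN/SoftNet implementation.
```python
# A sketch, assuming one binary weight mask per task packed into the bits of
# a single integer mask; the Huffman-coding compression and the non-binary
# soft masks are omitted.
import torch


class PackedTaskMasks:
    """Stores per-task binary weight masks as bits of one integer tensor."""

    def __init__(self, weight_shape: torch.Size):
        # bit t of each entry is 1 iff the weight belongs to task t's subnetwork
        self.packed = torch.zeros(weight_shape, dtype=torch.int64)

    def set_mask(self, task_id: int, mask: torch.Tensor) -> None:
        self.packed |= mask.to(torch.int64) << task_id

    def get_mask(self, task_id: int) -> torch.Tensor:
        return (self.packed >> task_id) & 1


# Usage: select the subnetwork for task 1 when making predictions for it.
weight = torch.randn(4, 4)
masks = PackedTaskMasks(weight.shape)
masks.set_mask(0, torch.randint(0, 2, weight.shape))
masks.set_mask(1, torch.randint(0, 2, weight.shape))
masked_weight = weight * masks.get_mask(1).float()
```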
arXiv Detail & Related papers (2023-03-27T07:53:23Z)
- Continual Learning with Dependency Preserving Hypernetworks [14.102057320661427]
An effective approach to address continual learning (CL) problems is to use hypernetworks which generate task dependent weights for a target network.
We propose a novel approach that uses a dependency preserving hypernetwork to generate weights for the target network while also maintaining the parameter efficiency.
In addition, we propose novel regularisation and network growth techniques for the RNN based hypernetwork to further improve the continual learning performance.
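As a rough illustration of a hypernetwork that generates task-dependent weights for a target layer, the sketch below conditions a small generator network on a learned task embedding. The layer sizes are arbitrary, and the dependency-preserving and RNN-based components of the paper's method are not reproduced.
```python
# A sketch, with arbitrary layer sizes and a learned task embedding as input;
# the class and parameter names are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LinearHypernetwork(nn.Module):
    """Generates the weights and bias of a target linear layer per task."""

    def __init__(self, num_tasks: int, emb_dim: int, in_f: int, out_f: int):
        super().__init__()
        self.in_f, self.out_f = in_f, out_f
        self.task_emb = nn.Embedding(num_tasks, emb_dim)
        self.generator = nn.Sequential(
            nn.Linear(emb_dim, 128), nn.ReLU(),
            nn.Linear(128, in_f * out_f + out_f),   # flattened weights + bias
        )

    def forward(self, task_id: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        params = self.generator(self.task_emb(task_id))
        w = params[: self.in_f * self.out_f].view(self.out_f, self.in_f)
        b = params[self.in_f * self.out_f:]
        return F.linear(x, w, b)   # target layer with generated parameters


# Usage: the same hypernetwork produces different target weights per task.
hyper = LinearHypernetwork(num_tasks=5, emb_dim=16, in_f=32, out_f=10)
x = torch.randn(4, 32)
y = hyper(torch.tensor(2), x)     # forward pass conditioned on task 2
```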
arXiv Detail & Related papers (2022-09-16T04:42:21Z)
- The Multiple Subnetwork Hypothesis: Enabling Multidomain Learning by Isolating Task-Specific Subnetworks in Feedforward Neural Networks [0.0]
We identify a methodology and network representational structure which allows a pruned network to employ previously unused weights to learn subsequent tasks.
We show that networks trained using our approaches are able to learn multiple tasks, which may be related or unrelated, in parallel or in sequence without sacrificing performance on any task or exhibiting catastrophic forgetting.
arXiv Detail & Related papers (2022-07-18T15:07:13Z)
- PaRT: Parallel Learning Towards Robust and Transparent AI [4.160969852186451]
This paper takes a parallel learning approach for robust and transparent AI.
A deep neural network is trained in parallel on multiple tasks, where each task is trained only on a subset of the network resources.
We show that the network does indeed use learned knowledge from some tasks in other tasks, through shared representations.
arXiv Detail & Related papers (2022-01-24T09:03:28Z)
- Training Networks in Null Space of Feature Covariance for Continual Learning [34.095874368589904]
We propose a novel network training algorithm called Adam-NSCL, which sequentially optimizes network parameters in the null space of the feature covariance of previous tasks.
We apply our approach to training networks for continual learning on benchmark datasets of CIFAR-100 and TinyImageNet.
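The core null-space idea can be sketched as follows: accumulate the (uncentred) covariance of a layer's inputs on previous tasks, treat the directions with near-zero eigenvalues as an approximate null space, and project candidate parameter updates onto it so that responses to previous-task inputs are approximately preserved. The eigenvalue threshold and the interaction with the Adam optimiser are simplified assumptions, not the Adam-NSCL implementation.
```python
# A sketch: project updates onto the approximate null space of the feature
# covariance accumulated on previous tasks; the threshold and the way cached
# features are collected are simplifying assumptions.
import torch


def null_space_projector(features: torch.Tensor, eps: float = 1e-3) -> torch.Tensor:
    """features: (num_samples, d) layer inputs recorded on previous tasks."""
    cov = features.t() @ features / features.shape[0]    # (d, d) uncentred covariance
    eigvals, eigvecs = torch.linalg.eigh(cov)            # eigenvalues in ascending order
    null_basis = eigvecs[:, eigvals < eps]               # directions the old features never use
    return null_basis @ null_basis.t()                   # projector P of shape (d, d)


# Usage: a projected update (dW @ P) leaves the layer's responses to
# previous-task inputs approximately unchanged.
features = torch.randn(256, 16) @ torch.randn(16, 64)    # rank-deficient cached inputs
P = null_space_projector(features)
raw_update = torch.randn(32, 64)                         # candidate update for a (32, 64) weight
projected_update = raw_update @ P
```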
arXiv Detail & Related papers (2021-03-12T07:21:48Z)
- Graph-Based Neural Network Models with Multiple Self-Supervised Auxiliary Tasks [79.28094304325116]
Graph Convolutional Networks are among the most promising approaches for capturing relationships among structured data points.
We propose three novel self-supervised auxiliary tasks to train graph-based neural network models in a multi-task fashion.
arXiv Detail & Related papers (2020-11-14T11:09:51Z)
- Continual Learning in Recurrent Neural Networks [67.05499844830231]
We evaluate the effectiveness of continual learning methods for processing sequential data with recurrent neural networks (RNNs).
We shed light on the particularities that arise when applying weight-importance methods, such as elastic weight consolidation, to RNNs.
We show that the performance of weight-importance methods is not directly affected by the length of the processed sequences, but rather by high working memory requirements.
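For reference, weight-importance methods such as elastic weight consolidation (EWC) add a penalty that anchors parameters deemed important for earlier tasks to their previous values. The sketch below shows the standard penalty term in a generic form; it is not the paper's RNN-specific setup.
```python
# A generic sketch of the standard EWC penalty; the Fisher estimate here is a
# dummy, and nothing below is specific to the paper's RNN experiments.
import torch


def ewc_penalty(params, old_params, fisher, lam: float = 1.0) -> torch.Tensor:
    """Computes (lam / 2) * sum_i F_i * (theta_i - theta_i_star)^2."""
    penalty = torch.zeros(())
    for p, p_old, f in zip(params, old_params, fisher):
        penalty = penalty + (f * (p - p_old) ** 2).sum()
    return 0.5 * lam * penalty


# Usage with toy tensors; in practice the Fisher information is estimated
# from gradients of the loss on the previous task's data.
params = [torch.randn(3, 3, requires_grad=True)]
old_params = [p.detach().clone() for p in params]
fisher = [torch.ones_like(p) for p in params]
loss = ewc_penalty(params, old_params, fisher, lam=10.0)
```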
arXiv Detail & Related papers (2020-06-22T10:05:12Z)
- Semantic Drift Compensation for Class-Incremental Learning [48.749630494026086]
Class-incremental learning of deep networks sequentially increases the number of classes to be classified.
We propose a new method to estimate the drift, called semantic drift, of features and compensate for it without the need for any exemplars.
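A simplified sketch of the drift-compensation idea: the drift of an old class's prototype is estimated from the feature drift observed on current-task data (extracted before and after training the new task) as a distance-weighted average, without storing any exemplars of the old class. The Gaussian weighting and its bandwidth below are assumptions standing in for the paper's exact formulation.
```python
# A sketch with an assumed Gaussian weighting; not the paper's exact
# formulation of semantic drift compensation.
import torch


def compensate_prototype(prototype: torch.Tensor,
                         feats_before: torch.Tensor,
                         feats_after: torch.Tensor,
                         sigma: float = 1.0) -> torch.Tensor:
    """Shift an old-class prototype by a distance-weighted average of the
    drift vectors observed on current-task features."""
    drift = feats_after - feats_before                    # (N, d) per-sample drift
    dist2 = ((feats_before - prototype) ** 2).sum(dim=1)  # squared distance to the prototype
    w = torch.exp(-dist2 / (2 * sigma ** 2))              # closer samples weigh more
    return prototype + (w.unsqueeze(1) * drift).sum(0) / w.sum().clamp(min=1e-8)


# Usage: current-task features extracted before and after training the new
# task are used to update the stored prototype of an earlier class.
feats_before = torch.randn(100, 64)
feats_after = feats_before + 0.1 * torch.randn(100, 64)
old_prototype = torch.randn(64)
updated_prototype = compensate_prototype(old_prototype, feats_before, feats_after, sigma=5.0)
```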
arXiv Detail & Related papers (2020-04-01T13:31:19Z)