Robust Continual Learning through a Comprehensively Progressive Bayesian
Neural Network
- URL: http://arxiv.org/abs/2202.13369v1
- Date: Sun, 27 Feb 2022 14:19:50 GMT
- Title: Robust Continual Learning through a Comprehensively Progressive Bayesian
Neural Network
- Authors: Guo Yang, Cheryl Sze Yin Wong and Ramasamy Savitha
- Abstract summary: This work proposes a comprehensively progressive Bayesian neural network for robust continual learning of a sequence of tasks.
It starts with the contention that similar tasks should have the same total number of network resources, to ensure fair representation of all tasks.
The weights that are redundant at the end of training each task are also pruned through re-initialization, in order to be efficiently utilized in the subsequent task.
- Score: 1.4695979686066065
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work proposes a comprehensively progressive Bayesian neural network for
robust continual learning of a sequence of tasks. A Bayesian neural network is
progressively pruned and grown such that there are sufficient network resources
to represent a sequence of tasks, while the network does not explode. It
starts with the contention that similar tasks should have the same total
number of network resources, to ensure fair representation of all tasks in a
continual learning scenario. Thus, as the data for a new task streams in,
sufficient neurons are added to the network such that the total number of
neurons in each layer of the network, including the shared representations
with previous tasks and the individual task-related representations, is equal
for all tasks. The
weights that are redundant at the end of training each task are also pruned
through re-initialization, in order to be efficiently utilized in the
subsequent task. Thus, the network grows progressively, but ensures effective
utilization of network resources. We refer to our proposed method as 'Robust
Continual Learning through a Comprehensively Progressive Bayesian Neural
Network (RCL-CPB)' and evaluate the proposed approach on the MNIST data set,
under three different continual learning scenarios. Further to this, we
evaluate the performance of RCL-CPB on a homogeneous sequence of tasks using
split CIFAR100 (20 tasks of 5 classes each), and a heterogeneous sequence of
tasks using MNIST, SVHN and CIFAR10 data sets. The demonstrations and the
performance results show that the proposed strategies for progressive BNN
enable robust continual learning.
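The abstract describes two mechanisms: growing each layer so that every task sees the same total number of neurons (shared plus task-specific), and re-initialising redundant weights after each task so they can be reused. The sketch below illustrates both mechanisms in PyTorch; it is not the authors' implementation. The Bayesian treatment of the weights, the exact growth rule, and the pruning criterion (a plain magnitude threshold standing in for a Bayesian signal-to-noise test) are placeholder assumptions.
```python
# A minimal sketch, assuming a standard (non-Bayesian) linear layer, a fixed
# growth increment, and a weight-magnitude pruning criterion; none of these
# are taken from the paper.
import torch
import torch.nn as nn


class GrowableLinear(nn.Module):
    """A linear layer whose output width can grow as new tasks arrive."""

    def __init__(self, in_features: int, initial_out: int):
        super().__init__()
        self.linear = nn.Linear(in_features, initial_out)

    @property
    def out_features(self) -> int:
        return self.linear.out_features

    def grow_to(self, target_out: int) -> None:
        """Add output neurons, keeping the weights learned so far."""
        if target_out <= self.out_features:
            return
        old = self.linear
        new = nn.Linear(old.in_features, target_out)
        with torch.no_grad():
            new.weight[: old.out_features] = old.weight   # keep shared representations
            new.bias[: old.out_features] = old.bias
        self.linear = new

    def reinitialise_redundant(self, threshold: float = 1e-2) -> None:
        """Re-initialise near-zero weights so later tasks can reuse them."""
        with torch.no_grad():
            redundant = self.linear.weight.abs() < threshold
            fresh = torch.empty_like(self.linear.weight)
            nn.init.kaiming_uniform_(fresh)
            self.linear.weight[redundant] = fresh[redundant]

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x)


# Usage: grow before each new task, recycle redundant weights after it.
layer = GrowableLinear(in_features=784, initial_out=100)
for task_id in range(3):
    if task_id > 0:
        # Placeholder growth rule; the paper sizes this step so that the
        # total of shared and task-specific neurons is equal for every task.
        layer.grow_to(layer.out_features + 20)
    # ... train on task `task_id` here ...
    layer.reinitialise_redundant()
```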
Related papers
- Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks [69.38572074372392]
We present the first results proving that feature learning occurs during training with a nonlinear model on multiple tasks.
Our key insight is that multi-task pretraining induces a pseudo-contrastive loss that favors representations that align points that typically have the same label across tasks.
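The pseudo-contrastive loss here is a theoretical object induced by multi-task pretraining rather than a prescribed training objective. Purely to illustrate the stated intuition of aligning points that share a label across tasks, the snippet below shows a generic supervised-contrastive-style loss over a batch pooled from several tasks; it is a sketch of the intuition, not the construction analysed in the paper.
```python
# A sketch only: a generic supervised-contrastive-style loss over a batch
# pooled from several tasks, illustrating "align points with the same label
# across tasks"; it is not the loss analysed in the paper.
import torch
import torch.nn.functional as F


def cross_task_alignment_loss(z: torch.Tensor, labels: torch.Tensor,
                              temperature: float = 0.1) -> torch.Tensor:
    """z: (N, d) representations pooled across tasks; labels: (N,) class ids."""
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / temperature                      # pairwise similarities
    eye = torch.eye(len(z), dtype=torch.bool, device=z.device)
    positives = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye
    # log-probability of each other point, excluding self-similarity
    log_prob = sim - torch.logsumexp(sim.masked_fill(eye, float("-inf")),
                                     dim=1, keepdim=True)
    pos_count = positives.sum(dim=1).clamp(min=1)
    return -(log_prob * positives).sum(dim=1).div(pos_count).mean()


# Example: eight points drawn from two tasks that share a label space.
z = torch.randn(8, 32)
labels = torch.tensor([0, 1, 0, 1, 2, 2, 0, 1])
loss = cross_task_alignment_loss(z, labels)
```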
arXiv Detail & Related papers (2023-07-13T16:39:08Z)
- Diffused Redundancy in Pre-trained Representations [98.55546694886819]
We take a closer look at how features are encoded in pre-trained representations.
We find that learned representations in a given layer exhibit a degree of diffuse redundancy.
Our findings shed light on the nature of representations learned by pre-trained deep neural networks.
arXiv Detail & Related papers (2023-05-31T21:00:50Z)
- Forget-free Continual Learning with Soft-Winning SubNetworks [67.0373924836107]
We investigate two proposed continual learning methods which sequentially learn and select adaptive binary subnetworks (WSN) and non-binary Soft-Subnetworks (SoftNet) for each task.
WSN and SoftNet jointly learn the regularized model weights and the task-adaptive non-binary masks of the subnetworks associated with each task.
In Task Incremental Learning (TIL), the binary masks spawned per winning ticket are encoded into one N-bit binary digit mask, then compressed using Huffman coding for a sub-linear increase in network capacity with respect to the number of tasks.
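A minimal sketch of the mask-packing idea mentioned above, under the assumption that each task's binary weight mask is stored as one bit of a single integer mask per weight; the Huffman-coding step and the non-binary soft masks are omitted, and this is not the WSN/SoftNet implementation.
```python
# A sketch, assuming one binary weight mask per task packed into the bits of
# a single integer mask; the Huffman-coding compression and the non-binary
# soft masks are omitted.
import torch


class PackedTaskMasks:
    """Stores per-task binary weight masks as bits of one integer tensor."""

    def __init__(self, weight_shape: torch.Size):
        # bit t of each entry is 1 iff the weight belongs to task t's subnetwork
        self.packed = torch.zeros(weight_shape, dtype=torch.int64)

    def set_mask(self, task_id: int, mask: torch.Tensor) -> None:
        self.packed |= mask.to(torch.int64) << task_id

    def get_mask(self, task_id: int) -> torch.Tensor:
        return (self.packed >> task_id) & 1


# Usage: select the subnetwork for task 1 when making predictions for it.
weight = torch.randn(4, 4)
masks = PackedTaskMasks(weight.shape)
masks.set_mask(0, torch.randint(0, 2, weight.shape))
masks.set_mask(1, torch.randint(0, 2, weight.shape))
masked_weight = weight * masks.get_mask(1).float()
```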
arXiv Detail & Related papers (2023-03-27T07:53:23Z)
- Continual Learning with Dependency Preserving Hypernetworks [14.102057320661427]
An effective approach to address continual learning (CL) problems is to use hypernetworks which generate task dependent weights for a target network.
We propose a novel approach that uses a dependency preserving hypernetwork to generate weights for the target network while also maintaining the parameter efficiency.
In addition, we propose novel regularisation and network growth techniques for the RNN based hypernetwork to further improve the continual learning performance.
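As a rough illustration of a hypernetwork that generates task-dependent weights for a target layer, the sketch below conditions a small generator network on a learned task embedding. The layer sizes are arbitrary, and the dependency-preserving and RNN-based components of the paper's method are not reproduced.
```python
# A sketch, with arbitrary layer sizes and a learned task embedding as input;
# the class and parameter names are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LinearHypernetwork(nn.Module):
    """Generates the weights and bias of a target linear layer per task."""

    def __init__(self, num_tasks: int, emb_dim: int, in_f: int, out_f: int):
        super().__init__()
        self.in_f, self.out_f = in_f, out_f
        self.task_emb = nn.Embedding(num_tasks, emb_dim)
        self.generator = nn.Sequential(
            nn.Linear(emb_dim, 128), nn.ReLU(),
            nn.Linear(128, in_f * out_f + out_f),   # flattened weights + bias
        )

    def forward(self, task_id: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        params = self.generator(self.task_emb(task_id))
        w = params[: self.in_f * self.out_f].view(self.out_f, self.in_f)
        b = params[self.in_f * self.out_f:]
        return F.linear(x, w, b)   # target layer with generated parameters


# Usage: the same hypernetwork produces different target weights per task.
hyper = LinearHypernetwork(num_tasks=5, emb_dim=16, in_f=32, out_f=10)
x = torch.randn(4, 32)
y = hyper(torch.tensor(2), x)     # forward pass conditioned on task 2
```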
arXiv Detail & Related papers (2022-09-16T04:42:21Z)
- The Multiple Subnetwork Hypothesis: Enabling Multidomain Learning by Isolating Task-Specific Subnetworks in Feedforward Neural Networks [0.0]
We identify a methodology and network representational structure which allows a pruned network to employ previously unused weights to learn subsequent tasks.
We show that networks trained using our approaches are able to learn multiple tasks, which may be related or unrelated, in parallel or in sequence without sacrificing performance on any task or exhibiting catastrophic forgetting.
arXiv Detail & Related papers (2022-07-18T15:07:13Z)
- PaRT: Parallel Learning Towards Robust and Transparent AI [4.160969852186451]
This paper takes a parallel learning approach for robust and transparent AI.
A deep neural network is trained in parallel on multiple tasks, where each task is trained only on a subset of the network resources.
We show that the network does indeed use learned knowledge from some tasks in other tasks, through shared representations.
arXiv Detail & Related papers (2022-01-24T09:03:28Z)
- Training Networks in Null Space of Feature Covariance for Continual Learning [34.095874368589904]
We propose a novel network training algorithm called Adam-NSCL, which sequentially optimizes network parameters in the null space of the feature covariance of previous tasks.
We apply our approach to training networks for continual learning on benchmark datasets of CIFAR-100 and TinyImageNet.
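The core null-space idea can be sketched as follows: accumulate the (uncentred) covariance of a layer's inputs on previous tasks, treat the directions with near-zero eigenvalues as an approximate null space, and project candidate parameter updates onto it so that responses to previous-task inputs are approximately preserved. The eigenvalue threshold and the interaction with the Adam optimiser are simplified assumptions, not the Adam-NSCL implementation.
```python
# A sketch: project updates onto the approximate null space of the feature
# covariance accumulated on previous tasks; the threshold and the way cached
# features are collected are simplifying assumptions.
import torch


def null_space_projector(features: torch.Tensor, eps: float = 1e-3) -> torch.Tensor:
    """features: (num_samples, d) layer inputs recorded on previous tasks."""
    cov = features.t() @ features / features.shape[0]    # (d, d) uncentred covariance
    eigvals, eigvecs = torch.linalg.eigh(cov)            # eigenvalues in ascending order
    null_basis = eigvecs[:, eigvals < eps]               # directions the old features never use
    return null_basis @ null_basis.t()                   # projector P of shape (d, d)


# Usage: a projected update (dW @ P) leaves the layer's responses to
# previous-task inputs approximately unchanged.
features = torch.randn(256, 16) @ torch.randn(16, 64)    # rank-deficient cached inputs
P = null_space_projector(features)
raw_update = torch.randn(32, 64)                         # candidate update for a (32, 64) weight
projected_update = raw_update @ P
```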
arXiv Detail & Related papers (2021-03-12T07:21:48Z)
- Graph-Based Neural Network Models with Multiple Self-Supervised Auxiliary Tasks [79.28094304325116]
Graph Convolutional Networks are among the most promising approaches for capturing relationships among structured data points.
We propose three novel self-supervised auxiliary tasks to train graph-based neural network models in a multi-task fashion.
arXiv Detail & Related papers (2020-11-14T11:09:51Z)
- Continual Learning in Recurrent Neural Networks [67.05499844830231]
We evaluate the effectiveness of continual learning methods for processing sequential data with recurrent neural networks (RNNs).
We shed light on the particularities that arise when applying weight-importance methods, such as elastic weight consolidation, to RNNs.
We show that the performance of weight-importance methods is not directly affected by the length of the processed sequences, but rather by high working memory requirements.
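For reference, weight-importance methods such as elastic weight consolidation (EWC) add a penalty that anchors parameters deemed important for earlier tasks to their previous values. The sketch below shows the standard penalty term in a generic form; it is not the paper's RNN-specific setup.
```python
# A generic sketch of the standard EWC penalty; the Fisher estimate here is a
# dummy, and nothing below is specific to the paper's RNN experiments.
import torch


def ewc_penalty(params, old_params, fisher, lam: float = 1.0) -> torch.Tensor:
    """Computes (lam / 2) * sum_i F_i * (theta_i - theta_i_star)^2."""
    penalty = torch.zeros(())
    for p, p_old, f in zip(params, old_params, fisher):
        penalty = penalty + (f * (p - p_old) ** 2).sum()
    return 0.5 * lam * penalty


# Usage with toy tensors; in practice the Fisher information is estimated
# from gradients of the loss on the previous task's data.
params = [torch.randn(3, 3, requires_grad=True)]
old_params = [p.detach().clone() for p in params]
fisher = [torch.ones_like(p) for p in params]
loss = ewc_penalty(params, old_params, fisher, lam=10.0)
```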
arXiv Detail & Related papers (2020-06-22T10:05:12Z)
- Semantic Drift Compensation for Class-Incremental Learning [48.749630494026086]
Class-incremental learning of deep networks sequentially increases the number of classes to be classified.
We propose a new method to estimate the drift, called semantic drift, of features and compensate for it without the need for any exemplars.
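A simplified sketch of the drift-compensation idea: the drift of an old class's prototype is estimated from the feature drift observed on current-task data (extracted before and after training the new task) as a distance-weighted average, without storing any exemplars of the old class. The Gaussian weighting and its bandwidth below are assumptions standing in for the paper's exact formulation.
```python
# A sketch with an assumed Gaussian weighting; not the paper's exact
# formulation of semantic drift compensation.
import torch


def compensate_prototype(prototype: torch.Tensor,
                         feats_before: torch.Tensor,
                         feats_after: torch.Tensor,
                         sigma: float = 1.0) -> torch.Tensor:
    """Shift an old-class prototype by a distance-weighted average of the
    drift vectors observed on current-task features."""
    drift = feats_after - feats_before                    # (N, d) per-sample drift
    dist2 = ((feats_before - prototype) ** 2).sum(dim=1)  # squared distance to the prototype
    w = torch.exp(-dist2 / (2 * sigma ** 2))              # closer samples weigh more
    return prototype + (w.unsqueeze(1) * drift).sum(0) / w.sum().clamp(min=1e-8)


# Usage: current-task features extracted before and after training the new
# task are used to update the stored prototype of an earlier class.
feats_before = torch.randn(100, 64)
feats_after = feats_before + 0.1 * torch.randn(100, 64)
old_prototype = torch.randn(64)
updated_prototype = compensate_prototype(old_prototype, feats_before, feats_after, sigma=5.0)
```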
arXiv Detail & Related papers (2020-04-01T13:31:19Z)