Continual Learning with Dependency Preserving Hypernetworks
- URL: http://arxiv.org/abs/2209.07712v1
- Date: Fri, 16 Sep 2022 04:42:21 GMT
- Title: Continual Learning with Dependency Preserving Hypernetworks
- Authors: Dupati Srikar Chandra, Sakshi Varshney, P.K. Srijith, Sunil Gupta
- Abstract summary: An effective approach to address continual learning (CL) problems is to use hypernetworks, which generate task-dependent weights for a target network.
We propose a novel approach that uses a dependency-preserving hypernetwork to generate weights for the target network while maintaining parameter efficiency.
In addition, we propose novel regularisation and network growth techniques for the RNN-based hypernetwork to further improve continual learning performance.
- Score: 14.102057320661427
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Humans learn continually throughout their lifespan by accumulating diverse
knowledge and fine-tuning it for future tasks. When presented with a similar
goal, neural networks suffer from catastrophic forgetting if data distributions
across sequential tasks are not stationary over the course of learning. An
effective approach to address such continual learning (CL) problems is to use
hypernetworks which generate task dependent weights for a target network.
However, the continual learning performance of existing hypernetwork-based
approaches is limited by the assumption that weights are independent across
layers, made in order to maintain parameter efficiency. To address this
limitation, we propose a novel approach that uses a dependency-preserving
hypernetwork to generate weights for the target network while maintaining
parameter efficiency. Specifically, we use a recurrent neural network (RNN)
based hypernetwork that can generate layer weights efficiently while allowing
for dependencies across them. In addition, we propose novel regularisation and
network growth techniques for the RNN-based hypernetwork to further improve the
continual learning performance. To demonstrate the effectiveness of the
proposed methods, we conducted experiments on several image classification
continual learning tasks and settings. We found that the proposed methods based
on the RNN hypernetworks outperformed the baselines in all these CL settings
and tasks.
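To make the core idea concrete, the sketch below shows one way an RNN-based hypernetwork can generate the target network's weights chunk by chunk from a shared hidden state, so that weights for different layers are no longer generated independently. It is a minimal PyTorch illustration under stated assumptions, not the authors' implementation: the `RNNHypernetwork` class, the GRU cell, the chunking scheme, all sizes, and the `hypernet_cl_regulariser` helper (a standard output regulariser in the spirit of von Oswald et al., 2020, rather than the paper's novel regularisation or growth techniques) are assumptions introduced here for illustration.

```python
import math
import torch
import torch.nn as nn


class RNNHypernetwork(nn.Module):
    """Minimal sketch of an RNN-based, dependency-preserving hypernetwork.

    A GRU unrolls over fixed-size "chunks" of the target network's parameter
    vector. Every chunk is decoded from the same evolving hidden state, so the
    generated weights are dependent across chunks (and therefore across layers),
    unlike chunked hypernetworks that emit each chunk independently.
    All names and sizes here are illustrative assumptions.
    """

    def __init__(self, target_shapes, num_tasks, task_dim=32,
                 hidden_dim=128, chunk_size=512):
        super().__init__()
        self.target_shapes = [tuple(s) for s in target_shapes]
        self.chunk_size = chunk_size
        self.total_params = sum(int(math.prod(s)) for s in self.target_shapes)
        self.num_chunks = math.ceil(self.total_params / chunk_size)

        # One trainable embedding per task conditions the whole generation.
        self.task_emb = nn.Embedding(num_tasks, task_dim)
        # Per-step inputs: learned chunk embeddings shared across tasks.
        self.chunk_emb = nn.Parameter(torch.randn(self.num_chunks, task_dim) * 0.01)
        self.rnn = nn.GRU(input_size=task_dim, hidden_size=hidden_dim, batch_first=True)
        self.init_h = nn.Linear(task_dim, hidden_dim)    # task embedding -> initial hidden state
        self.decode = nn.Linear(hidden_dim, chunk_size)  # hidden state -> weight chunk

    def forward(self, task_id):
        e = self.task_emb(torch.tensor([task_id]))                # (1, task_dim)
        h0 = torch.tanh(self.init_h(e)).unsqueeze(0)              # (1, 1, hidden_dim)
        steps = self.chunk_emb.unsqueeze(0)                       # (1, num_chunks, task_dim)
        out, _ = self.rnn(steps, h0)                              # (1, num_chunks, hidden_dim)
        flat = self.decode(out).reshape(-1)[: self.total_params]  # one long weight vector

        # Split the vector back into the target network's parameter tensors.
        weights, offset = [], 0
        for shape in self.target_shapes:
            n = int(math.prod(shape))
            weights.append(flat[offset:offset + n].view(shape))
            offset += n
        return weights


def hypernet_cl_regulariser(hnet, stored_outputs, beta=0.01):
    """Generic output regulariser for hypernetwork-based CL: keep the weights
    generated for previous task embeddings close to snapshots stored when those
    tasks were learned. This is the standard baseline form, not the paper's
    proposed RNN-specific regularisation."""
    reg = 0.0
    for task_id, snapshot in stored_outputs.items():
        current = hnet(task_id)
        reg = reg + sum(((c - s.detach()) ** 2).sum() for c, s in zip(current, snapshot))
    return beta * reg
```

In practice the generated tensors would be consumed by a functional forward pass over the target network (e.g. via `torch.func.functional_call`), and a snapshot of the generated weights for each completed task would be stored to feed the regulariser when training on later tasks.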
Related papers
- Continual Learning via Sequential Function-Space Variational Inference [65.96686740015902]
We propose an objective derived by formulating continual learning as sequential function-space variational inference.
Compared to objectives that directly regularize neural network predictions, the proposed objective allows for more flexible variational distributions.
We demonstrate that, across a range of task sequences, neural networks trained via sequential function-space variational inference achieve better predictive accuracy than networks trained with related methods.
arXiv Detail & Related papers (2023-12-28T18:44:32Z)
- Negotiated Representations to Prevent Forgetting in Machine Learning Applications [0.0]
Catastrophic forgetting is a significant challenge in the field of machine learning.
We propose a novel method for preventing catastrophic forgetting in machine learning applications.
arXiv Detail & Related papers (2023-11-30T22:43:50Z)
- Robust Continual Learning through a Comprehensively Progressive Bayesian Neural Network [1.4695979686066065]
This work proposes a comprehensively progressive Bayesian neural network for robust continual learning of a sequence of tasks.
It starts from the premise that similar tasks should be allocated the same amount of total network resources, to ensure fair representation of all tasks.
Weights that are redundant at the end of training on each task are pruned through re-initialization, so that they can be efficiently reused in the subsequent task.
arXiv Detail & Related papers (2022-02-27T14:19:50Z)
- Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to only activate and select sparse neurons for learning current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z)
- Efficient Feature Transformations for Discriminative and Generative Continual Learning [98.10425163678082]
We propose a simple task-specific feature map transformation strategy for continual learning.
These transformations provide powerful flexibility for learning new tasks, achieved with minimal parameters added to the base architecture.
We demonstrate the efficacy and efficiency of our method with an extensive set of experiments in discriminative (CIFAR-100 and ImageNet-1K) and generative sequences of tasks.
arXiv Detail & Related papers (2021-03-25T01:48:14Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
- Continual Learning in Recurrent Neural Networks [67.05499844830231]
We evaluate the effectiveness of continual learning methods for processing sequential data with recurrent neural networks (RNNs).
We shed light on the particularities that arise when applying weight-importance methods, such as elastic weight consolidation, to RNNs.
We show that the performance of weight-importance methods is not directly affected by the length of the processed sequences, but rather by high working memory requirements.
arXiv Detail & Related papers (2020-06-22T10:05:12Z)
- R-FORCE: Robust Learning for Random Recurrent Neural Networks [6.285241353736006]
We propose a robust training method to enhance the robustness of random recurrent neural networks (RRNNs).
The FORCE learning approach was shown to be applicable even for the challenging task of target-learning.
Our experiments indicate that R-FORCE facilitates significantly more stable and accurate target-learning for a wide class of RRNNs.
arXiv Detail & Related papers (2020-03-25T22:08:03Z) - Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G
Networks [84.2155885234293]
We first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in ultra-reliable and low-latency communications (URLLC).
To address the open problems that remain, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC.
arXiv Detail & Related papers (2020-02-22T14:38:11Z) - Residual Continual Learning [33.442903467864966]
We propose a novel continual learning method called Residual Continual Learning (ResCL).
Our method can prevent the catastrophic forgetting phenomenon in sequential learning of multiple tasks, without any source task information except the original network.
The proposed method exhibits state-of-the-art performance in various continual learning scenarios.
arXiv Detail & Related papers (2020-02-17T05:24:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.