Residual Continual Learning
- URL: http://arxiv.org/abs/2002.06774v1
- Date: Mon, 17 Feb 2020 05:24:45 GMT
- Title: Residual Continual Learning
- Authors: Janghyeon Lee, Donggyu Joo, Hyeong Gwon Hong, Junmo Kim
- Abstract summary: We propose a novel continual learning method called Residual Continual Learning (ResCL).
Our method can prevent the catastrophic forgetting phenomenon in sequential learning of multiple tasks, without any source task information except the original network.
The proposed method exhibits state-of-the-art performance in various continual learning scenarios.
- Score: 33.442903467864966
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel continual learning method called Residual Continual
Learning (ResCL). Our method can prevent the catastrophic forgetting phenomenon
in sequential learning of multiple tasks, without any source task information
except the original network. ResCL reparameterizes network parameters by
linearly combining each layer of the original network and a fine-tuned network;
therefore, the size of the network does not increase at all. To apply the
proposed method to general convolutional neural networks, the effects of batch
normalization layers are also considered. By utilizing residual-learning-like
reparameterization and a special weight decay loss, the trade-off between
source and target performance is effectively controlled. The proposed method
exhibits state-of-the-art performance in various continual learning scenarios.
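As a rough illustration of the reparameterization described in the abstract, the sketch below linearly combines a frozen source layer with its fine-tuned copy using channel-wise learnable coefficients and adds a decay loss on those coefficients. This is a minimal sketch of the general idea, not the authors' implementation: the exact parameterization, the batch-normalization handling, and the decay target used in ResCL may differ, and all names here (CombinedLayer, alpha_s, alpha_t, combination_decay) are illustrative.

```python
# Minimal sketch of the idea in the abstract (not the authors' code): each layer's
# output is a learnable, channel-wise linear combination of a frozen source layer and
# its fine-tuned copy, so the network size stays fixed. Batch-normalization handling,
# which the paper also addresses, is omitted here.
import torch
import torch.nn as nn

class CombinedLayer(nn.Module):
    """Combines a frozen source layer with a trainable fine-tuned copy."""
    def __init__(self, source_layer: nn.Module, tuned_layer: nn.Module, channels: int):
        super().__init__()
        self.source = source_layer   # frozen: carries source-task knowledge
        self.tuned = tuned_layer     # trainable: adapts to the target task
        for p in self.source.parameters():
            p.requires_grad_(False)
        # Channel-wise combination coefficients (assumed parameterization); initialized
        # so the combined layer starts out identical to the source layer.
        self.alpha_s = nn.Parameter(torch.ones(1, channels, 1, 1))
        self.alpha_t = nn.Parameter(torch.zeros(1, channels, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.alpha_s * self.source(x) + self.alpha_t * self.tuned(x)

def combination_decay(model: nn.Module, strength: float = 1e-4) -> torch.Tensor:
    """Decay loss on the combination coefficients: penalizing alpha_t pulls the combined
    network back toward the source network, which is one way to trade off source and
    target performance as the abstract describes (the paper's decay loss may differ)."""
    penalty = torch.zeros(())
    for m in model.modules():
        if isinstance(m, CombinedLayer):
            penalty = penalty + m.alpha_t.pow(2).sum()
    return strength * penalty
```

During target-task training this penalty would simply be added to the task loss, e.g. loss = task_loss + combination_decay(model).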
Related papers
- Reparameterization through Spatial Gradient Scaling [69.27487006953852]
Reparameterization aims to improve the generalization of deep neural networks by transforming convolutional layers into equivalent multi-branched structures during training.
We present a novel spatial gradient scaling method to redistribute learning focus among weights in convolutional networks.
arXiv Detail & Related papers (2023-03-05T17:57:33Z)
- Continual Learning with Dependency Preserving Hypernetworks [14.102057320661427]
An effective approach to address continual learning (CL) problems is to use hypernetworks which generate task-dependent weights for a target network.
We propose a novel approach that uses a dependency preserving hypernetwork to generate weights for the target network while also maintaining the parameter efficiency.
In addition, we propose novel regularisation and network growth techniques for the RNN based hypernetwork to further improve the continual learning performance.
arXiv Detail & Related papers (2022-09-16T04:42:21Z)
- Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to only activate and select sparse neurons for learning current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z)
- Clustering-Based Interpretation of Deep ReLU Network [17.234442722611803]
We recognize that the non-linear behavior of the ReLU function gives rise to a natural clustering.
We propose a method to increase the level of interpretability of a fully connected feedforward ReLU neural network.
arXiv Detail & Related papers (2021-10-13T09:24:11Z)
- All at Once Network Quantization via Collaborative Knowledge Transfer [56.95849086170461]
We develop a novel collaborative knowledge transfer approach for efficiently training the all-at-once quantization network.
Specifically, we propose an adaptive selection strategy to choose a high-precision "teacher" for transferring knowledge to the low-precision student.
To effectively transfer knowledge, we develop a dynamic block swapping method by randomly replacing the blocks in the lower-precision student network with the corresponding blocks in the higher-precision teacher network.
arXiv Detail & Related papers (2021-03-02T03:09:03Z)
- Self-Reorganizing and Rejuvenating CNNs for Increasing Model Capacity Utilization [8.661269034961679]
We propose a biologically inspired method for improving the computational resource utilization of neural networks.
The proposed method utilizes the channel activations of a convolution layer in order to reorganize that layer's parameters.
The rejuvenated parameters learn different features to supplement those learned by the reorganized surviving parameters.
arXiv Detail & Related papers (2021-02-13T06:19:45Z)
- Incremental Embedding Learning via Zero-Shot Translation [65.94349068508863]
Current state-of-the-art incremental learning methods tackle the catastrophic forgetting problem in traditional classification networks.
We propose a novel class-incremental method for embedding networks, named the zero-shot translation class-incremental method (ZSTCI).
In addition, ZSTCI can easily be combined with existing regularization-based incremental learning methods to further improve performance of embedding networks.
arXiv Detail & Related papers (2020-12-31T08:21:37Z)
- Solving Sparse Linear Inverse Problems in Communication Systems: A Deep Learning Approach With Adaptive Depth [51.40441097625201]
We propose an end-to-end trainable deep learning architecture for sparse signal recovery problems.
The proposed method learns how many layers to execute to emit an output, and the network depth is dynamically adjusted for each task in the inference phase.
arXiv Detail & Related papers (2020-10-29T06:32:53Z)
- Implicit recurrent networks: A novel approach to stationary input processing with recurrent neural networks in deep learning [0.0]
In this work, we introduce and test a novel implementation of recurrent neural networks into deep learning.
We provide an algorithm that implements backpropagation for an implicit implementation of recurrent networks.
A single-layer implicit recurrent network is able to solve the XOR problem, while a feed-forward network with a monotonically increasing activation function fails at this task.
arXiv Detail & Related papers (2020-10-20T18:55:32Z)
- Continual Learning in Recurrent Neural Networks [67.05499844830231]
We evaluate the effectiveness of continual learning methods for processing sequential data with recurrent neural networks (RNNs).
We shed light on the particularities that arise when applying weight-importance methods, such as elastic weight consolidation, to RNNs; a generic sketch of such a weight-importance penalty appears after this list.
We show that the performance of weight-importance methods is not directly affected by the length of the processed sequences, but rather by high working memory requirements.
arXiv Detail & Related papers (2020-06-22T10:05:12Z)
- Exploiting Non-Linear Redundancy for Neural Model Compression [26.211513643079993]
We propose a novel model compression approach based on exploitation of linear dependence.
Our method results in a reduction of up to 99% in overall network size with small loss in performance.
arXiv Detail & Related papers (2020-05-28T15:13:21Z)
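For the "Continual Learning in Recurrent Neural Networks" entry above, the following is a minimal, generic sketch of a weight-importance penalty in the style of elastic weight consolidation, under simplifying assumptions (a diagonal Fisher estimate and a single previous task). It is not the evaluation code used in that paper, and the helper names (estimate_fisher, ewc_penalty) are illustrative.

```python
# Generic elastic-weight-consolidation-style penalty (illustrative; assumes a diagonal
# Fisher estimate and a single previous task).
import torch
import torch.nn as nn

def estimate_fisher(model: nn.Module, loss_fn, data_loader) -> dict:
    """Diagonal Fisher estimate: average squared gradients over data from the old task."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters() if p.requires_grad}
    n_batches = 0
    for x, y in data_loader:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if n in fisher and p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
        n_batches += 1
    return {n: f / max(n_batches, 1) for n, f in fisher.items()}

def ewc_penalty(model: nn.Module, fisher: dict, old_params: dict, lam: float = 100.0) -> torch.Tensor:
    """Quadratic penalty that anchors weights deemed important for the old task
    (large Fisher values) to the values they had after training on that task."""
    penalty = torch.zeros(())
    for n, p in model.named_parameters():
        if n in fisher:
            penalty = penalty + (fisher[n] * (p - old_params[n]) ** 2).sum()
    return 0.5 * lam * penalty
```

When training on a new task, this penalty would be added to the new task's loss, with old_params being a snapshot of the parameters taken after training on the previous task.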