Continual Normalization: Rethinking Batch Normalization for Online
Continual Learning
- URL: http://arxiv.org/abs/2203.16102v1
- Date: Wed, 30 Mar 2022 07:23:24 GMT
- Title: Continual Normalization: Rethinking Batch Normalization for Online
Continual Learning
- Authors: Quang Pham, Chenghao Liu, Steven Hoi
- Abstract summary: We study the cross-task normalization effect of Batch Normalization (BN) in online continual learning.
BN normalizes the testing data using moments biased towards the current task, resulting in increased catastrophic forgetting.
We propose Continual Normalization (CN) to facilitate training similarly to BN while mitigating its negative effect.
- Score: 21.607816915609128
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing continual learning methods use Batch Normalization (BN) to
facilitate training and improve generalization across tasks. However, the
non-i.i.d. and non-stationary nature of continual learning data, especially in
the online setting, amplifies the discrepancy between training and testing in BN
and hinders performance on older tasks. In this work, we study the
cross-task normalization effect of BN in online continual learning, where BN
normalizes the testing data using moments biased towards the current task,
resulting in increased catastrophic forgetting. This limitation motivates us to
propose a simple yet effective method, which we call Continual Normalization (CN),
that facilitates training similarly to BN while mitigating its negative effect.
Extensive experiments on different continual learning algorithms and online
scenarios show that CN is a direct replacement for BN and can provide
substantial performance improvements. Our implementation is available at
\url{https://github.com/phquang/Continual-Normalization}.
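Below is a minimal PyTorch sketch of a CN layer used as a drop-in replacement for BatchNorm2d, following the abstract's description of CN as a BN substitute that mitigates moments biased towards the current task. The specific composition (group normalization without affine parameters followed by batch normalization), the class name, and the number of groups are assumptions made for illustration; consult the linked repository for the authors' exact implementation.

```python
import torch
import torch.nn as nn


class ContinualNorm(nn.Module):
    """Sketch of a CN-style layer: spatial (group) normalization followed by batch normalization."""

    def __init__(self, num_channels: int, num_groups: int = 8):
        super().__init__()
        # Group normalization (no affine parameters) removes per-sample, per-group
        # statistics first, which reduces the bias of the subsequent batch moments
        # towards the current task's distribution.
        self.gn = nn.GroupNorm(num_groups, num_channels, affine=False)
        # Standard batch normalization then provides the usual training benefits of BN.
        self.bn = nn.BatchNorm2d(num_channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.bn(self.gn(x))


if __name__ == "__main__":
    layer = ContinualNorm(num_channels=64, num_groups=8)
    x = torch.randn(16, 64, 32, 32)  # (batch, channels, height, width)
    print(layer(x).shape)            # torch.Size([16, 64, 32, 32])
```

Because the module exposes the same input/output shapes as BatchNorm2d, it can replace BN layers in a backbone without other architectural changes, which is how the abstract describes CN being used.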
Related papers
- Unified Batch Normalization: Identifying and Alleviating the Feature
Condensation in Batch Normalization and a Unified Framework [55.22949690864962]
Batch Normalization (BN) has become an essential technique in contemporary neural network design.
We propose a two-stage unified framework called Unified Batch Normalization (UBN).
UBN significantly enhances performance across different visual backbones and different vision tasks.
arXiv Detail & Related papers (2023-11-27T16:41:31Z) - Overcoming Recency Bias of Normalization Statistics in Continual
Learning: Balance and Adaptation [67.77048565738728]
Continual learning involves learning a sequence of tasks and balancing their knowledge appropriately.
We propose Adaptive Balance of BN (AdaB$^2$N), which appropriately incorporates a Bayesian-based strategy to adapt task-wise contributions.
Our approach achieves significant performance gains across a wide range of benchmarks.
arXiv Detail & Related papers (2023-10-13T04:50:40Z) - An Adaptive Batch Normalization in Deep Learning [0.0]
Batch Normalization (BN) is a way to accelerate and stabilize training in deep convolutional neural networks.
We propose a threshold-based adaptive BN approach that separates the data that requires BN from the data that does not.
arXiv Detail & Related papers (2022-11-03T12:12:56Z) - Understanding the Failure of Batch Normalization for Transformers in NLP [16.476194435004732]
Batch Normalization (BN) is a technique to accelerate the training of deep neural networks.
However, BN fails to hold its ground in Natural Language Processing (NLP), which is dominated by Layer Normalization (LN).
Regularized BN (RBN) improves the performance of BN consistently and outperforms or is on par with LN on 17 out of 20 settings.
arXiv Detail & Related papers (2022-10-11T05:18:47Z) - Rebalancing Batch Normalization for Exemplar-based Class-Incremental
Learning [23.621259845287824]
Batch Normalization (BN) has been extensively studied for neural nets in various computer vision tasks.
We develop a new update patch for BN, tailored particularly for exemplar-based class-incremental learning (CIL).
arXiv Detail & Related papers (2022-01-29T11:03:03Z) - Batch Normalization Preconditioning for Neural Network Training [7.709342743709842]
Batch normalization (BN) is a popular and ubiquitous method in deep learning.
BN is not suitable for use with very small mini-batch sizes or online learning.
We propose a new method called Batch Normalization Preconditioning (BNP).
arXiv Detail & Related papers (2021-08-02T18:17:26Z) - Double Forward Propagation for Memorized Batch Normalization [68.34268180871416]
Batch Normalization (BN) has been a standard component in designing deep neural networks (DNNs)
We propose a memorized batch normalization (MBN) which considers multiple recent batches to obtain more accurate and robust statistics.
Compared to related methods, the proposed MBN exhibits consistent behaviors in both training and inference; a minimal illustrative sketch of this windowed-statistics idea appears after this list.
arXiv Detail & Related papers (2020-10-10T08:48:41Z) - Bilevel Continual Learning [76.50127663309604]
We present a novel framework of continual learning named "Bilevel Continual Learning" (BCL).
Our experiments on continual learning benchmarks demonstrate the efficacy of the proposed BCL compared to many state-of-the-art methods.
arXiv Detail & Related papers (2020-07-30T16:00:23Z) - PowerNorm: Rethinking Batch Normalization in Transformers [96.14956636022957]
The standard normalization method for neural network (NN) models used in Natural Language Processing (NLP) is Layer Normalization (LN).
LN is preferred due to the empirical observation that a (naive/vanilla) use of BN leads to significant performance degradation for NLP tasks.
We propose Power Normalization (PN), a novel normalization scheme that resolves this issue.
arXiv Detail & Related papers (2020-03-17T17:50:26Z) - Towards Stabilizing Batch Statistics in Backward Propagation of Batch
Normalization [126.6252371899064]
Moving Average Batch Normalization (MABN) is a novel normalization method.
We show that MABN can completely restore the performance of vanilla BN in small batch cases.
Our experiments demonstrate the effectiveness of MABN in multiple computer vision tasks including ImageNet and COCO.
arXiv Detail & Related papers (2020-01-19T14:41:22Z)
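As referenced in the Memorized Batch Normalization (MBN) entry above, here is a minimal illustrative sketch of the general idea of pooling normalization statistics over several recent mini-batches. The class name, window size, and simple averaging of per-batch moments are assumptions for illustration only; the sketch does not reproduce that paper's double-forward-propagation training procedure.

```python
from collections import deque

import torch
import torch.nn as nn


class WindowedBatchNorm2d(nn.Module):
    """Hypothetical layer: batch normalization using moments averaged over recent batches."""

    def __init__(self, num_channels: int, window_size: int = 5, eps: float = 1e-5):
        super().__init__()
        self.eps = eps
        # Ring buffers of per-channel moments from the most recent mini-batches.
        self.recent_means = deque(maxlen=window_size)
        self.recent_vars = deque(maxlen=window_size)
        # Learnable affine parameters, as in standard BN.
        self.weight = nn.Parameter(torch.ones(num_channels))
        self.bias = nn.Parameter(torch.zeros(num_channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Assumes at least one training batch has been seen before evaluation.
        if self.training:
            # Record the current batch's moments; they are treated as constants here
            # (a simplification -- MBN itself handles gradients through the statistics
            # via its double-forward-propagation scheme).
            self.recent_means.append(x.mean(dim=(0, 2, 3)).detach())
            self.recent_vars.append(x.var(dim=(0, 2, 3), unbiased=False).detach())
        # Simple average of the windowed moments (a simplification of MBN's weighting).
        mean = torch.stack(list(self.recent_means)).mean(dim=0)
        var = torch.stack(list(self.recent_vars)).mean(dim=0)
        x_hat = (x - mean[None, :, None, None]) / torch.sqrt(var[None, :, None, None] + self.eps)
        return x_hat * self.weight[None, :, None, None] + self.bias[None, :, None, None]
```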