Cross-Iteration Batch Normalization
- URL: http://arxiv.org/abs/2002.05712v3
- Date: Thu, 25 Mar 2021 06:57:36 GMT
- Title: Cross-Iteration Batch Normalization
- Authors: Zhuliang Yao, Yue Cao, Shuxin Zheng, Gao Huang, Stephen Lin
- Abstract summary: We present Cross-Iteration Batch Normalization (CBN), in which examples from multiple recent iterations are jointly utilized to enhance estimation quality.
CBN is found to outperform the original batch normalization and a direct calculation of statistics over previous iterations without the proposed compensation technique.
- Score: 67.83430009388678
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A well-known issue of Batch Normalization is its significantly reduced
effectiveness in the case of small mini-batch sizes. When a mini-batch contains
few examples, the statistics upon which the normalization is defined cannot be
reliably estimated from it during a training iteration. To address this
problem, we present Cross-Iteration Batch Normalization (CBN), in which
examples from multiple recent iterations are jointly utilized to enhance
estimation quality. A challenge of computing statistics over multiple
iterations is that the network activations from different iterations are not
comparable to each other due to changes in network weights. We thus compensate
for the network weight changes via a proposed technique based on Taylor
polynomials, so that the statistics can be accurately estimated and batch
normalization can be effectively applied. On object detection and image
classification with small mini-batch sizes, CBN is found to outperform the
original batch normalization and a direct calculation of statistics over
previous iterations without the proposed compensation technique. Code is
available at https://github.com/Howal/Cross-iterationBatchNorm .
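To make the compensation idea concrete, here is a minimal, self-contained sketch of first-order Taylor compensation for a single linear layer and a window of one previous iteration. It is only an illustration of the idea in the abstract, not the authors' implementation (see the repository above for that); the function name `channel_mean`, the toy tensor shapes, and the window size are choices made here, and the paper compensates and aggregates the second-order statistic analogously.

```python
import torch

torch.manual_seed(0)

def channel_mean(weight, x):
    """Per-output-channel mean activation of a linear layer for one mini-batch."""
    return (x @ weight.t()).mean(dim=0)

# ----- iteration t-1: record statistics, weights, and d(mu)/d(theta) -----
theta_prev = torch.randn(4, 8)                 # layer weights at iteration t-1
x_prev = torch.randn(2, 8)                     # a small mini-batch (batch size 2)
mu_prev = channel_mean(theta_prev, x_prev)     # channel means, soon to be stale
jac_prev = torch.autograd.functional.jacobian( # d(mu_prev)/d(theta), shape (4, 4, 8)
    lambda w: channel_mean(w, x_prev), theta_prev)

# ----- iteration t: weights have changed (e.g. after an SGD step) -----
theta_now = theta_prev + 0.01 * torch.randn_like(theta_prev)
x_now = torch.randn(2, 8)
mu_now = channel_mean(theta_now, x_now)

# First-order Taylor compensation: approximate what iteration t-1's
# examples would produce under the *current* weights.
delta = theta_now - theta_prev
mu_prev_comp = mu_prev + torch.einsum("cij,ij->c", jac_prev, delta)

# Aggregate statistics across the two iterations (window size k = 1).
mu_cbn = 0.5 * (mu_now + mu_prev_comp)
print("compensated cross-iteration mean:", mu_cbn)
```

Averaging compensated statistics over several recent iterations in this way enlarges the effective batch size without the staleness problem of naively reusing statistics computed under old weights.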
Related papers
- Exploring the Efficacy of Group-Normalization in Deep Learning Models for Alzheimer's Disease Classification [2.6447365674762273]
Group Normalization (GN) is a straightforward alternative to Batch Normalization; a minimal drop-in sketch follows this entry.
GN achieves a low error rate of 10.6%, lower than that obtained with Batch Normalization.
arXiv Detail & Related papers (2024-04-01T06:10:11Z)
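To illustrate what "drop-in alternative" means in practice, here is a hedged, minimal PyTorch snippet swapping BatchNorm2d for GroupNorm; the channel count and group count are arbitrary choices, not taken from the paper above. GroupNorm never aggregates over the batch dimension, so small mini-batches do not degrade its statistics.

```python
import torch
import torch.nn as nn

x = torch.randn(2, 32, 16, 16)       # (batch, channels, H, W); batch can be tiny

bn = nn.BatchNorm2d(num_features=32)              # statistics pooled over the batch
gn = nn.GroupNorm(num_groups=8, num_channels=32)  # statistics per sample, per channel group

# Drop-in replacement: identical input/output shapes, no batch-size dependence for GN.
print(bn(x).shape, gn(x).shape)
```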
- Post-Training Quantization for Re-parameterization via Coarse & Fine Weight Splitting [13.270381125055275]
We propose a coarse & fine weight splitting (CFWS) method to reduce the quantization error of weights.
We develop an improved KL metric to determine optimal quantization scales for activations.
For example, the quantized RepVGG-A1 model exhibits a mere 0.3% accuracy loss.
arXiv Detail & Related papers (2023-12-17T02:31:20Z)
- Patch-aware Batch Normalization for Improving Cross-domain Robustness [55.06956781674986]
Cross-domain tasks are challenging because a model's performance degrades when the training set and the test set follow different distributions.
We propose a novel method called patch-aware batch normalization (PBN).
By exploiting the differences between local patches of an image, our proposed PBN can effectively enhance the robustness of the model's parameters.
arXiv Detail & Related papers (2023-04-06T03:25:42Z)
- Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification.
arXiv Detail & Related papers (2022-12-02T07:31:39Z)
- Variance-Aware Weight Initialization for Point Convolutional Neural Networks [23.46612653627991]
We propose a framework to unify the multitude of continuous convolutions.
We show that this framework can avoid batch normalization while achieving similar and, in some cases, better performance.
arXiv Detail & Related papers (2021-12-07T15:47:14Z)
- Cluster-Promoting Quantization with Bit-Drop for Minimizing Network Quantization Loss [61.26793005355441]
Cluster-Promoting Quantization (CPQ) finds the optimal quantization grids for neural networks.
DropBits is a new bit-drop technique that revises the standard dropout regularization to randomly drop bits instead of neurons.
We experimentally validate our method on various benchmark datasets and network architectures.
arXiv Detail & Related papers (2021-09-05T15:15:07Z)
- Comparing Normalization Methods for Limited Batch Size Segmentation Neural Networks [0.0]
Batch Normalization works best with large batch sizes during training.
We show the effectiveness of Instance Normalization when training with limited batch sizes; a minimal drop-in sketch follows this entry.
We also show that the Instance Normalization implementation used in this experiment is efficient in computation time compared to the network without any normalization method.
arXiv Detail & Related papers (2020-11-23T17:13:24Z)
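As a hedged illustration of the entry above (not the paper's segmentation network; the layer sizes here are arbitrary), the snippet below shows that Instance Normalization computes its statistics per sample and per channel, so the batch size has no effect on the result.

```python
import torch
import torch.nn as nn

norm = nn.InstanceNorm2d(num_features=16, affine=True)

x1 = torch.randn(1, 16, 32, 32)     # batch size 1
x8 = x1.repeat(8, 1, 1, 1)          # the same sample repeated 8 times

# Per-sample statistics: the output for the first sample is unchanged
# regardless of how many other samples share the batch.
print(torch.allclose(norm(x1), norm(x8)[:1], atol=1e-6))   # True
```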
- WeightAlign: Normalizing Activations by Weight Alignment [16.85286948260155]
Batch normalization (BN) allows training very deep networks by normalizing activations by mini-batch sample statistics.
Small-batch alternatives that rely on single-sample statistics are less stable than BN, as they critically depend on the statistics of a single input sample.
We present WeightAlign: a method that normalizes the weights by the mean and scaled standard deviation computed within a filter, which normalizes activations without computing any sample statistics; a per-filter weight-standardization sketch follows this entry.
arXiv Detail & Related papers (2020-10-14T15:25:39Z)
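The following sketch illustrates per-filter weight standardization in the spirit of the entry above: each output filter's weights are centered by their mean and divided by a (scaled) standard deviation computed over the filter's fan-in, so activations come out normalized without any sample statistics. It is a generic sketch, not the paper's code; the `gain` factor and layer shapes are assumptions, and the paper's exact scaling may differ.

```python
import torch
import torch.nn.functional as F

def weight_standardize(weight, eps=1e-5, gain=1.0):
    """Center and rescale each output filter by its own mean and
    (scaled) standard deviation, computed over the filter's fan-in."""
    out_channels = weight.shape[0]
    w = weight.view(out_channels, -1)
    mean = w.mean(dim=1, keepdim=True)
    std = w.std(dim=1, keepdim=True)
    w = gain * (w - mean) / (std + eps)
    return w.view_as(weight)

conv = torch.nn.Conv2d(3, 16, kernel_size=3, padding=1, bias=False)
x = torch.randn(4, 3, 32, 32)
y = F.conv2d(x, weight_standardize(conv.weight), padding=1)
print(y.mean().item())   # roughly zero: zero-mean filters applied to zero-mean input
```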
- Double Forward Propagation for Memorized Batch Normalization [68.34268180871416]
Batch Normalization (BN) has been a standard component in designing deep neural networks (DNNs).
We propose a memorized batch normalization (MBN), which considers multiple recent batches to obtain more accurate and robust statistics; a simplified aggregation sketch follows this entry.
Compared to related methods, the proposed MBN exhibits consistent behaviors in both training and inference.
arXiv Detail & Related papers (2020-10-10T08:48:41Z)
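As a hedged sketch of the general idea in the entry above (aggregating statistics over several recent mini-batches), the snippet below keeps a small FIFO buffer of per-batch means and variances and normalizes with their averages. It omits whatever correction the paper applies for weight updates between iterations; the window size, shapes, and class name are assumptions.

```python
from collections import deque
import torch

class RecentBatchNorm1d(torch.nn.Module):
    """Sketch: normalize with statistics averaged over the last `window`
    mini-batches, with no compensation for weight changes in between."""
    def __init__(self, num_features, window=4, eps=1e-5):
        super().__init__()
        self.num_features = num_features
        self.eps = eps
        self.means = deque(maxlen=window)
        self.vars = deque(maxlen=window)

    def forward(self, x):                          # x: (batch, num_features)
        self.means.append(x.mean(dim=0).detach())
        self.vars.append(x.var(dim=0, unbiased=False).detach())
        mu = torch.stack(list(self.means)).mean(dim=0)
        var = torch.stack(list(self.vars)).mean(dim=0)
        return (x - mu) / torch.sqrt(var + self.eps)

norm = RecentBatchNorm1d(num_features=8, window=4)
for _ in range(6):                   # statistics stabilize as the buffer fills
    y = norm(torch.randn(2, 8))      # mini-batch of only 2 examples
print(y.shape)
```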
- Towards Stabilizing Batch Statistics in Backward Propagation of Batch Normalization [126.6252371899064]
Moving Average Batch Normalization (MABN) is a novel normalization method.
We show that MABN can completely restore the performance of vanilla BN in small batch cases.
Our experiments demonstrate the effectiveness of MABN in multiple computer vision tasks including ImageNet and COCO.
arXiv Detail & Related papers (2020-01-19T14:41:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.