An Investigation into the Stochasticity of Batch Whitening
- URL: http://arxiv.org/abs/2003.12327v1
- Date: Fri, 27 Mar 2020 11:06:32 GMT
- Title: An Investigation into the Stochasticity of Batch Whitening
- Authors: Lei Huang, Lei Zhao, Yi Zhou, Fan Zhu, Li Liu, Ling Shao
- Abstract summary: This paper investigates the more general Batch Whitening (BW) operation.
We show that while various whitening transformations equivalently improve the conditioning, they exhibit significantly different behaviors in discriminative scenarios and in training Generative Adversarial Networks (GANs).
Our proposed BW algorithm improves residual networks by a significant margin on ImageNet classification.
- Score: 95.54842420166862
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Batch Normalization (BN) is extensively employed in various network
architectures by performing standardization within mini-batches.
A full understanding of this process has been a central goal of the deep learning community.
Unlike existing works, which usually only analyze the standardization
operation, this paper investigates the more general Batch Whitening (BW). Our
work originates from the observation that while various whitening
transformations equivalently improve the conditioning, they show significantly
different behaviors in discriminative scenarios and training Generative
Adversarial Networks (GANs).
We attribute this phenomenon to the stochasticity that BW introduces.
We quantitatively investigate the stochasticity of different whitening
transformations and show that it correlates well with the optimization
behaviors during training.
We also investigate how stochasticity relates to the estimation of population
statistics during inference.
Based on our analysis, we provide a framework for designing and comparing BW
algorithms in different scenarios.
Our proposed BW algorithm improves the residual networks by a significant
margin on ImageNet classification.
In addition, we show that the stochasticity of BW can improve GAN performance, albeit at the cost of training stability.
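To make the objects of study concrete, here is a minimal, hedged sketch (not the authors' code) contrasting BN-style standardization with ZCA whitening, one common BW transformation; the function names and NumPy setup are illustrative assumptions:

import numpy as np

def batch_standardize(X, eps=1e-5):
    # BN-style standardization: per-dimension zero mean and unit variance,
    # computed from the mini-batch itself.
    mu = X.mean(axis=0)
    var = X.var(axis=0)
    return (X - mu) / np.sqrt(var + eps)

def batch_whiten_zca(X, eps=1e-5):
    # ZCA batch whitening: additionally decorrelates dimensions so the
    # mini-batch covariance becomes approximately the identity.
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / X.shape[0]
    eigval, eigvec = np.linalg.eigh(cov + eps * np.eye(X.shape[1]))
    W = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T   # Sigma^{-1/2} (ZCA form)
    return Xc @ W

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 8))                          # one mini-batch of features
print(np.round(np.cov(batch_whiten_zca(X), rowvar=False), 2))   # ~ identity covariance

Both transforms are computed from the sampled mini-batch, which is the source of the stochasticity analyzed above; different whitening choices (e.g., PCA, ZCA, Cholesky) yield equally well-conditioned outputs but inject different amounts of this noise. At inference, population statistics (mean and covariance) are typically maintained with running averages rather than recomputed per batch.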
Related papers
- Generalizing to any diverse distribution: uniformity, gentle finetuning and rebalancing [55.791818510796645]
We aim to develop models that generalize well to any diverse test distribution, even if the latter deviates significantly from the training data.
Various approaches like domain adaptation, domain generalization, and robust optimization attempt to address the out-of-distribution challenge.
We adopt a more conservative perspective by accounting for the worst-case error across all sufficiently diverse test distributions within a known domain.
arXiv Detail & Related papers (2024-10-08T12:26:48Z)
- Covariance-corrected Whitening Alleviates Network Degeneration on Imbalanced Classification [6.197116272789107]
Class imbalance is a critical issue in image classification that significantly affects the performance of deep recognition models.
We propose a novel framework called Whitening-Net to mitigate the degenerate solutions.
In scenarios with extreme class imbalance, the batch covariance statistic exhibits significant fluctuations, impeding the convergence of the whitening operation.
arXiv Detail & Related papers (2024-08-30T10:49:33Z)
- Unified Batch Normalization: Identifying and Alleviating the Feature Condensation in Batch Normalization and a Unified Framework [55.22949690864962]
Batch Normalization (BN) has become an essential technique in contemporary neural network design.
We propose a two-stage unified framework called Unified Batch Normalization (UBN).
UBN significantly enhances performance across different visual backbones and different vision tasks.
arXiv Detail & Related papers (2023-11-27T16:41:31Z)
- Overcoming Recency Bias of Normalization Statistics in Continual Learning: Balance and Adaptation [67.77048565738728]
Continual learning involves learning a sequence of tasks and balancing their knowledge appropriately.
We propose Adaptive Balance of BN (AdaB$^2$N), which appropriately incorporates a Bayesian-based strategy to adapt task-wise contributions.
Our approach achieves significant performance gains across a wide range of benchmarks.
arXiv Detail & Related papers (2023-10-13T04:50:40Z)
- Gated Information Bottleneck for Generalization in Sequential Environments [13.795129636387623]
Deep neural networks suffer from poor generalization to unseen environments when the underlying data distribution is different from that in the training set.
We propose a new neural network-based IB approach, termed gated information bottleneck (GIB).
We empirically demonstrate the superiority of GIB over other popular neural network-based IB approaches in adversarial robustness and out-of-distribution detection.
arXiv Detail & Related papers (2021-10-12T14:58:38Z)
- Test-time Batch Statistics Calibration for Covariate Shift [66.7044675981449]
We propose to adapt the deep models to the novel environment during inference.
We present a general formulation, $\alpha$-BN, to calibrate the batch statistics (a rough sketch follows this entry).
We also present a novel loss function to form a unified test-time adaptation framework, Core.
arXiv Detail & Related papers (2021-10-06T08:45:03Z)
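As a rough illustration of the $\alpha$-BN idea named in the entry above, the sketch below blends stored source statistics with test-batch statistics; the convex-combination form, the parameter name alpha, and the function name are assumptions rather than the paper's exact formulation:

import numpy as np

def alpha_bn(x, mu_src, var_src, alpha=0.9, eps=1e-5):
    # Normalize a test batch x (m x d) with a mixture of source (training)
    # statistics and statistics computed on the current test batch.
    mu_t, var_t = x.mean(axis=0), x.var(axis=0)
    mu = alpha * mu_src + (1.0 - alpha) * mu_t
    var = alpha * var_src + (1.0 - alpha) * var_t
    return (x - mu) / np.sqrt(var + eps)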
- Decentralized Local Stochastic Extra-Gradient for Variational Inequalities [125.62877849447729]
We consider distributed variational inequalities (VIs) on domains with the problem data that is heterogeneous (non-IID) and distributed across many devices.
We make a very general assumption on the computational network that covers fully decentralized settings.
We theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone settings.
arXiv Detail & Related papers (2021-06-15T17:45:51Z)
- More Is More -- Narrowing the Generalization Gap by Adding Classification Heads [8.883733362171032]
We introduce an architecture enhancement for existing neural network models based on input transformations, termed 'TransNet'.
Our model can be employed during training only and then pruned for prediction, resulting in an architecture equivalent to the base model.
arXiv Detail & Related papers (2021-02-09T16:30:33Z)
- Unbiased Deep Reinforcement Learning: A General Training Framework for Existing and Future Algorithms [3.7050607140679026]
We propose a novel training framework that is conceptually comprehensible and can potentially be generalized to all feasible reinforcement learning algorithms.
We employ Monte Carlo sampling to obtain raw data inputs and train on them in batches to form Markov decision process sequences.
We propose several algorithms embedded in our new framework to handle typical discrete and continuous scenarios.
arXiv Detail & Related papers (2020-05-12T01:51:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences.