Impact of Batch Normalization on Convolutional Network Representations
- URL: http://arxiv.org/abs/2501.14441v2
- Date: Thu, 13 Feb 2025 14:52:43 GMT
- Title: Impact of Batch Normalization on Convolutional Network Representations
- Authors: Hermanus L. Potgieter, Coenraad Mouton, Marelie H. Davel
- Abstract summary: Batch normalization (BatchNorm) is a popular layer normalization technique used when training deep neural networks.
We investigate the effect of BatchNorm on the resulting hidden representations, that is, the vectors of activation values formed as samples are processed at each hidden layer.
- Score: 0.5530212768657544
- Abstract: Batch normalization (BatchNorm) is a popular layer normalization technique used when training deep neural networks. It has been shown to enhance the training speed and accuracy of deep learning models. However, the mechanics by which BatchNorm achieves these benefits are an active area of research, and different perspectives have been proposed. In this paper, we investigate the effect of BatchNorm on the resulting hidden representations, that is, the vectors of activation values formed as samples are processed at each hidden layer. Specifically, we consider the sparsity of these representations, as well as their implicit clustering -- the creation of groups of representations that are similar to some extent. We contrast image classification models trained with and without batch normalization and highlight the consistent differences observed. These findings indicate that BatchNorm's effect on representational sparsity is not a significant factor affecting generalization, while the representations of models trained with BatchNorm tend to show more advantageous clustering characteristics.
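To make the comparison concrete, the sketch below (not the authors' experimental setup) trains a toy convolutional network with and without BatchNorm on synthetic data, then reports the fraction of zero post-ReLU activations as a sparsity measure and a silhouette score with respect to class labels as a rough proxy for the implicit clustering described above. The architecture, data, and metrics are illustrative assumptions only.

```python
# Minimal sketch (not the authors' code): compare post-ReLU sparsity and class
# clustering of hidden representations in a small CNN trained with vs. without
# BatchNorm. Data, architecture, and metrics here are illustrative assumptions.
import torch
import torch.nn as nn
from sklearn.metrics import silhouette_score

def make_cnn(use_batchnorm: bool) -> nn.Sequential:
    layers = [nn.Conv2d(3, 16, 3, padding=1)]
    if use_batchnorm:
        layers.append(nn.BatchNorm2d(16))
    layers += [nn.ReLU(), nn.AdaptiveAvgPool2d(4), nn.Flatten(),
               nn.Linear(16 * 4 * 4, 10)]
    return nn.Sequential(*layers)

def hidden_representations(model: nn.Sequential, x: torch.Tensor) -> torch.Tensor:
    """Capture the post-ReLU activations (the 'hidden representation' of that layer)."""
    feats = {}
    relu_idx = next(i for i, m in enumerate(model) if isinstance(m, nn.ReLU))
    handle = model[relu_idx].register_forward_hook(
        lambda mod, inp, out: feats.update(h=out.detach()))
    model(x)
    handle.remove()
    return feats["h"].flatten(1)  # shape: (batch, channels * H * W)

def sparsity(h: torch.Tensor) -> float:
    """Fraction of exactly-zero activation values (post-ReLU)."""
    return (h == 0).float().mean().item()

# Synthetic stand-in for an image classification set (CIFAR-10-shaped tensors).
x = torch.randn(256, 3, 32, 32)
y = torch.randint(0, 10, (256,))

for use_bn in (False, True):
    torch.manual_seed(0)
    model = make_cnn(use_bn)
    opt = torch.optim.SGD(model.parameters(), lr=0.05)
    for _ in range(20):  # a few training steps, purely illustrative
        opt.zero_grad()
        nn.functional.cross_entropy(model(x), y).backward()
        opt.step()
    model.eval()
    with torch.no_grad():
        h = hidden_representations(model, x)
    # Silhouette score w.r.t. class labels as a crude proxy for the paper's
    # "implicit clustering" of representations (higher = better separated).
    sil = silhouette_score(h.numpy(), y.numpy())
    print(f"BatchNorm={use_bn}: sparsity={sparsity(h):.3f}, silhouette={sil:.3f}")
```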
Related papers
- Enhancing Fine-Grained Visual Recognition in the Low-Data Regime Through Feature Magnitude Regularization [23.78498670529746]
We introduce a regularization technique to ensure that the magnitudes of the extracted features are evenly distributed.
Despite its apparent simplicity, our approach has demonstrated significant performance improvements across various fine-grained visual recognition datasets.
arXiv Detail & Related papers (2024-09-03T07:32:46Z)
- Spectral regularization for adversarially-robust representation learning [32.84188052937496]
We propose a new spectral regularizer for representation learning that encourages black-box adversarial robustness in downstream classification tasks.
We show that this method is more effective in boosting test accuracy and robustness than previously-proposed methods that regularize all layers of the network.
arXiv Detail & Related papers (2024-05-27T14:01:42Z)
- Multilayer Multiset Neuronal Networks -- MMNNs [55.2480439325792]
The present work describes multilayer multiset neuronal networks incorporating two or more layers of coincidence similarity neurons.
The work also explores the utilization of counter-prototype points, which are assigned to the image regions to be avoided.
arXiv Detail & Related papers (2023-08-28T12:55:13Z)
- Manifold Contrastive Learning with Variational Lie Group Operators [5.0741409008225755]
We propose a contrastive learning approach that directly models the latent manifold using Lie group operators parameterized by coefficients with a sparsity-promoting prior.
A variational distribution over these coefficients provides a generative model of the manifold, with samples which provide feature augmentations applicable both during contrastive training and downstream tasks.
arXiv Detail & Related papers (2023-06-23T15:07:01Z)
- Theoretical Characterization of How Neural Network Pruning Affects its Generalization [131.1347309639727]
This work makes the first attempt to study how different pruning fractions affect the model's gradient descent dynamics and generalization.
It is shown that as long as the pruning fraction is below a certain threshold, gradient descent can drive the training loss toward zero.
More surprisingly, the generalization bound gets better as the pruning fraction gets larger.
arXiv Detail & Related papers (2023-01-01T03:10:45Z)
- Explaining Image Classifiers Using Contrastive Counterfactuals in Generative Latent Spaces [12.514483749037998]
We introduce a novel method to generate causal and yet interpretable counterfactual explanations for image classifiers.
We use this framework to obtain contrastive and causal sufficiency and necessity scores as global explanations for black-box classifiers.
arXiv Detail & Related papers (2022-06-10T17:54:46Z)
- Learning Debiased and Disentangled Representations for Semantic Segmentation [52.35766945827972]
We propose a model-agnostic training scheme for semantic segmentation.
By randomly eliminating certain class information in each training iteration, we effectively reduce feature dependencies among classes.
Models trained with our approach demonstrate strong results on multiple semantic segmentation benchmarks.
arXiv Detail & Related papers (2021-10-31T16:15:09Z)
- Deconfounding Scores: Feature Representations for Causal Effect Estimation with Weak Overlap [140.98628848491146]
We introduce deconfounding scores, which induce better overlap without biasing the target of estimation.
We show that deconfounding scores satisfy a zero-covariance condition that is identifiable in observed data.
In particular, we show that this technique could be an attractive alternative to standard regularizations.
arXiv Detail & Related papers (2021-04-12T18:50:11Z)
- Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z)
- Region Comparison Network for Interpretable Few-shot Image Classification [97.97902360117368]
Few-shot image classification has been proposed to effectively use only a limited number of labeled examples to train models for new classes.
We propose a metric learning based method named Region Comparison Network (RCN), which is able to reveal how few-shot learning works.
We also present a new way to generalize the interpretability from the level of tasks to categories.
arXiv Detail & Related papers (2020-09-08T07:29:05Z)
- AL2: Progressive Activation Loss for Learning General Representations in Classification Neural Networks [12.14537824884951]
We propose a novel regularization method that progressively penalizes the magnitude of activations during training (see the sketch after this entry).
Our method's effect on generalization is analyzed with label randomization tests and cumulative ablations.
arXiv Detail & Related papers (2020-03-07T18:38:46Z)
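As a rough illustration of an activation-magnitude penalty of this kind, the hypothetical sketch below adds an L2 term on hidden activations to the classification loss, with a weight that ramps up over training. It is not the AL2 formulation from the paper above; the penalty form, schedule, and model are assumptions made for illustration.

```python
# Hypothetical sketch of a progressive activation-magnitude penalty
# (illustrative only; not the AL2 formulation from the paper above).
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(64, 1, 28, 28)   # stand-in mini-batch (MNIST-shaped)
y = torch.randint(0, 10, (64,))

num_epochs = 10
for epoch in range(num_epochs):
    # Penalty weight ramps up over training (the "progressive" schedule is assumed).
    lam = 1e-4 * (epoch + 1) / num_epochs

    hidden = model[2](model[1](model[0](x)))   # post-ReLU hidden activations
    logits = model[3](hidden)
    loss = nn.functional.cross_entropy(logits, y) + lam * hidden.pow(2).mean()

    opt.zero_grad()
    loss.backward()
    opt.step()
```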
This list is automatically generated from the titles and abstracts of the papers on this site.