New Interpretations of Normalization Methods in Deep Learning
- URL: http://arxiv.org/abs/2006.09104v1
- Date: Tue, 16 Jun 2020 12:26:13 GMT
- Title: New Interpretations of Normalization Methods in Deep Learning
- Authors: Jiacheng Sun, Xiangyong Cao, Hanwen Liang, Weiran Huang, Zewei Chen,
Zhenguo Li
- Abstract summary: We use these tools to conduct a deep analysis of popular normalization methods.
Most of the normalization methods can be interpreted in a unified framework.
We prove that training with these normalization methods can make the norm of weights increase, which could cause adversarial vulnerability as it amplifies the attack.
- Score: 41.29746794151102
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, a variety of normalization methods have been proposed to
help train neural networks, such as batch normalization (BN), layer
normalization (LN), weight normalization (WN), group normalization (GN), etc.
However, mathematical tools to analyze all these normalization methods are
lacking. In this paper, we first propose a lemma to define some necessary
tools. Then, we use these tools to conduct a deep analysis of popular
normalization methods and obtain the following conclusions: 1) Most of the
normalization methods can be interpreted in a unified framework, namely
normalizing pre-activations or weights onto a sphere; 2) Since most of the
existing normalization methods are scaling invariant, we can conduct
optimization on a sphere with scaling symmetry removed, which can help
stabilize the training of the network; 3) We prove that training with these
normalization methods can make the norm of weights increase, which could cause
adversarial vulnerability as it amplifies the attack. Finally, a series of
experiments are conducted to verify these claims.
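As a concrete illustration of these conclusions, the following is a minimal sketch (assuming PyTorch; the helper `bn_preact` is hypothetical and not taken from the paper's code or experiments). It shows numerically that batch-normalizing the pre-activations makes a linear layer invariant to rescaling of its weights, that the gradient of a scale-invariant loss is orthogonal to the weights, and that a gradient step therefore cannot shrink the weight norm.

```python
# Minimal sketch (assumes PyTorch; not the authors' code). Illustrates:
# 1) scale invariance of batch-normalized pre-activations,
# 2) <w, grad> = 0 for a scale-invariant loss (0-homogeneity in w),
# 3) hence ||w - lr*g||^2 = ||w||^2 + lr^2*||g||^2 >= ||w||^2.
import torch

torch.manual_seed(0)
x = torch.randn(64, 10)                     # a batch of inputs
y = torch.randn(64, 5)                      # an arbitrary regression target
w = torch.randn(10, 5, requires_grad=True)  # linear-layer weights (no bias)

def bn_preact(x, w, eps=1e-8):
    """Linear layer followed by batch normalization of the pre-activations
    (no learnable affine), i.e. pre-activations mapped onto a sphere."""
    z = x @ w
    return (z - z.mean(dim=0)) / (z.std(dim=0, unbiased=False) + eps)

# 1) Rescaling the weights leaves the normalized output unchanged,
#    so only the direction of w matters (a point on a sphere).
with torch.no_grad():
    print(torch.allclose(bn_preact(x, w), bn_preact(x, 7.3 * w), atol=1e-5))  # True

# 2) For a scale-invariant loss, the gradient is orthogonal to the weights.
loss = ((bn_preact(x, w) - y) ** 2).mean()
loss.backward()
g = w.grad
print(torch.dot(w.detach().flatten(), g.flatten()))  # ~0

# 3) Therefore a gradient-descent step can only grow the weight norm.
with torch.no_grad():
    print(((w - 0.1 * g).norm() >= w.norm()).item())  # True
```

The same orthogonality argument applies to the other scale-invariant normalizations (LN, GN, WN): plain gradient descent can only increase the weight norm, which is the mechanism the abstract links to adversarial vulnerability.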
Related papers
- Preconditioning for Accelerated Gradient Descent Optimization and Regularization [2.306205094107543]
We explain how preconditioning with AdaGrad, RMSProp, and Adam accelerates training.
We demonstrate how normalization methods accelerate training by improving Hessian conditioning.
arXiv Detail & Related papers (2024-09-30T20:58:39Z)
- Enhancing Neural Network Representations with Prior Knowledge-Based Normalization [0.07499722271664146]
We introduce a new approach to multi-mode normalization that leverages prior knowledge to improve neural network representations.
Our methods demonstrate superior convergence and performance across tasks in image classification, domain adaptation, and image generation.
arXiv Detail & Related papers (2024-03-25T14:17:38Z)
- A Revisit of the Normalized Eight-Point Algorithm and A Self-Supervised Deep Solution [45.10109739084541]
We revisit the normalized eight-point algorithm and show that different and better normalization algorithms exist.
We introduce a deep convolutional neural network with a self-supervised learning strategy for normalization.
Our learning-based normalization module can be integrated with both traditional (e.g., RANSAC) and deep learning frameworks.
arXiv Detail & Related papers (2023-04-21T06:41:17Z)
- Subquadratic Overparameterization for Shallow Neural Networks [60.721751363271146]
We provide an analytical framework that allows us to adopt standard neural training strategies.
We achieve the desiderata via Polyak-Lojasiewicz, smoothness, and standard assumptions.
arXiv Detail & Related papers (2021-11-02T20:24:01Z)
- Squared $\ell_2$ Norm as Consistency Loss for Leveraging Augmented Data to Learn Robust and Invariant Representations [76.85274970052762]
Regularizing the distance between the embeddings/representations of original samples and their augmented counterparts is a popular technique for improving the robustness of neural networks.
In this paper, we explore these various regularization choices, seeking to provide a general understanding of how we should regularize the embeddings.
We show that the generic approach we identified (squared $\ell_2$ regularized augmentation) outperforms several recent methods, which are each specially designed for one task.
arXiv Detail & Related papers (2020-11-25T22:40:09Z)
- Normalization Techniques in Training DNNs: Methodology, Analysis and Application [111.82265258916397]
Normalization techniques are essential for accelerating the training and improving the generalization of deep neural networks (DNNs).
This paper reviews and comments on the past, present, and future of normalization methods in the context of training DNNs.
arXiv Detail & Related papers (2020-09-27T13:06:52Z)
- Density Fixing: Simple yet Effective Regularization Method based on the Class Prior [2.3859169601259347]
We propose a framework of regularization methods, called density-fixing, that can be used for both supervised and semi-supervised learning.
Our proposed regularization method improves the generalization performance by forcing the model to approximate the class's prior distribution or the frequency of occurrence.
arXiv Detail & Related papers (2020-07-08T04:58:22Z)
- Optimization Theory for ReLU Neural Networks Trained with Normalization Layers [82.61117235807606]
The success of deep neural networks is in part due to the use of normalization layers.
Our analysis shows how the introduction of normalization changes the landscape and can enable faster convergence.
arXiv Detail & Related papers (2020-06-11T23:55:54Z)
- Regularizing Meta-Learning via Gradient Dropout [102.29924160341572]
Meta-learning models are prone to overfitting when there are not enough training tasks for the meta-learners to generalize.
We introduce a simple yet effective method to alleviate the risk of overfitting for gradient-based meta-learning.
arXiv Detail & Related papers (2020-04-13T10:47:02Z)