New Interpretations of Normalization Methods in Deep Learning
- URL: http://arxiv.org/abs/2006.09104v1
- Date: Tue, 16 Jun 2020 12:26:13 GMT
- Title: New Interpretations of Normalization Methods in Deep Learning
- Authors: Jiacheng Sun, Xiangyong Cao, Hanwen Liang, Weiran Huang, Zewei Chen,
Zhenguo Li
- Abstract summary: We use these tools to conduct a deep analysis of popular normalization methods.
Most of the normalization methods can be interpreted in a unified framework.
We prove that training with these normalization methods can make the norm of weights increase, which could cause adversarial vulnerability as it amplifies the attack.
- Score: 41.29746794151102
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, a variety of normalization methods have been proposed to
help train neural networks, such as batch normalization (BN), layer
normalization (LN), weight normalization (WN), group normalization (GN), etc.
However, mathematical tools to analyze all these normalization methods are
lacking. In this paper, we first propose a lemma to define some necessary
tools. Then, we use these tools to conduct a deep analysis of popular
normalization methods and obtain the following conclusions: 1) Most of the
normalization methods can be interpreted in a unified framework, namely
normalizing pre-activations or weights onto a sphere; 2) Since most of the
existing normalization methods are scaling invariant, we can conduct
optimization on a sphere with scaling symmetry removed, which can help
stabilize the training of the network; 3) We prove that training with these
normalization methods can make the norm of weights increase, which could cause
adversarial vulnerability as it amplifies the attack. Finally, a series of
experiments are conducted to verify these claims.
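As a concrete illustration of these conclusions, the following is a minimal sketch (assuming PyTorch; the helper `bn_preact` is hypothetical and not taken from the paper's code or experiments). It shows numerically that batch-normalizing the pre-activations makes a linear layer invariant to rescaling of its weights, that the gradient of a scale-invariant loss is orthogonal to the weights, and that a gradient step therefore cannot shrink the weight norm.

```python
# Minimal sketch (assumes PyTorch; not the authors' code). Illustrates:
# 1) scale invariance of batch-normalized pre-activations,
# 2) <w, grad> = 0 for a scale-invariant loss (0-homogeneity in w),
# 3) hence ||w - lr*g||^2 = ||w||^2 + lr^2*||g||^2 >= ||w||^2.
import torch

torch.manual_seed(0)
x = torch.randn(64, 10)                     # a batch of inputs
y = torch.randn(64, 5)                      # an arbitrary regression target
w = torch.randn(10, 5, requires_grad=True)  # linear-layer weights (no bias)

def bn_preact(x, w, eps=1e-8):
    """Linear layer followed by batch normalization of the pre-activations
    (no learnable affine), i.e. pre-activations mapped onto a sphere."""
    z = x @ w
    return (z - z.mean(dim=0)) / (z.std(dim=0, unbiased=False) + eps)

# 1) Rescaling the weights leaves the normalized output unchanged,
#    so only the direction of w matters (a point on a sphere).
with torch.no_grad():
    print(torch.allclose(bn_preact(x, w), bn_preact(x, 7.3 * w), atol=1e-5))  # True

# 2) For a scale-invariant loss, the gradient is orthogonal to the weights.
loss = ((bn_preact(x, w) - y) ** 2).mean()
loss.backward()
g = w.grad
print(torch.dot(w.detach().flatten(), g.flatten()))  # ~0

# 3) Therefore a gradient-descent step can only grow the weight norm.
with torch.no_grad():
    print(((w - 0.1 * g).norm() >= w.norm()).item())  # True
```

The same orthogonality argument applies to the other scale-invariant normalizations (LN, GN, WN): plain gradient descent can only increase the weight norm, which is the mechanism the abstract links to adversarial vulnerability.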
Related papers
- Preconditioning for Accelerated Gradient Descent Optimization and Regularization [2.306205094107543]
We explain how preconditioning with AdaGrad, RMSProp, and Adam accelerates training.
We demonstrate how normalization methods accelerate training by improving Hessian conditioning.
arXiv Detail & Related papers (2024-09-30T20:58:39Z)
- Enhancing Neural Network Representations with Prior Knowledge-Based Normalization [0.07499722271664146]
We introduce a new approach to multi-mode normalization that leverages prior knowledge to improve neural network representations.
Our methods demonstrate superior convergence and performance across tasks in image classification, domain adaptation, and image generation.
arXiv Detail & Related papers (2024-03-25T14:17:38Z)
- A Revisit of the Normalized Eight-Point Algorithm and A Self-Supervised Deep Solution [45.10109739084541]
We revisit the normalized eight-point algorithm and show that different and better normalization algorithms exist.
We introduce a deep convolutional neural network with a self-supervised learning strategy for normalization.
Our learning-based normalization module can be integrated with both traditional (e.g., RANSAC) and deep learning frameworks.
arXiv Detail & Related papers (2023-04-21T06:41:17Z)
- Subquadratic Overparameterization for Shallow Neural Networks [60.721751363271146]
We provide an analytical framework that allows us to adopt standard neural training strategies.
We achieve the desiderata via Polyak-Lojasiewicz, smoothness, and standard assumptions.
arXiv Detail & Related papers (2021-11-02T20:24:01Z)
- Squared $\ell_2$ Norm as Consistency Loss for Leveraging Augmented Data to Learn Robust and Invariant Representations [76.85274970052762]
Regularizing the distance between the embeddings/representations of original samples and their augmented counterparts is a popular technique for improving the robustness of neural networks.
In this paper, we explore these various regularization choices, seeking to provide a general understanding of how we should regularize the embeddings.
We show that the generic approach we identified (squared $\ell_2$ regularized augmentation) outperforms several recent methods, which are each specially designed for one task.
arXiv Detail & Related papers (2020-11-25T22:40:09Z)
- Normalization Techniques in Training DNNs: Methodology, Analysis and Application [111.82265258916397]
Normalization techniques are essential for accelerating the training and improving the generalization of deep neural networks (DNNs).
This paper reviews and comments on the past, present, and future of normalization methods in the context of training DNNs.
arXiv Detail & Related papers (2020-09-27T13:06:52Z)
- Density Fixing: Simple yet Effective Regularization Method based on the Class Prior [2.3859169601259347]
We propose a framework of regularization methods, called density-fixing, that can be used for both supervised and semi-supervised learning.
Our proposed regularization method improves the generalization performance by forcing the model to approximate the class's prior distribution or the frequency of occurrence.
arXiv Detail & Related papers (2020-07-08T04:58:22Z)
- Optimization Theory for ReLU Neural Networks Trained with Normalization Layers [82.61117235807606]
The success of deep neural networks is in part due to the use of normalization layers.
Our analysis shows how the introduction of normalization changes the landscape and can enable faster convergence.
arXiv Detail & Related papers (2020-06-11T23:55:54Z)
- Regularizing Meta-Learning via Gradient Dropout [102.29924160341572]
Meta-learning models are prone to overfitting when there are not enough training tasks for the meta-learners to generalize.
We introduce a simple yet effective method to alleviate the risk of overfitting for gradient-based meta-learning.
arXiv Detail & Related papers (2020-04-13T10:47:02Z)