Related papers: Correct Normalization Matters: Understanding the Effect of Normalization On Deep Neural Network Models For Click-Through Rate Prediction

Correct Normalization Matters: Understanding the Effect of Normalization On Deep Neural Network Models For Click-Through Rate Prediction

URL: http://arxiv.org/abs/2006.12753v2
Date: Tue, 7 Jul 2020 04:19:09 GMT
Title: Correct Normalization Matters: Understanding the Effect of Normalization On Deep Neural Network Models For Click-Through Rate Prediction
Authors: Zhiqiang Wang, Qingyun She, PengTao Zhang, Junlin Zhang
Abstract summary: We propose a new and effective normalization approaches based on LayerNorm named variance only LayerNorm(VO-LN) in this work. We find that the variance of normalization plays the main role and give an explanation in this work.
Score: 3.201333208812837
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Normalization has become one of the most fundamental components in many deep neural networks for machine learning tasks while deep neural network has also been widely used in CTR estimation field. Among most of the proposed deep neural network models, few model utilize normalization approaches. Though some works such as Deep & Cross Network (DCN) and Neural Factorization Machine (NFM) use Batch Normalization in MLP part of the structure, there isn't work to thoroughly explore the effect of the normalization on the DNN ranking systems. In this paper, we conduct a systematic study on the effect of widely used normalization schemas by applying the various normalization approaches to both feature embedding and MLP part in DNN model. Extensive experiments are conduct on three real-world datasets and the experiment results demonstrate that the correct normalization significantly enhances model's performance. We also propose a new and effective normalization approaches based on LayerNorm named variance only LayerNorm(VO-LN) in this work. A normalization enhanced DNN model named NormDNN is also proposed based on the above-mentioned observation. As for the reason why normalization works for DNN models in CTR estimation, we find that the variance of normalization plays the main role and give an explanation in this work.

Related papers

Unsupervised Adaptive Normalization [0.07499722271664146]
Unsupervised Adaptive Normalization (UAN) is an innovative algorithm that seamlessly integrates clustering for normalization with deep neural network learning. UAN outperforms the classical methods by adapting to the target task and is effective in classification, and domain adaptation.
arXiv Detail & Related papers (2024-09-07T08:14:11Z)
Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification. Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z)
Context Normalization Layer with Applications [0.1499944454332829]
This study proposes a new normalization technique, called context normalization, for image data. It adjusts the scaling of features based on the characteristics of each sample, which improves the model's convergence speed and performance. The effectiveness of context normalization is demonstrated on various datasets, and its performance is compared to other standard normalization techniques.
arXiv Detail & Related papers (2023-03-14T06:38:17Z)
Rank-R FNN: A Tensor-Based Learning Model for High-Order Data Classification [69.26747803963907]
Rank-R Feedforward Neural Network (FNN) is a tensor-based nonlinear learning model that imposes Canonical/Polyadic decomposition on its parameters. First, it handles inputs as multilinear arrays, bypassing the need for vectorization, and can thus fully exploit the structural information along every data dimension. We establish the universal approximation and learnability properties of Rank-R FNN, and we validate its performance on real-world hyperspectral datasets.
arXiv Detail & Related papers (2021-04-11T16:37:32Z)
LocalDrop: A Hybrid Regularization for Deep Neural Networks [98.30782118441158]
We propose a new approach for the regularization of neural networks by the local Rademacher complexity called LocalDrop. A new regularization function for both fully-connected networks (FCNs) and convolutional neural networks (CNNs) has been developed based on the proposed upper bound of the local Rademacher complexity.
arXiv Detail & Related papers (2021-03-01T03:10:11Z)
Neuron Coverage-Guided Domain Generalization [37.77033512313927]
This paper focuses on the domain generalization task where domain knowledge is unavailable, and even worse, only samples from a single domain can be utilized during training. Our motivation originates from the recent progresses in deep neural network (DNN) testing, which has shown that maximizing neuron coverage of DNN can help to explore possible defects of DNN.
arXiv Detail & Related papers (2021-02-27T14:26:53Z)
Normalization Techniques in Training DNNs: Methodology, Analysis and Application [111.82265258916397]
Normalization techniques are essential for accelerating the training and improving the generalization of deep neural networks (DNNs) This paper reviews and comments on the past, present and future of normalization methods in the context of training.
arXiv Detail & Related papers (2020-09-27T13:06:52Z)
On Connections between Regularizations for Improving DNN Robustness [67.28077776415724]
This paper analyzes regularization terms proposed recently for improving the adversarial robustness of deep neural networks (DNNs) We study possible connections between several effective methods, including input-gradient regularization, Jacobian regularization, curvature regularization, and a cross-Lipschitz functional.
arXiv Detail & Related papers (2020-07-04T23:43:32Z)
Optimization Theory for ReLU Neural Networks Trained with Normalization Layers [82.61117235807606]
The success of deep neural networks in part due to the use of normalization layers. Our analysis shows how the introduction of normalization changes the landscape and can enable faster activation.
arXiv Detail & Related papers (2020-06-11T23:55:54Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.