Mitigating Neural Network Overconfidence with Logit Normalization
- URL: http://arxiv.org/abs/2205.09310v1
- Date: Thu, 19 May 2022 03:45:18 GMT
- Title: Mitigating Neural Network Overconfidence with Logit Normalization
- Authors: Hongxin Wei, Renchunzi Xie, Hao Cheng, Lei Feng, Bo An, Yixuan Li
- Abstract summary: Neural networks produce abnormally high confidence for both in- and out-of-distribution inputs.
We show that this issue can be mitigated through Logit Normalization (LogitNorm).
Our method is motivated by the analysis that the norm of the logits keeps increasing during training, leading to overconfident outputs.
- Score: 37.106755943446515
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Detecting out-of-distribution inputs is critical for safe deployment of
machine learning models in the real world. However, neural networks are known
to suffer from the overconfidence issue, where they produce abnormally high
confidence for both in- and out-of-distribution inputs. In this work, we show
that this issue can be mitigated through Logit Normalization (LogitNorm) -- a
simple fix to the cross-entropy loss -- by enforcing a constant vector norm on
the logits in training. Our method is motivated by the analysis that the norm
of the logits keeps increasing during training, leading to overconfident
outputs. The key idea behind LogitNorm is thus to decouple the influence of the
output's norm during network optimization. Trained with LogitNorm, neural networks
produce highly distinguishable confidence scores between in- and
out-of-distribution data. Extensive experiments demonstrate the superiority of
LogitNorm, reducing the average FPR95 by up to 42.30% on common benchmarks.
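For intuition, the "simple fix" amounts to normalizing the logit vector to a constant norm (scaled by a temperature) before applying cross-entropy. A minimal PyTorch sketch; the temperature value and the epsilon here are illustrative choices, not the paper's tuned settings:

```python
import torch
import torch.nn.functional as F

def logitnorm_loss(logits, labels, tau=0.04, eps=1e-7):
    """Cross-entropy on L2-normalized logits.

    Fixing the logit norm removes the easy way to drive the training
    loss down by inflating logit magnitude, which is what produces
    overconfident outputs. tau is a temperature hyperparameter.
    """
    norms = torch.norm(logits, p=2, dim=-1, keepdim=True) + eps
    return F.cross_entropy(logits / (norms * tau), labels)

# Drop-in usage: loss = logitnorm_loss(model(x), y)
```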
Related papers
- Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Neural network predictions tend to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
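If predictions really do drift toward a constant value on OOD inputs, one practical reading is to treat closeness to that constant as an abstention signal. A hedged sketch, assuming the constant can be approximated by the marginal label distribution of the training set (an assumption, not the paper's exact recipe):

```python
import torch

def distance_to_constant(probs, constant_probs):
    """Distance between softmax outputs and a fixed 'constant' prediction,
    e.g. the marginal training-label distribution. A small distance
    suggests the input is far from the training data, so a risk-sensitive
    system might abstain or fall back to a default action."""
    return torch.norm(probs - constant_probs, dim=-1)  # low => likely OOD
```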
arXiv Detail & Related papers (2023-10-02T03:25:32Z)
- Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z)
- Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
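For context, interval bound propagation, the baseline compared against here, pushes an elementwise input interval through the network layer by layer. A generic sketch for a linear + ReLU block (not the paper's INN-specific reachability analysis):

```python
import torch

def ibp_linear(lower, upper, weight, bias):
    """Propagate the box [lower, upper] through y = x @ weight.T + bias.
    Standard interval arithmetic: split the weight into its positive
    and negative parts, which each act monotonically on the bounds."""
    w_pos, w_neg = weight.clamp(min=0), weight.clamp(max=0)
    new_lower = lower @ w_pos.T + upper @ w_neg.T + bias
    new_upper = upper @ w_pos.T + lower @ w_neg.T + bias
    return new_lower, new_upper

def ibp_relu(lower, upper):
    """ReLU is monotone, so it maps the interval elementwise."""
    return lower.clamp(min=0), upper.clamp(min=0)
```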
arXiv Detail & Related papers (2022-04-01T03:31:27Z)
- Distribution Mismatch Correction for Improved Robustness in Deep Neural Networks [86.42889611784855]
Normalization methods increase vulnerability to noise and input corruptions.
We propose an unsupervised non-parametric distribution correction method that adapts the activation distribution of each layer.
In our experiments, we empirically show that the proposed method effectively reduces the impact of intense image corruptions.
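As a rough illustration of adapting a layer's activation distribution, the sketch below re-standardizes activations to match per-channel statistics recorded on clean data. The paper's method is non-parametric, so this moment-matching version is a deliberate simplification:

```python
import torch

def correct_activations(acts, clean_mean, clean_std, eps=1e-5):
    """Shift and rescale a layer's activations (shape [batch, channels])
    so their mean/std match statistics recorded on clean inputs,
    counteracting the distribution shift caused by corruptions."""
    mean = acts.mean(dim=0, keepdim=True)
    std = acts.std(dim=0, keepdim=True) + eps
    return (acts - mean) / std * clean_std + clean_mean
```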
arXiv Detail & Related papers (2021-10-05T11:36:25Z)
- The Compact Support Neural Network [6.47243430672461]
We present a neuron generalization that has the standard dot-product-based neuron and the RBF neuron as two extreme cases of a shape parameter.
We show how to avoid difficulties in training a neural network with such neurons, by starting with a trained standard neural network and gradually increasing the shape parameter to the desired value.
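To make the two extreme cases concrete, the sketch below blends a dot-product response (alpha = 0) with an RBF-style response based on squared distance (alpha = 1); the exact parameterization in the paper may differ, so treat this only as an illustration of the annealing idea:

```python
import torch

def shape_interpolated_neuron(x, w, alpha):
    """Pre-activation that interpolates between a standard neuron and an
    RBF-like one. Training can start from a trained standard network
    (alpha = 0) and gradually increase alpha toward the desired value."""
    dot = x @ w                        # standard dot-product response
    sq_dist = ((x - w) ** 2).sum(-1)   # RBF-style: largest when x is near w
    return (1 - alpha) * dot - alpha * sq_dist
```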
arXiv Detail & Related papers (2021-04-01T06:08:09Z)
- Performance Bounds for Neural Network Estimators: Applications in Fault Detection [2.388501293246858]
We exploit recent results in quantifying the robustness of neural networks to construct and tune a model-based anomaly detector.
In tuning, we specifically provide upper bounds on the rate of false alarms expected under normal operation.
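A generic way to connect a detection threshold to an expected false-alarm rate is to calibrate on residuals collected under normal operation; the quantile-based sketch below is only a stand-in for the paper's analytic upper bounds:

```python
import numpy as np

def tune_threshold(residuals_normal, alpha=0.01):
    """Choose a threshold so that roughly a fraction alpha of residuals
    observed during normal operation would trigger an alarm."""
    return float(np.quantile(residuals_normal, 1.0 - alpha))

# alarm if abs(y_measured - y_predicted) > tune_threshold(residuals, 0.01)
```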
arXiv Detail & Related papers (2021-03-22T19:23:08Z)
- Non-Singular Adversarial Robustness of Neural Networks [58.731070632586594]
Adversarial robustness has become an emerging challenge for neural networks owing to their over-sensitivity to small input perturbations.
We formalize the notion of non-singular adversarial robustness for neural networks through the lens of joint perturbations to data inputs as well as model weights.
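One simple way to probe such joint perturbations is a single sign-gradient ascent step on both the input and the weights; a generic sketch of that probe, not the paper's formalization:

```python
import torch

def joint_sign_perturbation(model, loss_fn, x, y, eps_x=0.01, eps_w=0.001):
    """Perturb the input and the model weights (in place) by one
    sign-gradient step each, to test sensitivity under joint
    input/weight perturbations."""
    x = x.clone().requires_grad_(True)
    loss = loss_fn(model(x), y)
    grads_w = torch.autograd.grad(loss, list(model.parameters()),
                                  retain_graph=True)
    grad_x, = torch.autograd.grad(loss, x)
    with torch.no_grad():
        for p, g in zip(model.parameters(), grads_w):
            p.add_(eps_w * g.sign())              # weight perturbation
    return (x + eps_x * grad_x.sign()).detach()   # input perturbation
```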
arXiv Detail & Related papers (2021-02-23T20:59:30Z)
- Input Hessian Regularization of Neural Networks [31.941188983286207]
We propose an efficient algorithm to train deep neural networks with Hessian operator-norm regularization.
We show that the new regularizer can, indeed, be feasible and, furthermore, that it increases the robustness of neural networks over input gradient regularization.
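The operator (spectral) norm of the input Hessian can be estimated by power iteration over Hessian-vector products, i.e. double backprop; a minimal PyTorch sketch, with the iteration count as an arbitrary choice:

```python
import torch

def input_hessian_norm(loss, x, iters=5):
    """Estimate the spectral norm of d^2 loss / dx^2 via power iteration.
    `x` must be a leaf tensor with requires_grad=True, and `loss` must be
    computed from it. The returned scalar is differentiable, so it can be
    added to the training loss as a regularizer."""
    grad_x, = torch.autograd.grad(loss, x, create_graph=True)
    v = torch.randn_like(x)
    v = v / v.norm()
    for _ in range(iters):
        hv, = torch.autograd.grad(grad_x, x, grad_outputs=v,
                                  retain_graph=True)       # Hv product
        v = (hv / (hv.norm() + 1e-12)).detach()
    hv, = torch.autograd.grad(grad_x, x, grad_outputs=v, create_graph=True)
    return hv.norm()
```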
arXiv Detail & Related papers (2020-09-14T16:58:16Z)