Expectation consistency for calibration of neural networks
- URL: http://arxiv.org/abs/2303.02644v2
- Date: Sat, 5 Aug 2023 00:20:36 GMT
- Authors: Lucas Clarté, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová
- Abstract summary: We introduce a novel calibration technique named expectation consistency (EC).
EC enforces that the average validation confidence coincides with the average proportion of correct labels.
We discuss examples where EC significantly outperforms temperature scaling.
- Score: 24.073221004661427
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite their incredible performance, it is well reported that deep neural
networks tend to be overoptimistic about their prediction confidence. Finding
effective and efficient calibration methods for neural networks is therefore an
important endeavour towards better uncertainty quantification in deep learning.
In this manuscript, we introduce a novel calibration technique named
expectation consistency (EC), consisting of a post-training rescaling of the
last layer weights by enforcing that the average validation confidence
coincides with the average proportion of correct labels. First, we show that
the EC method achieves similar calibration performance to temperature scaling
(TS) across different neural network architectures and data sets, all while
requiring comparable validation data and computational resources. However, we
argue that EC provides a principled method grounded on a Bayesian optimality
principle known as the Nishimori identity. Next, we provide an asymptotic
characterization of both TS and EC in a synthetic setting and show that their
performance crucially depends on the target function. In particular, we discuss
examples where EC significantly outperforms TS.
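The EC recipe described in the abstract can be sketched as a simple post-hoc step: find a single scalar rescaling of the validation logits (equivalent to rescaling the last-layer weights) such that the average top-class confidence matches the validation accuracy. The bisection search, the softmax-confidence definition, and the search interval below are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def avg_confidence(logits, scale):
    # Mean top-class probability after rescaling the logits by `scale`.
    return softmax(scale * logits).max(axis=1).mean()

def ec_scale(logits, labels, lo=1e-3, hi=1e3, iters=60):
    """Bisect for a scalar `s` such that the average validation confidence
    equals the average accuracy (the expectation-consistency condition).
    The average top-class probability is increasing in `s`, so a simple
    geometric bisection over [lo, hi] suffices."""
    acc = (logits.argmax(axis=1) == labels).mean()
    for _ in range(iters):
        mid = np.sqrt(lo * hi)  # geometric midpoint of the bracket
        if avg_confidence(logits, mid) < acc:
            lo = mid
        else:
            hi = mid
    return np.sqrt(lo * hi)
```

At test time, predictions would then use `softmax(s * logits)` with the fitted `s`, in the same spirit as applying a fitted temperature in temperature scaling.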
Related papers
- Unrolled denoising networks provably learn optimal Bayesian inference [54.79172096306631]
We prove the first rigorous learning guarantees for neural networks based on unrolling approximate message passing (AMP)
For compressed sensing, we prove that when trained on data drawn from a product prior, the layers of the network converge to the same denoisers used in Bayes AMP.
arXiv Detail & Related papers (2024-09-19T17:56:16Z) - Training-Free Neural Active Learning with Initialization-Robustness Guarantees [27.38525683635627]
We introduce our expected variance with Gaussian processes (EV-GP) criterion for neural active learning.
Our EV-GP criterion is training-free, i.e., it does not require any training of the NN during data selection.
arXiv Detail & Related papers (2023-06-07T14:28:42Z) - A New PHO-rmula for Improved Performance of Semi-Structured Networks [0.0]
We show that techniques to properly identify the contributions of the different model components in SSNs lead to suboptimal network estimation.
We propose a non-invasive post-hocization (PHO) that guarantees identifiability of model components and provides better estimation and prediction quality.
Our theoretical findings are supported by numerical experiments, a benchmark comparison as well as a real-world application to COVID-19 infections.
arXiv Detail & Related papers (2023-06-01T10:23:28Z) - On the optimization and pruning for Bayesian deep learning [1.0152838128195467]
We propose a new adaptive variational Bayesian algorithm to train neural networks on weight space.
The EM-MCMC algorithm allows us to perform optimization and model pruning in one shot.
Our dense model reaches state-of-the-art performance, and our sparse model performs very well compared to previously proposed pruning schemes.
arXiv Detail & Related papers (2022-10-24T05:18:08Z) - NUQ: Nonparametric Uncertainty Quantification for Deterministic Neural Networks [151.03112356092575]
We show the principled way to measure the uncertainty of predictions for a classifier based on Nadaraya-Watson's nonparametric estimate of the conditional label distribution.
We demonstrate the strong performance of the method in uncertainty estimation tasks on a variety of real-world image datasets.
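The Nadaraya-Watson estimate underlying this approach can be sketched as a kernel-weighted average of training labels. This is a generic illustration of the classical estimator, not the NUQ authors' implementation; the Gaussian kernel and the bandwidth are assumptions.

```python
import numpy as np

def nw_label_distribution(x, X_train, Y_onehot, bandwidth=1.0):
    """Nadaraya-Watson estimate of p(y | x): a kernel-weighted average of
    one-hot training labels, with a Gaussian kernel on feature distances."""
    d2 = ((X_train - x) ** 2).sum(axis=1)      # squared distances to the query
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))   # Gaussian kernel weights
    w = w / w.sum()                            # normalize to a probability weight vector
    return w @ Y_onehot                        # estimated class probabilities
```

Predictive uncertainty can then be read off this estimated distribution, for instance via its entropy, or via the total kernel mass near the query point as a density signal.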
arXiv Detail & Related papers (2022-02-07T12:30:45Z) - Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
They are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
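The entropy-raising step mentioned above can be sketched as mixing the predicted class probabilities with the label prior. In the paper this is applied conditionally in overconfident regions; here the mixing weight `alpha` is left as a free parameter, and the uniform prior in the example is an assumption.

```python
import numpy as np

def entropy(p):
    # Shannon entropy in nats; probabilities are clipped so 0 * log 0 -> 0.
    p = np.clip(p, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=-1)

def raise_entropy_towards_prior(probs, prior, alpha):
    """Mix predicted class probabilities with the label prior.
    `alpha` in [0, 1]: 0 keeps the prediction, 1 returns the prior.
    For a uniform prior, concavity of entropy guarantees the mixture is
    never more confident than the original prediction."""
    return (1.0 - alpha) * probs + alpha * prior
```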
arXiv Detail & Related papers (2021-02-22T07:02:37Z) - Robust and integrative Bayesian neural networks for likelihood-free parameter inference [0.0]
State-of-the-art neural network-based methods for learning summary statistics have delivered promising results for simulation-based likelihood-free parameter inference.
This work proposes a robust integrated approach that learns summary statistics using Bayesian neural networks, and directly estimates the posterior density using categorical distributions.
arXiv Detail & Related papers (2021-02-12T13:45:23Z) - Uncertainty-Aware Deep Calibrated Salient Object Detection [74.58153220370527]
Existing deep neural network based salient object detection (SOD) methods mainly focus on pursuing high network accuracy.
These methods overlook the gap between network accuracy and prediction confidence, known as the confidence uncalibration problem.
We introduce an uncertainty-aware deep SOD network and propose two strategies to prevent deep SOD networks from being overconfident.
arXiv Detail & Related papers (2020-12-10T23:28:36Z) - Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z) - Revisiting Initialization of Neural Networks [72.24615341588846]
We propose a rigorous estimation of the global curvature of weights across layers by approximating and controlling the norm of their Hessian matrix.
Our experiments on Word2Vec and the MNIST/CIFAR image classification tasks confirm that tracking the Hessian norm is a useful diagnostic tool.
arXiv Detail & Related papers (2020-04-20T18:12:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.