Confidence Calibration with Bounded Error Using Transformations
- URL: http://arxiv.org/abs/2102.12680v1
- Date: Thu, 25 Feb 2021 04:40:27 GMT
- Title: Confidence Calibration with Bounded Error Using Transformations
- Authors: Sooyong Jang, Radoslav Ivanov, Insup Lee, and James Weimer
- Abstract summary: We introduce Hoki, a novel calibration algorithm with a theoretical bound on the calibration error (CE).
We show that Hoki generally outperforms state-of-the-art calibration algorithms across multiple datasets and models.
In addition, Hoki is a fast algorithm, comparable to temperature scaling in terms of learning time.
- Score: 4.278591555984394
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As machine learning techniques become widely adopted in new domains,
especially in safety-critical systems such as autonomous vehicles, it is
crucial to provide accurate output uncertainty estimation. As a result, many
approaches have been proposed to calibrate neural networks to accurately
estimate the likelihood of misclassification. However, while these methods
achieve low expected calibration error (ECE), few techniques provide
theoretical performance guarantees on the calibration error (CE). In this
paper, we introduce Hoki, a novel calibration algorithm with a theoretical
bound on the CE. Hoki works by transforming the neural network logits and/or
inputs and recursively performing calibration leveraging the information from
the corresponding change in the output. We provide a PAC-like bound on the CE
that is shown to decrease with the number of samples used for calibration, and
to increase proportionally with the ECE and the number of discrete bins used to
calculate ECE. We perform experiments on multiple datasets, including ImageNet,
and show that the proposed approach generally outperforms state-of-the-art
calibration algorithms across multiple datasets and models - providing nearly
an order of magnitude improvement in ECE on ImageNet. In addition, Hoki is a
fast algorithm, comparable to temperature scaling in terms of learning time.
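Since the bound above grows with the ECE and with the number of bins used to compute it, it may help to recall how the standard binned ECE estimator works: predictions are grouped by confidence, and the per-bin gap between mean accuracy and mean confidence is averaged with bin weights. A minimal sketch of that estimator (an illustration of the common definition, not code from the paper):

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=15):
    """Binned ECE: weighted mean of |accuracy - confidence| over equal-width bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    n = len(confidences)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # half-open bins (lo, hi]; softmax confidences are strictly positive,
        # so nothing is lost at the left edge of the first bin
        mask = (confidences > lo) & (confidences <= hi)
        if not mask.any():
            continue
        acc = correct[mask].mean()
        conf = confidences[mask].mean()
        ece += (mask.sum() / n) * abs(acc - conf)
    return ece
```

A perfectly calibrated batch (e.g. confidence 0.8 with 80% of predictions correct) yields an ECE of 0, while overconfident predictions inflate it.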
Related papers
- Calibrating Bayesian Learning via Regularization, Confidence Minimization, and Selective Inference [37.82259435084825]
A well-calibrated AI model must correctly report its accuracy on in-distribution (ID) inputs, while also enabling the detection of out-of-distribution (OOD) inputs.
This paper proposes an extension of variational inference (VI)-based Bayesian learning that integrates calibration regularization for improved ID performance.
arXiv Detail & Related papers (2024-04-17T13:08:26Z)
- Cal-DETR: Calibrated Detection Transformer [67.75361289429013]
We propose a mechanism for calibrated detection transformers (Cal-DETR), particularly for Deformable-DETR, UP-DETR and DINO.
We develop an uncertainty-guided logit modulation mechanism that leverages the uncertainty to modulate the class logits.
Results corroborate the effectiveness of Cal-DETR against the competing train-time methods in calibrating both in-domain and out-domain detections.
arXiv Detail & Related papers (2023-11-06T22:13:10Z)
- Calibration of Neural Networks [77.34726150561087]
This paper presents a survey of confidence calibration problems in the context of neural networks.
We analyze problem statement, calibration definitions, and different approaches to evaluation.
Empirical experiments cover various datasets and models, comparing calibration methods according to different criteria.
arXiv Detail & Related papers (2023-03-19T20:27:51Z)
- Sharp Calibrated Gaussian Processes [58.94710279601622]
State-of-the-art approaches for designing calibrated models rely on inflating the Gaussian process posterior variance.
We present a calibration approach that generates predictive quantiles using a computation inspired by the vanilla Gaussian process posterior variance.
Our approach is shown to yield a calibrated model under reasonable assumptions.
arXiv Detail & Related papers (2023-02-23T12:17:36Z)
- Sample-dependent Adaptive Temperature Scaling for Improved Calibration [95.7477042886242]
A common post-hoc approach to compensating for miscalibrated neural networks is temperature scaling.
We propose to predict a different temperature value for each input, allowing us to adjust the mismatch between confidence and accuracy.
We test our method on the ResNet50 and WideResNet28-10 architectures using the CIFAR10/100 and Tiny-ImageNet datasets.
arXiv Detail & Related papers (2022-07-13T14:13:49Z)
- Soft Calibration Objectives for Neural Networks [40.03050811956859]
We propose differentiable losses to improve calibration based on a soft (continuous) version of the binning operation underlying popular calibration-error estimators.
When incorporated into training, these soft calibration losses achieve state-of-the-art single-model ECE across multiple datasets with less than 1% decrease in accuracy.
arXiv Detail & Related papers (2021-07-30T23:30:20Z)
- Parameterized Temperature Scaling for Boosting the Expressive Power in Post-Hoc Uncertainty Calibration [57.568461777747515]
We introduce a novel calibration method, Parametrized Temperature Scaling (PTS).
We demonstrate that the performance of accuracy-preserving state-of-the-art post-hoc calibrators is limited by their intrinsic expressive power.
We show with extensive experiments that our novel accuracy-preserving approach consistently outperforms existing algorithms across a large number of model architectures, datasets and metrics.
arXiv Detail & Related papers (2021-02-24T10:18:30Z)
- Mitigating Bias in Calibration Error Estimation [28.46667300490605]
We introduce a simulation framework that allows us to empirically show that ECE_bin can systematically underestimate or overestimate the true calibration error.
We propose a simple alternative calibration error metric, ECE_sweep, in which the number of bins is chosen to be as large as possible.
arXiv Detail & Related papers (2020-12-15T23:28:06Z)
- Multi-Class Uncertainty Calibration via Mutual Information Maximization-based Binning [8.780958735684958]
Post-hoc multi-class calibration is a common approach for providing confidence estimates of deep neural network predictions.
Recent work has shown that widely used scaling methods underestimate their calibration error.
We propose a shared class-wise (sCW) calibration strategy, sharing one calibrator among similar classes.
arXiv Detail & Related papers (2020-06-23T15:31:59Z)
- Calibrating Deep Neural Networks using Focal Loss [77.92765139898906]
Miscalibration is a mismatch between a model's confidence and its correctness.
We show that focal loss allows us to learn models that are already very well calibrated.
We show that our approach achieves state-of-the-art calibration without compromising on accuracy in almost all cases.
arXiv Detail & Related papers (2020-02-21T17:35:50Z)
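Several of the entries above, like Hoki itself, use temperature scaling as the baseline for both calibration quality and learning time: a single scalar T is fitted on a held-out set to minimize the negative log-likelihood of the scaled logits. A rough sketch of that baseline (using a grid search for illustration; implementations typically optimize T by gradient descent, and the helper names here are hypothetical):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(logits, labels, T):
    """Mean negative log-likelihood of the true labels under logits / T."""
    probs = softmax(logits / T)
    return -np.log(probs[np.arange(len(labels)), labels] + 1e-12).mean()

def fit_temperature(logits, labels, grid=np.linspace(0.5, 5.0, 91)):
    """Pick the temperature minimizing held-out NLL (grid-search sketch)."""
    return min(grid, key=lambda T: nll(logits, labels, T))
```

For an overconfident model the fitted T exceeds 1, softening the predicted probabilities; since T rescales all logits equally, the argmax prediction, and hence accuracy, is unchanged.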
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.