Soft Calibration Objectives for Neural Networks
- URL: http://arxiv.org/abs/2108.00106v1
- Date: Fri, 30 Jul 2021 23:30:20 GMT
- Title: Soft Calibration Objectives for Neural Networks
- Authors: Archit Karandikar, Nicholas Cain, Dustin Tran, Balaji
Lakshminarayanan, Jonathon Shlens, Michael C. Mozer, Becca Roelofs
- Abstract summary: We propose differentiable losses to improve calibration based on a soft (continuous) version of the binning operation underlying popular calibration-error estimators.
When incorporated into training, these soft calibration losses achieve state-of-the-art single-model ECE across multiple datasets with less than 1% decrease in accuracy.
- Score: 40.03050811956859
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Optimal decision making requires that classifiers produce uncertainty
estimates consistent with their empirical accuracy. However, deep neural
networks are often under- or over-confident in their predictions. Consequently,
methods have been developed to improve the calibration of their predictive
uncertainty both during training and post-hoc. In this work, we propose
differentiable losses to improve calibration based on a soft (continuous)
version of the binning operation underlying popular calibration-error
estimators. When incorporated into training, these soft calibration losses
achieve state-of-the-art single-model ECE across multiple datasets with less
than 1% decrease in accuracy. For instance, we observe an 82% reduction in ECE
(70% relative to the post-hoc rescaled ECE) in exchange for a 0.7% relative
decrease in accuracy relative to the cross entropy baseline on CIFAR-100. When
incorporated post-training, the soft-binning-based calibration error objective
improves upon temperature scaling, a popular recalibration method. Overall,
experiments across losses and datasets demonstrate that using
calibration-sensitive procedures yields better uncertainty estimates under
dataset shift than the standard practice of using a cross entropy loss and
post-hoc recalibration methods.
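The core construction can be sketched compactly: the hard bin assignment inside the usual binned ECE estimator is replaced by a smooth membership function, so the resulting error estimate is differentiable in the predicted confidences and can be added to the training objective. The snippet below is a minimal illustration of that idea under stated assumptions, not the authors' reference implementation; the squared-distance softmax membership, the 15 equal-width bins, the temperature, and the norm p are all illustrative choices.

```python
# Minimal sketch of a soft-binned calibration-error penalty (illustrative,
# not the paper's reference code). Hard bin assignment is replaced with a
# softmax over negative squared distances to fixed bin centers.
import torch

def soft_binned_ece(confidences, correctness, num_bins=15, temperature=0.01, p=2):
    """Differentiable surrogate for the binned expected calibration error.

    confidences: shape (N,) top-1 predicted probabilities (requires grad).
    correctness: shape (N,) 1.0 where the top-1 prediction is correct, else 0.0.
    """
    centers = (torch.arange(num_bins, dtype=confidences.dtype,
                            device=confidences.device) + 0.5) / num_bins
    # Soft membership of each confidence in each bin; rows sum to 1.
    dist2 = (confidences.unsqueeze(1) - centers.unsqueeze(0)) ** 2   # (N, M)
    membership = torch.softmax(-dist2 / temperature, dim=1)          # (N, M)

    bin_mass = membership.sum(dim=0) + 1e-8                          # (M,)
    bin_conf = (membership * confidences.unsqueeze(1)).sum(dim=0) / bin_mass
    bin_acc = (membership * correctness.unsqueeze(1)).sum(dim=0) / bin_mass
    weights = bin_mass / confidences.shape[0]

    # Mass-weighted L^p gap between per-bin accuracy and confidence.
    return (weights * (bin_acc - bin_conf).abs() ** p).sum() ** (1.0 / p)
```

In training, such a term would typically be added to the primary loss with a small weight, e.g. `loss = cross_entropy + beta * soft_binned_ece(conf, correct)`, where `beta` is a hypothetical trade-off hyperparameter tuned on a validation set.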
Related papers
- Calibrating Deep Neural Network using Euclidean Distance [5.675312975435121]
In machine learning, Focal Loss is commonly used to reduce misclassification rates by emphasizing hard-to-classify samples.
High calibration error indicates a misalignment between predicted probabilities and actual outcomes, affecting model reliability.
This research introduces a novel loss function called Focal Calibration Loss (FCL), designed to improve probability calibration while retaining the advantages of Focal Loss in handling difficult samples.
arXiv Detail & Related papers (2024-10-23T23:06:50Z) - Feature Clipping for Uncertainty Calibration [24.465567005078135]
Modern deep neural networks (DNNs) often suffer from overconfidence, leading to miscalibration.
We propose a novel post-hoc calibration method called feature clipping (FC) to address this issue.
FC involves clipping feature values to a specified threshold, effectively increasing entropy in high calibration error samples.
arXiv Detail & Related papers (2024-10-16T06:44:35Z) - A Confidence Interval for the $\ell_2$ Expected Calibration Error [35.88784957918326]
We develop confidence intervals for the $\ell_2$ Expected Calibration Error (ECE).
We consider top-1-to-$k$ calibration, which includes both the popular notion of confidence calibration and more general notions of calibration.
For a debiased estimator of the ECE, we show asymptotic normality, but with different convergence rates and variances for calibrated and miscalibrated models.
arXiv Detail & Related papers (2024-08-16T20:00:08Z) - Calibration by Distribution Matching: Trainable Kernel Calibration
Metrics [56.629245030893685]
We introduce kernel-based calibration metrics that unify and generalize popular forms of calibration for both classification and regression.
These metrics admit differentiable sample estimates, making it easy to incorporate a calibration objective into empirical risk minimization.
We provide intuitive mechanisms to tailor calibration metrics to a decision task, and enforce accurate loss estimation and no-regret decisions.
arXiv Detail & Related papers (2023-10-31T06:19:40Z) - Calibration of Neural Networks [77.34726150561087]
This paper presents a survey of confidence calibration problems in the context of neural networks.
We analyze problem statement, calibration definitions, and different approaches to evaluation.
Empirical experiments cover various datasets and models, comparing calibration methods according to different criteria.
arXiv Detail & Related papers (2023-03-19T20:27:51Z) - Sample-dependent Adaptive Temperature Scaling for Improved Calibration [95.7477042886242]
A widely used post-hoc approach to compensate for neural networks being wrong with high confidence is to perform temperature scaling (a minimal sketch of this baseline appears after the related-papers list below).
We propose to predict a different temperature value for each input, allowing us to adjust the mismatch between confidence and accuracy.
We test our method on the ResNet50 and WideResNet28-10 architectures using the CIFAR10/100 and Tiny-ImageNet datasets.
arXiv Detail & Related papers (2022-07-13T14:13:49Z) - Parameterized Temperature Scaling for Boosting the Expressive Power in
Post-Hoc Uncertainty Calibration [57.568461777747515]
We introduce a novel calibration method, Parameterized Temperature Scaling (PTS).
We demonstrate that the performance of accuracy-preserving state-of-the-art post-hoc calibrators is limited by their intrinsic expressive power.
We show with extensive experiments that our novel accuracy-preserving approach consistently outperforms existing algorithms across a large number of model architectures, datasets and metrics.
arXiv Detail & Related papers (2021-02-24T10:18:30Z) - Localized Calibration: Metrics and Recalibration [133.07044916594361]
We propose the local calibration error (LCE), a fine-grained calibration metric that spans the gap between fully global and fully individualized calibration.
We then introduce a localized recalibration method, LoRe, that improves the LCE more than existing recalibration methods.
arXiv Detail & Related papers (2021-02-22T07:22:12Z) - Mitigating Bias in Calibration Error Estimation [28.46667300490605]
We introduce a simulation framework that allows us to empirically show that ECE_bin can systematically underestimate or overestimate the true calibration error.
We propose a simple alternative calibration error metric, ECE_sweep, in which the number of bins is chosen to be as large as possible.
arXiv Detail & Related papers (2020-12-15T23:28:06Z)
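Several of the entries above modify or extend temperature scaling, the post-hoc baseline that the main paper also compares against. As a point of reference, the sketch below shows the basic single-scalar variant under simple assumptions (one temperature fit by minimizing NLL with Adam on held-out logits); it is an illustration, not code from any of the papers listed, and the sample-dependent and parameterized methods above replace the single scalar with an input-dependent temperature.

```python
# Minimal post-hoc temperature-scaling sketch (illustrative assumptions:
# a single scalar T fit with Adam on held-out logits by minimizing NLL).
import torch
import torch.nn.functional as F

def fit_temperature(val_logits, val_labels, steps=200, lr=0.01):
    """Return a scalar temperature T > 0 fit on validation logits/labels."""
    log_t = torch.zeros(1, requires_grad=True)   # parameterize T = exp(log_t) > 0
    optimizer = torch.optim.Adam([log_t], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = F.cross_entropy(val_logits / log_t.exp(), val_labels)
        loss.backward()
        optimizer.step()
    return log_t.exp().item()

# Usage: dividing logits by T rescales confidence without changing the
# argmax, so accuracy is preserved while calibration can improve.
# T = fit_temperature(val_logits, val_labels)
# calibrated_probs = torch.softmax(test_logits / T, dim=1)
```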