Sample-dependent Adaptive Temperature Scaling for Improved Calibration
- URL: http://arxiv.org/abs/2207.06211v1
- Date: Wed, 13 Jul 2022 14:13:49 GMT
- Title: Sample-dependent Adaptive Temperature Scaling for Improved Calibration
- Authors: Tom Joy, Francesco Pinto, Ser-Nam Lim, Philip H. S. Torr, Puneet K.
Dokania
- Abstract summary: The most common post-hoc approach to compensating for neural networks that are confidently wrong is temperature scaling.
We propose to predict a different temperature value for each input, allowing us to adjust the mismatch between confidence and accuracy.
We test our method on the ResNet50 and WideResNet28-10 architectures using the CIFAR10/100 and Tiny-ImageNet datasets.
- Score: 95.7477042886242
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: It is now well known that neural networks can be wrong with high confidence
in their predictions, leading to poor calibration. The most common post-hoc
approach to compensate for this is to perform temperature scaling, which
adjusts the confidences of the predictions on any input by scaling the logits
by a fixed value. Whilst this approach typically improves the average
calibration across the whole test dataset, the improvement tends to reduce
the individual confidences of the predictions irrespective of whether the
classification of a given input is correct or incorrect. With this insight, we
base our method on the observation that different samples contribute to the
calibration error by varying amounts, with some needing to increase their
confidence and others needing to decrease it. Therefore, for each input, we
propose to predict a different temperature value, allowing us to adjust the
mismatch between confidence and accuracy at a finer granularity. Furthermore,
we observe improved results on OOD detection and can also extract a notion of
hardness for the data-points. Our method is applied post-hoc to
off-the-shelf pre-trained classifiers, and consequently requires very little
computation time and a negligible memory footprint. We test our method on the
ResNet50 and WideResNet28-10 architectures using the CIFAR10/100 and
Tiny-ImageNet datasets, showing that producing per-data-point temperatures is
also beneficial for the expected calibration error across the whole test set.
Code is available at: https://github.com/thwjoy/adats.
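To make the contrast concrete, below is a minimal PyTorch sketch of both variants: standard temperature scaling, which fits a single scalar T on a held-out validation set, and a per-sample temperature predictor in the spirit of the abstract. The `SampleTemperature` module, its MLP architecture, the plain-NLL training loop, and the `expected_calibration_error` helper are illustrative assumptions, not the authors' implementation; their code is in the linked repository.

```python
# Sketch: global temperature scaling vs. a hypothetical per-sample predictor.
import torch
import torch.nn as nn
import torch.nn.functional as F


def fit_global_temperature(logits, labels, steps=200, lr=0.01):
    """Standard temperature scaling: one scalar T shared by every input,
    fitted by minimizing NLL on a held-out validation set."""
    log_t = torch.zeros(1, requires_grad=True)   # T = exp(log_t) stays positive
    opt = torch.optim.Adam([log_t], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(logits / log_t.exp(), labels).backward()
        opt.step()
    return log_t.exp().item()


class SampleTemperature(nn.Module):
    """Hypothetical per-sample temperature predictor: maps each logit vector
    to its own positive temperature T_i (not the authors' architecture)."""

    def __init__(self, num_classes, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_classes, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, logits):
        return F.softplus(self.net(logits)) + 1e-3  # keep T_i strictly positive


def expected_calibration_error(probs, labels, n_bins=15):
    """Standard ECE: bin predictions by confidence, average |accuracy - confidence|."""
    conf, pred = probs.max(dim=-1)
    edges = torch.linspace(0, 1, n_bins + 1)
    ece = torch.zeros(())
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            acc = (pred[in_bin] == labels[in_bin]).float().mean()
            ece = ece + in_bin.float().mean() * (acc - conf[in_bin].mean()).abs()
    return ece.item()


if __name__ == "__main__":
    # Stand-in validation logits/labels; in practice these come from a
    # pre-trained classifier evaluated on a held-out split.
    val_logits = torch.randn(512, 10)
    val_labels = torch.randint(0, 10, (512,))

    # (a) Global temperature: one scalar for all inputs.
    T = fit_global_temperature(val_logits, val_labels)
    p_global = F.softmax(val_logits / T, dim=-1)

    # (b) Adaptive temperature: one value per input, trained post-hoc.
    # Plain NLL is used here for brevity; the paper's objective may differ.
    predictor = SampleTemperature(num_classes=10)
    opt = torch.optim.Adam(predictor.parameters(), lr=1e-3)
    for _ in range(200):
        opt.zero_grad()
        F.cross_entropy(val_logits / predictor(val_logits), val_labels).backward()
        opt.step()
    with torch.no_grad():
        p_adaptive = F.softmax(val_logits / predictor(val_logits), dim=-1)

    print("ECE (global T):  ", expected_calibration_error(p_global, val_labels))
    print("ECE (adaptive T):", expected_calibration_error(p_adaptive, val_labels))
```

Because dividing logits by a positive temperature never changes the argmax, both variants preserve accuracy; the per-sample version can soften confidently wrong predictions while sharpening correct ones, which is the behaviour the abstract describes.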
Related papers
- Calibrating Language Models with Adaptive Temperature Scaling [58.056023173579625]
We introduce Adaptive Temperature Scaling (ATS), a post-hoc calibration method that predicts a temperature scaling parameter for each token prediction.
ATS improves calibration by over 10-50% across three downstream natural language evaluation benchmarks compared to prior calibration methods.
arXiv Detail & Related papers (2024-09-29T22:54:31Z) - Improving Calibration by Relating Focal Loss, Temperature Scaling, and Properness [1.9055459597116435]
Cross-entropy incentivizes classifiers to produce class probabilities that are well-calibrated on the training data.
We show that focal loss can be decomposed into a confidence-raising transformation and a proper loss; a brief focal-loss sketch appears after this list.
We propose focal temperature scaling - a new post-hoc calibration method combining focal calibration and temperature scaling.
arXiv Detail & Related papers (2024-08-21T13:10:44Z) - Domain-adaptive and Subgroup-specific Cascaded Temperature Regression
for Out-of-distribution Calibration [16.930766717110053]
We propose a novel meta-set-based cascaded temperature regression method for post-hoc calibration.
We partition each meta-set into subgroups based on predicted category and confidence level, capturing diverse uncertainties.
A regression network is then trained to derive category-specific and confidence-level-specific scaling, achieving calibration across meta-sets.
arXiv Detail & Related papers (2024-02-14T14:35:57Z) - Delving into temperature scaling for adaptive conformal prediction [10.340903334800787]
Conformal prediction, as an emerging uncertainty quantification technique, constructs prediction sets that are guaranteed to contain the true label with a pre-defined probability.
We show that current confidence calibration methods (e.g., temperature scaling) normally lead to larger prediction sets in adaptive conformal prediction.
We propose Conformal Temperature Scaling (ConfTS), a variant of temperature scaling that aims to improve the efficiency of adaptive conformal prediction; a split-conformal sketch of how temperature affects set sizes appears after this list.
arXiv Detail & Related papers (2024-02-06T19:27:48Z) - Calibration of Neural Networks [77.34726150561087]
This paper presents a survey of confidence calibration problems in the context of neural networks.
We analyze problem statement, calibration definitions, and different approaches to evaluation.
Empirical experiments cover various datasets and models, comparing calibration methods according to different criteria.
arXiv Detail & Related papers (2023-03-19T20:27:51Z) - Revisiting Calibration for Question Answering [16.54743762235555]
We argue that the traditional evaluation of calibration does not reflect the usefulness of the model's confidence.
We propose a new calibration metric, MacroCE, that better captures whether the model assigns low confidence to wrong predictions and high confidence to correct predictions.
arXiv Detail & Related papers (2022-05-25T05:49:56Z) - Parameterized Temperature Scaling for Boosting the Expressive Power in
Post-Hoc Uncertainty Calibration [57.568461777747515]
We introduce a novel calibration method, Parametrized Temperature Scaling (PTS).
We demonstrate that the performance of accuracy-preserving state-of-the-art post-hoc calibrators is limited by their intrinsic expressive power.
We show with extensive experiments that our novel accuracy-preserving approach consistently outperforms existing algorithms across a large number of model architectures, datasets and metrics.
arXiv Detail & Related papers (2021-02-24T10:18:30Z) - Evaluating Prediction-Time Batch Normalization for Robustness under
Covariate Shift [81.74795324629712]
We evaluate what we call prediction-time batch normalization, which significantly improves model accuracy and calibration under covariate shift; a minimal sketch of the technique appears after this list.
We show that prediction-time batch normalization provides complementary benefits to existing state-of-the-art approaches for improving robustness.
The method has mixed results when used alongside pre-training, and does not seem to perform as well under more natural types of dataset shift.
arXiv Detail & Related papers (2020-06-19T05:08:43Z) - Calibrating Deep Neural Networks using Focal Loss [77.92765139898906]
Miscalibration is a mismatch between a model's confidence and its correctness.
We show that focal loss allows us to learn models that are already very well calibrated.
We show that our approach achieves state-of-the-art calibration without compromising on accuracy in almost all cases.
arXiv Detail & Related papers (2020-02-21T17:35:50Z)
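For the focal-loss entries above, here is a minimal sketch of the standard focal loss, FL = -(1 - p_t)^gamma * log(p_t), checking that gamma = 0 recovers ordinary cross-entropy. The function name and the check are illustrative, not code from either paper.

```python
# Sketch: focal loss and its reduction to cross-entropy at gamma = 0.
import torch
import torch.nn.functional as F


def focal_loss(logits, labels, gamma: float = 2.0):
    """FL = -(1 - p_t)^gamma * log(p_t), averaged over the batch, where p_t is
    the predicted probability of the true class."""
    log_p = F.log_softmax(logits, dim=-1)
    log_p_t = log_p[torch.arange(len(labels)), labels]   # log prob of true class
    p_t = log_p_t.exp()
    return (-((1.0 - p_t) ** gamma) * log_p_t).mean()


if __name__ == "__main__":
    logits, labels = torch.randn(32, 10), torch.randint(0, 10, (32,))
    # With gamma = 0 the weighting term vanishes and focal loss equals the
    # usual (mean-reduced) cross-entropy.
    assert torch.allclose(focal_loss(logits, labels, gamma=0.0),
                          F.cross_entropy(logits, labels), atol=1e-6)
```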
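For the ConfTS entry above, here is a minimal split-conformal sketch using the simple score 1 - (softmax probability of the true class), showing how the temperature applied to the logits changes the average prediction-set size. The random logits and fixed temperatures are stand-in assumptions; ConfTS itself tunes the temperature to improve set efficiency.

```python
# Sketch: split-conformal prediction sets under different temperatures.
import torch
import torch.nn.functional as F


def conformal_sets(cal_logits, cal_labels, test_logits, alpha=0.1, T=1.0):
    """Prediction sets with roughly (1 - alpha) marginal coverage, using the
    nonconformity score 1 - softmax probability of the true class."""
    cal_probs = F.softmax(cal_logits / T, dim=-1)
    scores = 1.0 - cal_probs[torch.arange(len(cal_labels)), cal_labels]
    n = len(scores)
    q_level = min(1.0, (n + 1) * (1 - alpha) / n)   # finite-sample correction
    qhat = torch.quantile(scores, q_level)
    test_probs = F.softmax(test_logits / T, dim=-1)
    return test_probs >= 1.0 - qhat                  # boolean class membership


if __name__ == "__main__":
    cal_logits = torch.randn(500, 10)                # stand-in calibration logits
    cal_labels = torch.randint(0, 10, (500,))
    test_logits = torch.randn(200, 10)
    for T in (0.5, 1.0, 2.0):                        # illustrative temperatures only
        sets = conformal_sets(cal_logits, cal_labels, test_logits, T=T)
        size = sets.float().sum(dim=-1).mean().item()
        print(f"T={T}: average prediction-set size {size:.2f}")
```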
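For the prediction-time batch normalization entry above, here is a minimal PyTorch sketch of the idea: keep the model in eval mode but switch BatchNorm layers back to training mode, so forward passes normalize with the statistics of the incoming test batch. The helper name `use_prediction_time_bn` is an assumption of this sketch, not the paper's code.

```python
# Sketch: use test-batch statistics in BatchNorm layers at prediction time.
import torch
import torch.nn as nn


def use_prediction_time_bn(model: nn.Module) -> nn.Module:
    """Keep the model in eval mode overall, but let BatchNorm layers normalize
    with the statistics of the current test batch instead of running averages.
    Note: train-mode BatchNorm also updates its running statistics."""
    model.eval()                                    # dropout etc. stay in inference mode
    for module in model.modules():
        if isinstance(module, nn.modules.batchnorm._BatchNorm):
            module.train()                          # forward pass uses batch statistics
    return model


if __name__ == "__main__":
    net = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())
    shifted_batch = 2.0 * torch.randn(64, 3, 32, 32) + 1.0   # simulated covariate shift
    with torch.no_grad():
        out = use_prediction_time_bn(net)(shifted_batch)
    print(out.shape)
```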