Dual-Branch Temperature Scaling Calibration for Long-Tailed Recognition
- URL: http://arxiv.org/abs/2308.08366v1
- Date: Wed, 16 Aug 2023 13:40:58 GMT
- Title: Dual-Branch Temperature Scaling Calibration for Long-Tailed Recognition
- Authors: Jialin Guo, Zhenyu Wu, Zhiqiang Zhan, Yang Ji
- Abstract summary: This paper proposes a dual-branch temperature scaling calibration model (Dual-TS).
It simultaneously considers the diversity of temperature parameters across categories and the non-generalizability of temperature parameters for rare samples in minority classes.
Our model achieves state-of-the-art results under both the traditional ECE and Esbin-ECE metrics.
- Score: 19.12557383199547
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The calibration of deep neural networks is currently receiving widespread
attention and research. Miscalibration usually leads to overconfidence of the
model. Under a long-tailed data distribution, miscalibration is even more
prominent because samples in minority and majority categories have different
confidence levels, resulting in more severe overconfidence. To address this
problem, some recent studies have designed diverse temperature coefficients for
different categories based on the temperature scaling (TS) method. However,
when samples in minority classes are rare, the temperature coefficient does not
generalize, and there is a large gap between the temperature coefficients
estimated on the training set and on the validation set. To solve this
challenge, this paper proposes a dual-branch temperature scaling calibration
model (Dual-TS), which simultaneously considers the diversity of temperature
parameters across categories and the non-generalizability of temperature
parameters for rare samples in minority classes. Moreover, we notice that the
traditional calibration evaluation metric, Expected Calibration Error (ECE),
gives higher weight to low-confidence samples in the minority classes, which
leads to inaccurate evaluation of model calibration. Therefore, we also propose
Equal Sample Bin Expected Calibration Error (Esbin-ECE) as a new calibration
evaluation metric. Through experiments, we demonstrate that our model achieves
state-of-the-art performance under both the traditional ECE and Esbin-ECE
metrics.
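The sketch below illustrates, in plain NumPy, the two ingredients the abstract discusses: class-wise temperature scaling (one learned temperature per class, the starting point that Dual-TS refines) and Expected Calibration Error computed with equal-width bins versus equal-sample bins (the spirit of the proposed Esbin-ECE). The function names, the coordinate-wise grid search, and the binning details are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def classwise_temperature_scale(logits, temps):
    """Divide each logit column by its class-specific temperature."""
    return softmax(logits / temps[None, :])

def fit_classwise_temperatures(logits, labels, grid=np.linspace(0.5, 5.0, 46)):
    """Toy coordinate-wise grid search: pick each class's temperature to
    minimise NLL on a held-out split (a stand-in for learned parameters)."""
    n_classes = logits.shape[1]
    temps = np.ones(n_classes)
    for c in range(n_classes):
        best_t, best_nll = 1.0, np.inf
        for t in grid:
            trial = temps.copy()
            trial[c] = t
            probs = classwise_temperature_scale(logits, trial)
            nll = -np.log(probs[np.arange(len(labels)), labels] + 1e-12).mean()
            if nll < best_nll:
                best_t, best_nll = t, nll
        temps[c] = best_t
    return temps

def ece(probs, labels, n_bins=15, equal_sample=False):
    """Traditional ECE uses equal-width confidence bins; equal_sample=True
    puts the same number of samples in every bin (the Esbin-ECE idea)."""
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    if equal_sample:
        edges = np.quantile(conf, np.linspace(0, 1, n_bins + 1))
    else:
        edges = np.linspace(0, 1, n_bins + 1)
    total, err = len(conf), 0.0
    for i in range(n_bins):
        lo, hi = edges[i], edges[i + 1]
        if i == n_bins - 1:
            mask = (conf >= lo) & (conf <= hi)  # include the top edge
        else:
            mask = (conf >= lo) & (conf < hi)
        if mask.sum() == 0:
            continue
        err += (mask.sum() / total) * abs(conf[mask].mean() - correct[mask].mean())
    return err
```

Comparing `ece(probs, labels)` with `ece(probs, labels, equal_sample=True)` on a long-tailed validation split illustrates how the choice of binning changes the weight given to the sparse, low-confidence region where minority-class samples tend to fall.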
Related papers
- Calibrating Language Models with Adaptive Temperature Scaling [58.056023173579625]
We introduce Adaptive Temperature Scaling (ATS), a post-hoc calibration method that predicts a temperature scaling parameter for each token prediction.
ATS improves calibration by 10-50% across three downstream natural language evaluation benchmarks compared to prior calibration methods.
arXiv Detail & Related papers (2024-09-29T22:54:31Z) - A Confidence Interval for the $\ell_2$ Expected Calibration Error [35.88784957918326]
We develop confidence intervals for the $\ell_2$ Expected Calibration Error (ECE).
We consider top-1-to-$k$ calibration, which includes the popular notion of confidence calibration as well as more general notions of calibration.
For a debiased estimator of the ECE, we show asymptotic normality, but with different convergence rates and variances for calibrated and miscalibrated models.
arXiv Detail & Related papers (2024-08-16T20:00:08Z) - On the Limitations of Temperature Scaling for Distributions with
Overlaps [8.486166869140929]
We show that, for empirical risk minimizers over a general set of distributions, the performance of temperature scaling degrades with the amount of overlap between classes.
We prove that optimizing a modified form of the empirical risk induced by the Mixup data augmentation technique can in fact lead to reasonably good calibration performance.
arXiv Detail & Related papers (2023-06-01T14:35:28Z) - Calibration of Neural Networks [77.34726150561087]
This paper presents a survey of confidence calibration problems in the context of neural networks.
We analyze problem statement, calibration definitions, and different approaches to evaluation.
Empirical experiments cover various datasets and models, comparing calibration methods according to different criteria.
arXiv Detail & Related papers (2023-03-19T20:27:51Z) - Enabling Calibration In The Zero-Shot Inference of Large Vision-Language
Models [58.720142291102135]
We measure calibration across relevant variables like prompt, dataset, and architecture, and find that zero-shot inference with CLIP is miscalibrated.
A single learned temperature generalizes for each specific CLIP model across inference datasets and prompt choices.
arXiv Detail & Related papers (2023-03-11T17:14:04Z) - Variable-Based Calibration for Machine Learning Classifiers [11.9995808096481]
We introduce the notion of variable-based calibration to characterize calibration properties of a model.
We find that models with near-perfect expected calibration error can exhibit significant miscalibration as a function of features of the data.
arXiv Detail & Related papers (2022-09-30T00:49:31Z) - Sample-dependent Adaptive Temperature Scaling for Improved Calibration [95.7477042886242]
A post-hoc approach to compensating for neural networks being wrong is to perform temperature scaling.
We propose to predict a different temperature value for each input, allowing us to adjust the mismatch between confidence and accuracy (a minimal sketch of this idea appears after this list).
We test our method on the ResNet50 and WideResNet28-10 architectures using the CIFAR10/100 and Tiny-ImageNet datasets.
arXiv Detail & Related papers (2022-07-13T14:13:49Z) - Revisiting Calibration for Question Answering [16.54743762235555]
We argue that the traditional evaluation of calibration does not reflect the usefulness of the model's confidence.
We propose a new calibration metric, MacroCE, that better captures whether the model assigns low confidence to wrong predictions and high confidence to correct predictions.
arXiv Detail & Related papers (2022-05-25T05:49:56Z) - Parameterized Temperature Scaling for Boosting the Expressive Power in
Post-Hoc Uncertainty Calibration [57.568461777747515]
We introduce a novel calibration method, Parametrized Temperature Scaling (PTS).
We demonstrate that the performance of accuracy-preserving state-of-the-art post-hoc calibrators is limited by their intrinsic expressive power.
We show with extensive experiments that our novel accuracy-preserving approach consistently outperforms existing algorithms across a large number of model architectures, datasets and metrics.
arXiv Detail & Related papers (2021-02-24T10:18:30Z) - Uncertainty Quantification and Deep Ensembles [79.4957965474334]
We show that deep-ensembles do not necessarily lead to improved calibration properties.
We show that standard ensembling methods, when used in conjunction with modern techniques such as mixup regularization, can lead to less calibrated models.
This paper examines the interplay between three of the simplest and most commonly used approaches to leveraging deep learning when data is scarce.
arXiv Detail & Related papers (2020-07-17T07:32:24Z)
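Several of the entries above (Adaptive Temperature Scaling, Sample-dependent Adaptive Temperature Scaling) predict an input-dependent temperature rather than a single global one. The PyTorch sketch below shows that general idea under stated assumptions: a small head, trained post hoc on held-out logits, outputs one positive temperature per sample. The architecture, hyperparameters, and training loop are illustrative and not taken from any of the papers listed.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemperaturePredictor(nn.Module):
    """Predicts one positive temperature per input from its (frozen) logits."""
    def __init__(self, n_classes, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_classes, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, logits):
        # softplus keeps the temperature strictly positive; the offset avoids
        # division by values close to zero
        t = F.softplus(self.net(logits)) + 1e-2
        return logits / t  # broadcast: (N, C) / (N, 1)

def calibrate(val_logits, val_labels, n_classes, epochs=200, lr=1e-3):
    """Post-hoc calibration: the base classifier stays fixed, only the head is trained."""
    head = TemperaturePredictor(n_classes)
    opt = torch.optim.Adam(head.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = F.cross_entropy(head(val_logits), val_labels)
        loss.backward()
        opt.step()
    return head

# Usage sketch on synthetic data:
# logits, labels = torch.randn(1000, 10), torch.randint(0, 10, (1000,))
# head = calibrate(logits, labels, n_classes=10)
# calibrated_probs = head(logits).softmax(dim=-1)
```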