AdaFocal: Calibration-aware Adaptive Focal Loss
- URL: http://arxiv.org/abs/2211.11838v2
- Date: Fri, 16 Jun 2023 23:30:57 GMT
- Title: AdaFocal: Calibration-aware Adaptive Focal Loss
- Authors: Arindam Ghosh, Thomas Schaaf, Matthew R. Gormley
- Abstract summary: Training with focal loss leads to better calibration than cross-entropy.
We propose a calibration-aware adaptive focal loss called AdaFocal.
- Score: 8.998525155518836
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Much recent work has been devoted to the problem of ensuring that a neural
network's confidence scores match the true probability of being correct, i.e.
the calibration problem. Of note, it was found that training with focal loss
leads to better calibration than cross-entropy while achieving a similar level of
accuracy \cite{mukhoti2020}. This success stems from focal loss regularizing
the entropy of the model's prediction (controlled by the parameter $\gamma$),
thereby reining in the model's overconfidence. Further improvement is expected
if $\gamma$ is selected independently for each training sample
(Sample-Dependent Focal Loss (FLSD-53) \cite{mukhoti2020}). However, FLSD-53 is
based on heuristics and does not generalize well. In this paper, we propose a
calibration-aware adaptive focal loss called AdaFocal that utilizes the
calibration properties of focal (and inverse-focal) loss and adaptively
modifies $\gamma_t$ for different groups of samples based on $\gamma_{t-1}$
from the previous step and knowledge of the model's under/over-confidence on
the validation set. We evaluate AdaFocal on various image recognition tasks and one
NLP task, covering a wide variety of network architectures, to confirm the
improvement in calibration while achieving similar levels of accuracy.
Additionally, we show that models trained with AdaFocal achieve a significant
boost in out-of-distribution detection.
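The abstract's core mechanism can be made concrete with a short sketch. The following is a minimal illustration of the adaptive-γ idea only, not the authors' exact algorithm: the exponential update rule, the clamping range, and all names (`focal_loss`, `update_gammas`, `lam`) are assumptions made for this example, and the paper's switch into the inverse-focal regime for strongly under-confident groups is omitted.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma):
    """Focal loss -(1 - p_t)^gamma * log(p_t), with scalar or per-sample gamma."""
    logp = F.log_softmax(logits, dim=-1)
    logp_t = logp.gather(1, targets.unsqueeze(1)).squeeze(1)
    p_t = logp_t.exp()
    return (-((1.0 - p_t) ** gamma) * logp_t).mean()

def update_gammas(gammas, val_conf, val_acc, lam=1.0, g_min=0.1, g_max=20.0):
    """Bin-wise update sketch: gamma_t = gamma_{t-1} * exp(lam * (conf - acc)).

    Bins where the validation set is over-confident (conf > acc) get a larger
    gamma (stronger entropy regularization); under-confident bins get a
    smaller one. AdaFocal additionally moves strongly under-confident groups
    to an inverse-focal loss, which this sketch leaves out.
    """
    gap = val_conf - val_acc                      # per-bin calibration gap
    return (gammas * torch.exp(lam * gap)).clamp(g_min, g_max)
```

During training, each example would be assigned the γ of the validation-confidence bin its current confidence falls into, so the per-group γ values track the model's calibration as it evolves.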
Related papers
- Calibrating Deep Neural Network using Euclidean Distance [5.675312975435121]
In machine learning, Focal Loss is commonly used to reduce misclassification rates by emphasizing hard-to-classify samples.
High calibration error indicates a misalignment between predicted probabilities and actual outcomes, affecting model reliability.
This research introduces a novel loss function called Focal Calibration Loss (FCL), designed to improve probability calibration while retaining the advantages of Focal Loss in handling difficult samples.
arXiv Detail & Related papers (2024-10-23T23:06:50Z)
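As a rough illustration of the entry above, one plausible reading (an assumption for this sketch, not the paper's verified formulation) combines the focal term with a Euclidean-distance penalty between the softmax output and the one-hot label; the weight `lam` and the function name are invented here.

```python
import torch
import torch.nn.functional as F

def focal_calibration_loss(logits, targets, gamma=2.0, lam=1.0):
    # Focal term keeps the emphasis on hard samples; the Euclidean term
    # pulls predicted probabilities toward the label vector, which is the
    # calibration-oriented piece suggested by the paper's title.
    logp = F.log_softmax(logits, dim=-1)
    p = logp.exp()
    logp_t = logp.gather(1, targets.unsqueeze(1)).squeeze(1)
    focal = -((1.0 - logp_t.exp()) ** gamma) * logp_t
    one_hot = F.one_hot(targets, num_classes=logits.size(-1)).float()
    euclid = (p - one_hot).norm(dim=-1)
    return (focal + lam * euclid).mean()
```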
- Feature Clipping for Uncertainty Calibration [24.465567005078135]
Modern deep neural networks (DNNs) often suffer from overconfidence, leading to miscalibration.
We propose a novel post-hoc calibration method called feature clipping (FC) to address this issue.
FC clips feature values at a specified threshold, effectively increasing the predictive entropy of samples with high calibration error.
arXiv Detail & Related papers (2024-10-16T06:44:35Z)
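A minimal post-hoc sketch of the clipping idea above; the symmetric clamp and the name `feature_clipping` are assumptions, and the threshold `c` would be tuned on held-out data.

```python
import torch

@torch.no_grad()
def feature_clipping(features, classifier, c=1.0):
    # Clip penultimate-layer activations, then reuse the frozen final layer.
    # Bounded features cap the logit magnitudes, which softens the softmax
    # and raises predictive entropy for over-confident inputs.
    return classifier(features.clamp(min=-c, max=c))
```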
- Optimizing Calibration by Gaining Aware of Prediction Correctness [30.619608580138802]
Cross-Entropy (CE) loss is widely used for calibrator training; it pushes the model to increase confidence on the ground-truth class.
We propose a new post-hoc calibration objective derived from the aim of calibration: a model's confidence should reflect whether its prediction is correct.
arXiv Detail & Related papers (2024-04-19T17:25:43Z)
- Proximity-Informed Calibration for Deep Neural Networks [49.330703634912915]
ProCal is a plug-and-play algorithm with a theoretical guarantee to adjust sample confidence based on proximity.
We show that ProCal is effective in addressing proximity bias and improving calibration on balanced, long-tail, and distribution-shift settings.
arXiv Detail & Related papers (2023-06-07T16:40:51Z)
- Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection [58.789823426981044]
We propose a novel auxiliary loss formulation that aims to align the class confidence of bounding boxes with the accuracy of predictions.
Our results reveal that our train-time loss surpasses strong calibration baselines in reducing calibration error for both in and out-domain scenarios.
arXiv Detail & Related papers (2023-03-25T08:56:21Z)
- Beyond calibration: estimating the grouping loss of modern neural networks [68.8204255655161]
Proper scoring rule theory shows that given the calibration loss, the missing piece to characterize individual errors is the grouping loss.
We show that modern neural network architectures in vision and NLP exhibit grouping loss, notably in distribution-shift settings.
arXiv Detail & Related papers (2022-10-28T07:04:20Z)
- Sample-dependent Adaptive Temperature Scaling for Improved Calibration [95.7477042886242]
A common post-hoc approach to compensating for neural network miscalibration is temperature scaling.
We propose to predict a different temperature value for each input, allowing us to adjust the mismatch between confidence and accuracy.
We test our method on the ResNet50 and WideResNet28-10 architectures using the CIFAR10/100 and Tiny-ImageNet datasets.
arXiv Detail & Related papers (2022-07-13T14:13:49Z)
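A sketch of the sample-dependent idea described above (the head architecture and all names are assumptions for illustration): a small network predicts a positive temperature for each input, generalizing the single global T of standard temperature scaling.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SampleTemperature(nn.Module):
    """Predicts one temperature per input; calibrated logits are logits / T."""

    def __init__(self, feat_dim, hidden=64):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, logits, features):
        t = F.softplus(self.head(features)) + 1e-3  # keep T strictly positive
        return logits / t                            # broadcasts over classes
```

As in ordinary temperature scaling, the head would be fit with NLL on a held-out set while the base network stays frozen, so accuracy is untouched and only confidence changes.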
- Bayesian Confidence Calibration for Epistemic Uncertainty Modelling [4.358626952482686]
We introduce a framework that produces confidence estimates together with an uncertainty estimate for the calibration method itself.
We achieve state-of-the-art calibration performance for object detection calibration.
arXiv Detail & Related papers (2021-09-21T10:53:16Z)
- Post-hoc Calibration of Neural Networks by g-Layers [51.42640515410253]
In recent years, there has been a surge of research on neural network calibration.
It is known that minimizing Negative Log-Likelihood (NLL) will lead to a calibrated network on the training set if the global optimum is attained.
We prove that even though the base network ($f$) does not lead to the global optimum of NLL, by adding additional layers ($g$) and minimizing NLL by optimizing the parameters of $g$ one can obtain a calibrated network.
arXiv Detail & Related papers (2020-06-23T07:55:10Z)
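A sketch of the g-layers recipe described above, with hypothetical names: freeze the trained base network f, append extra layers g, and minimize NLL over g's parameters alone on held-out data.

```python
import torch
import torch.nn.functional as F

def fit_g_layers(f, g, loader, epochs=10, lr=1e-3):
    # Only g's parameters move; f stays frozen, so g only reshapes the
    # outputs of the already-trained base network.
    for p in f.parameters():
        p.requires_grad_(False)
    opt = torch.optim.Adam(g.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            loss = F.cross_entropy(g(f(x)), y)  # NLL of the composition g(f(x))
            opt.zero_grad()
            loss.backward()
            opt.step()
    return g
```

Temperature scaling is the special case where g divides the logits by a single learned scalar.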
- Calibrating Deep Neural Networks using Focal Loss [77.92765139898906]
Miscalibration is a mismatch between a model's confidence and its correctness.
We show that focal loss allows us to learn models that are already very well calibrated.
We show that our approach achieves state-of-the-art calibration without compromising on accuracy in almost all cases.
arXiv Detail & Related papers (2020-02-21T17:35:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.