A Stitch in Time Saves Nine: A Train-Time Regularizing Loss for Improved
Neural Network Calibration
- URL: http://arxiv.org/abs/2203.13834v1
- Date: Fri, 25 Mar 2022 18:02:13 GMT
- Title: A Stitch in Time Saves Nine: A Train-Time Regularizing Loss for Improved
Neural Network Calibration
- Authors: Ramya Hebbalaguppe, Jatin Prakash, Neelabh Madan, Chetan Arora
- Abstract summary: We propose a novel auxiliary loss function: Multi-class Difference in Confidence and Accuracy (MDCA).
We show that training with MDCA leads to better-calibrated models in terms of Expected Calibration Error (ECE) and Static Calibration Error (SCE) on image classification and segmentation tasks.
- Score: 12.449806152650657
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Neural Networks (DNNs) are known to make overconfident mistakes, which makes their use problematic in safety-critical applications. State-of-the-art (SOTA) calibration techniques improve only the confidence of the predicted label and leave the confidence of non-max classes (e.g. top-2, top-5) uncalibrated. Such calibration is not suitable for label refinement using post-processing. Further, most SOTA techniques learn a few hyper-parameters post-hoc, leaving out the scope for image- or pixel-specific calibration. This makes them unsuitable for calibration under domain shift, or for dense prediction tasks like semantic segmentation. In this paper, we argue for intervening at train time itself, so as to directly produce calibrated DNN models. We propose a novel auxiliary loss function, Multi-class Difference in Confidence and Accuracy (MDCA), to achieve this. MDCA can be used in conjunction with other application/task-specific loss functions. We show that training with MDCA leads to better-calibrated models in terms of Expected Calibration Error (ECE) and Static Calibration Error (SCE) on image classification and segmentation tasks. We report an ECE (SCE) score of 0.72 (1.60) on the CIFAR-100 dataset, compared to 1.90 (1.71) by the SOTA. Under domain shift, a ResNet-18 model trained on the PACS dataset using MDCA gives an average ECE (SCE) score of 19.7 (9.7) across all domains, compared to 24.2 (11.8) by the SOTA. For the segmentation task, we report a 2X reduction in calibration error on the PASCAL-VOC dataset in comparison to Focal Loss. Finally, MDCA training improves calibration even on imbalanced data and for natural language classification tasks. Code is available at https://github.com/mdca-loss
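
Since the abstract only names MDCA, the following is a minimal sketch, in PyTorch, of an MDCA-style auxiliary term: for each class it penalizes the gap between the batch-mean predicted confidence and the batch-mean empirical frequency of that class, and it is added to an ordinary task loss. The function names and the weighting factor beta are illustrative assumptions, not the authors' reference implementation (see the linked repository for that).

import torch
import torch.nn.functional as F

def mdca_style_loss(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    # Per-class |mean confidence - mean label frequency| over the mini-batch,
    # averaged across classes.
    probs = F.softmax(logits, dim=1)                       # (N, K) confidences
    one_hot = F.one_hot(targets, probs.shape[1]).float()   # (N, K) labels
    avg_conf = probs.mean(dim=0)                           # per-class mean confidence
    avg_acc = one_hot.mean(dim=0)                          # per-class label frequency
    return (avg_conf - avg_acc).abs().mean()

def total_loss(logits, targets, beta=1.0):
    # MDCA is auxiliary: it is added to any task-specific loss (cross-entropy here).
    return F.cross_entropy(logits, targets) + beta * mdca_style_loss(logits, targets)

Because the term only uses softmax outputs and one-hot labels, it is differentiable and can be dropped into an existing training loop without changing the model.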
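
For context on the ECE (SCE) numbers quoted above, below is a hedged sketch of the standard binned definitions of the two metrics; the equal-width bins and the bin count of 15 are conventional choices, not parameters taken from this paper.

import torch
import torch.nn.functional as F

def expected_calibration_error(logits, targets, n_bins=15):
    # Top-label ECE: bin samples by their maximum confidence, then take the
    # bin-weighted average of |accuracy - mean confidence|.
    probs = F.softmax(logits, dim=1)
    conf, pred = probs.max(dim=1)
    correct = pred.eq(targets).float()
    edges = torch.linspace(0.0, 1.0, n_bins + 1)
    ece = torch.zeros(())
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            ece += in_bin.float().mean() * (correct[in_bin].mean() - conf[in_bin].mean()).abs()
    return ece

def static_calibration_error(logits, targets, n_bins=15):
    # SCE: the same binned gap, but computed per class over every class
    # probability (not just the predicted one), then averaged over classes.
    probs = F.softmax(logits, dim=1)
    labels = F.one_hot(targets, probs.shape[1]).float()
    edges = torch.linspace(0.0, 1.0, n_bins + 1)
    sce = torch.zeros(())
    for c in range(probs.shape[1]):
        conf_c, lab_c = probs[:, c], labels[:, c]
        for lo, hi in zip(edges[:-1], edges[1:]):
            in_bin = (conf_c > lo) & (conf_c <= hi)
            if in_bin.any():
                sce += in_bin.float().mean() * (lab_c[in_bin].mean() - conf_c[in_bin].mean()).abs()
    return sce / probs.shape[1]

SCE covers the non-max class confidences that, as the abstract argues, are left uncalibrated by methods that only adjust the top-label confidence.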
Related papers
- Rethinking Early Stopping: Refine, Then Calibrate [49.966899634962374]
We show that calibration error and refinement error are not minimized simultaneously during training.
We introduce a new metric for early stopping and hyperparameter tuning that makes it possible to minimize refinement error during training.
Our method integrates seamlessly with any architecture and consistently improves performance across diverse classification tasks.
arXiv Detail & Related papers (2025-01-31T15:03:54Z)
- ForeCal: Random Forest-based Calibration for DNNs [0.0]
We propose ForeCal, a novel post-hoc calibration algorithm based on random forests.
ForeCal exploits two unique properties of random forests: the ability to enforce weak monotonicity and range preservation.
We show that ForeCal outperforms existing methods in terms of Expected Calibration Error (ECE) with minimal impact on the discriminative power of the base model, as measured by AUC.
arXiv Detail & Related papers (2024-09-04T04:56:41Z)
- Average Calibration Error: A Differentiable Loss for Improved Reliability in Image Segmentation [17.263160921956445]
We propose to use marginal L1 average calibration error (mL1-ACE) as a novel auxiliary loss function to improve pixel-wise calibration without compromising segmentation quality.
We show that this loss, despite using hard binning, is directly differentiable, bypassing the need for approximate but differentiable surrogate or soft binning approaches.
arXiv Detail & Related papers (2024-03-11T14:31:03Z)
- Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection [58.789823426981044]
We propose a novel auxiliary loss formulation that aims to align the class confidence of bounding boxes with the accuracy of predictions.
Our results reveal that our train-time loss surpasses strong calibration baselines in reducing calibration error in both in-domain and out-of-domain scenarios.
arXiv Detail & Related papers (2023-03-25T08:56:21Z)
- DOMINO: Domain-aware Model Calibration in Medical Image Segmentation [51.346121016559024]
Modern deep neural networks are poorly calibrated, compromising trustworthiness and reliability.
We propose DOMINO, a domain-aware model calibration method that leverages the semantic confusability and hierarchical similarity between class labels.
Our results show that DOMINO-calibrated deep neural networks outperform non-calibrated models and state-of-the-art morphometric methods in head image segmentation.
arXiv Detail & Related papers (2022-09-13T15:31:52Z)
- Sample-dependent Adaptive Temperature Scaling for Improved Calibration [95.7477042886242]
A popular post-hoc approach to compensate for neural networks being wrong with high confidence is to perform temperature scaling.
We propose to predict a different temperature value for each input, allowing us to adjust the mismatch between confidence and accuracy (a minimal sketch of this idea appears after this list).
We test our method on the ResNet50 and WideResNet28-10 architectures using the CIFAR10/100 and Tiny-ImageNet datasets.
arXiv Detail & Related papers (2022-07-13T14:13:49Z)
- On the Dark Side of Calibration for Modern Neural Networks [65.83956184145477]
We show the breakdown of expected calibration error (ECE) into predicted confidence and refinement.
We highlight that regularisation-based calibration only focuses on naively reducing a model's confidence.
We find that many calibration approaches, such as label smoothing and mixup, lower the utility of a DNN by degrading its refinement.
arXiv Detail & Related papers (2021-06-17T11:04:14Z)
- Transferable Calibration with Lower Bias and Variance in Domain Adaptation [139.4332115349543]
Domain Adaptation (DA) enables transferring a learning machine from a labeled source domain to an unlabeled target one.
How to estimate the predictive uncertainty of DA models is vital for decision-making in safety-critical scenarios.
The proposed TransCal method can be easily applied to recalibrate existing DA methods.
arXiv Detail & Related papers (2020-07-16T11:09:36Z)
- On Calibration of Mixup Training for Deep Neural Networks [1.6242924916178283]
We argue and provide empirical evidence that, due to its fundamentals, Mixup does not necessarily improve calibration.
Our loss is inspired by Bayes decision theory and introduces a new training framework for designing losses for probabilistic modelling.
We provide state-of-the-art accuracy with consistent improvements in calibration performance.
arXiv Detail & Related papers (2020-03-22T16:54:31Z)
- Calibrating Deep Neural Networks using Focal Loss [77.92765139898906]
Miscalibration is a mismatch between a model's confidence and its correctness.
We show that focal loss allows us to learn models that are already very well calibrated.
We show that our approach achieves state-of-the-art calibration without compromising on accuracy in almost all cases.
arXiv Detail & Related papers (2020-02-21T17:35:50Z)
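
As referenced in the Sample-dependent Adaptive Temperature Scaling entry above, here is a minimal sketch of per-input temperature scaling: a small auxiliary network predicts one positive temperature per sample and rescales the logits before the softmax. The predictor architecture and its use of the classifier's logits as input are illustrative assumptions, not the design of that paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SampleWiseTemperature(nn.Module):
    # Predicts one positive temperature per input and rescales the logits.

    def __init__(self, num_classes: int, hidden: int = 64):
        super().__init__()
        # Small, hypothetical predictor operating on the classifier's logits.
        self.temp_net = nn.Sequential(
            nn.Linear(num_classes, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
            nn.Softplus(),  # keeps the temperature strictly positive
        )

    def forward(self, logits: torch.Tensor) -> torch.Tensor:
        t = self.temp_net(logits) + 1e-3   # (N, 1) per-sample temperature
        return F.softmax(logits / t, dim=1)

Unlike standard temperature scaling, which fits a single global scalar post-hoc, this variant lets the correction vary from sample to sample.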