On Calibration of Mixup Training for Deep Neural Networks
- URL: http://arxiv.org/abs/2003.09946v4
- Date: Thu, 28 Jan 2021 11:39:19 GMT
- Title: On Calibration of Mixup Training for Deep Neural Networks
- Authors: Juan Maroñas and Daniel Ramos and Roberto Paredes
- Abstract summary: We argue and provide empirical evidence that, due to its fundamentals, Mixup does not necessarily improve calibration.
Our loss is inspired by Bayes decision theory and introduces a new training framework for designing losses for probabilistic modelling.
We provide state-of-the-art accuracy with consistent improvements in calibration performance.
- Score: 1.6242924916178283
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Neural Networks (DNN) represent the state of the art in many tasks.
However, due to their overparameterization, their generalization capabilities
remain in doubt and are still a field under study. Consequently, DNNs can
overfit and assign overconfident predictions -- effects that have been shown
to harm the calibration of the confidences assigned to unseen data. Data
Augmentation (DA) strategies have been proposed to regularize these models,
with Mixup being one of the most popular due to its ability to improve the
accuracy, uncertainty quantification, and calibration of DNNs. In this work,
however, we argue and provide empirical evidence that, due to its
fundamentals, Mixup does not necessarily improve calibration. Based on our
observations, we propose a new loss function that improves the calibration,
and sometimes also the accuracy, of DNNs trained with this DA technique. Our
loss is inspired by Bayes decision
theory and introduces a new training framework for designing losses for
probabilistic modelling. We provide state-of-the-art accuracy with consistent
improvements in calibration performance. Appendix and code are provided here:
https://github.com/jmaronas/calibration_MixupDNN_ARCLoss.pytorch.git
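As context for the abstract above, here is a minimal sketch of standard Mixup augmentation: convex combinations of paired inputs and their labels, with the mixing weight drawn from a Beta distribution. The function name, the alpha value, and the use of PyTorch are illustrative assumptions, not details taken from the paper:

```python
import torch

def mixup_batch(x, y_onehot, alpha=0.3):
    """Mix a batch with a shuffled copy of itself.

    x:        (B, ...) inputs; y_onehot: (B, C) one-hot or soft labels.
    """
    lam = torch.distributions.Beta(alpha, alpha).sample()  # lambda ~ Beta(alpha, alpha)
    perm = torch.randperm(x.size(0))                       # random pairing within the batch
    x_mix = lam * x + (1.0 - lam) * x[perm]                # interpolate inputs
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]  # interpolate labels the same way
    return x_mix, y_mix
```

Because the targets become soft mixtures, the training loss rewards less confident outputs, which is the mechanism through which Mixup interacts with confidence and, hence, calibration.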
Related papers
- Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection [58.789823426981044]
We propose a novel auxiliary loss formulation that aims to align the class confidence of bounding boxes with the accuracy of predictions.
Our results reveal that our train-time loss surpasses strong calibration baselines in reducing calibration error in both in-domain and out-of-domain scenarios.
arXiv Detail & Related papers (2023-03-25T08:56:21Z)
- DOMINO: Domain-aware Model Calibration in Medical Image Segmentation [51.346121016559024]
Modern deep neural networks are poorly calibrated, compromising trustworthiness and reliability.
We propose DOMINO, a domain-aware model calibration method that leverages the semantic confusability and hierarchical similarity between class labels.
Our results show that DOMINO-calibrated deep neural networks outperform non-calibrated models and state-of-the-art morphometric methods in head image segmentation.
arXiv Detail & Related papers (2022-09-13T15:31:52Z)
- An Underexplored Dilemma between Confidence and Calibration in Quantized Neural Networks [0.0]
Modern convolutional neural networks (CNNs) are known to be overconfident in terms of their calibration on unseen input data.
This is undesirable if the probabilities predicted are to be used for downstream decision making.
We show that the robustness of CNNs to quantization can be partially explained by their calibration behavior, and may be improved with overconfidence.
arXiv Detail & Related papers (2021-11-10T14:37:16Z)
- Improving Uncertainty Calibration of Deep Neural Networks via Truth Discovery and Geometric Optimization [22.57474734944132]
We propose a truth discovery framework to integrate ensemble-based and post-hoc calibration methods.
On large-scale datasets including CIFAR and ImageNet, our method shows consistent improvement against state-of-the-art calibration approaches.
arXiv Detail & Related papers (2021-06-25T06:44:16Z)
- On the Dark Side of Calibration for Modern Neural Networks [65.83956184145477]
We show the breakdown of expected calibration error (ECE) into predicted confidence and refinement.
We highlight that regularisation-based calibration focuses only on naively reducing a model's confidence.
We find that many calibration approaches, such as label smoothing and mixup, lower the utility of a DNN by degrading its refinement.
arXiv Detail & Related papers (2021-06-17T11:04:14Z)
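Since the entry above and several others report calibration error, here is a minimal sketch of the standard binned expected calibration error (ECE): a weighted average of per-bin gaps between accuracy and mean confidence. The number of bins and the PyTorch-style names are illustrative assumptions:

```python
import torch

def expected_calibration_error(probs, labels, n_bins=15):
    """probs: (N, C) softmax outputs; labels: (N,) integer targets."""
    conf, pred = probs.max(dim=1)                 # top-1 confidence and prediction
    correct = pred.eq(labels).float()
    edges = torch.linspace(0.0, 1.0, n_bins + 1)  # equal-width confidence bins
    ece = torch.zeros(())
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)       # samples falling in this bin
        if in_bin.any():
            gap = correct[in_bin].mean() - conf[in_bin].mean()
            ece += in_bin.float().mean() * gap.abs()  # |B|/N * |acc(B) - conf(B)|
    return ece.item()
```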
- S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration [74.5509794733707]
We present a novel guided learning paradigm that distills knowledge from real-valued networks into binary networks over the final prediction distribution.
Our proposed method can boost the simple contrastive learning baseline by an absolute gain of 5.515% on BNNs.
Our method achieves substantial improvement over the simple contrastive learning baseline, and is even comparable to many mainstream supervised BNN methods.
arXiv Detail & Related papers (2021-02-17T18:59:28Z)
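The S2-BNN summary above describes distilling a real-valued teacher into a binary student over the final prediction distribution. The sketch below shows a generic KL-based distillation term of that kind; the temperature parameter and the function names are assumptions, not the paper's exact recipe:

```python
import torch
import torch.nn.functional as F

def distribution_distill_loss(student_logits, teacher_logits, T=1.0):
    """KL divergence pulling the student's class distribution towards the teacher's."""
    log_p_student = F.log_softmax(student_logits / T, dim=1)  # student log-probabilities
    p_teacher = F.softmax(teacher_logits / T, dim=1)          # teacher probabilities
    # 'batchmean' averages the per-example KL; T^2 rescales the softened gradients
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)
```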
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
- Calibrating Deep Neural Networks using Focal Loss [77.92765139898906]
Miscalibration is a mismatch between a model's confidence and its correctness.
We show that focal loss allows us to learn models that are already very well calibrated.
We show that our approach achieves state-of-the-art calibration without compromising on accuracy in almost all cases.
arXiv Detail & Related papers (2020-02-21T17:35:50Z)
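For reference, the focal loss in the entry above down-weights the cross-entropy of examples the model already classifies confidently, FL(p_t) = -(1 - p_t)^gamma * log(p_t). Below is a minimal multi-class sketch; the gamma value and function names are illustrative assumptions, not the paper's tuned settings:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=3.0):
    """logits: (N, C) raw scores; targets: (N,) integer class labels."""
    log_p = F.log_softmax(logits, dim=1)
    log_pt = log_p.gather(1, targets.unsqueeze(1)).squeeze(1)  # log p_t of the true class
    pt = log_pt.exp()                                          # p_t
    return ((1.0 - pt).pow(gamma) * -log_pt).mean()            # down-weight easy examples
```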