When and How Mixup Improves Calibration
- URL: http://arxiv.org/abs/2102.06289v1
- Date: Thu, 11 Feb 2021 22:24:54 GMT
- Title: When and How Mixup Improves Calibration
- Authors: Linjun Zhang, Zhun Deng, Kenji Kawaguchi, James Zou
- Abstract summary: In many machine learning applications, it is important for the model to provide confidence scores that accurately capture its prediction uncertainty.
In this paper, we theoretically prove that Mixup improves calibration in high-dimensional settings by investigating two natural data models.
While incorporating unlabeled data can sometimes make the model less calibrated, adding Mixup training mitigates this issue and provably improves calibration.
- Score: 19.11486078732542
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many machine learning applications, it is important for the model to
provide confidence scores that accurately capture its prediction uncertainty.
Although modern learning methods have achieved great success in predictive
accuracy, generating calibrated confidence scores remains a major challenge.
Mixup, a popular yet simple data augmentation technique based on taking convex
combinations of pairs of training examples, has been empirically found to
significantly improve confidence calibration across diverse applications.
However, when and how Mixup helps calibration is still mysterious. In this
paper, we theoretically prove that Mixup improves calibration in
\textit{high-dimensional} settings by investigating two natural data models on
classification and regression. Interestingly, the calibration benefit of Mixup
increases as the model capacity increases. We support our theories with
experiments on common architectures and data sets. In addition, we study how
Mixup improves calibration in semi-supervised learning. While incorporating
unlabeled data can sometimes make the model less calibrated, adding Mixup
training mitigates this issue and provably improves calibration. Our analysis
provides new insights and a framework to understand Mixup and calibration.
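For readers unfamiliar with the technique, the abstract's description of Mixup as "taking convex combinations of pairs of training examples" corresponds to the standard recipe of Zhang et al. (2018). Below is a minimal sketch of one Mixup training step, assuming PyTorch; the toy model, random batch, and Beta(alpha, alpha) mixing coefficient are illustrative assumptions and do not reproduce the paper's theoretical data models or experiments.

```python
# Minimal sketch of one Mixup training step (standard recipe, assuming PyTorch).
# The toy model, random batch, and Beta(alpha, alpha) coefficient are
# illustrative; they are not the paper's data models or experimental setup.
import torch
import torch.nn.functional as F
from torch import nn

def mixup_batch(x, y, alpha=1.0, num_classes=10):
    """Replace a batch with convex combinations of a shuffled pairing
    of its examples and their one-hot labels."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    x_mix = lam * x + (1 - lam) * x[perm]
    y_soft = F.one_hot(y, num_classes).float()
    y_mix = lam * y_soft + (1 - lam) * y_soft[perm]
    return x_mix, y_mix

# Toy classifier and batch standing in for a real architecture and data loader.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256),
                      nn.ReLU(), nn.Linear(256, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(128, 3, 32, 32)
y = torch.randint(0, 10, (128,))

x_mix, y_mix = mixup_batch(x, y)
logits = model(x_mix)
# Cross-entropy against the mixed (soft) labels.
loss = -(y_mix * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

At test time, the confidence score whose calibration is studied is the maximum softmax probability produced by the trained model.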
Related papers
- Calibrating Large Language Models with Sample Consistency [76.23956851098598]
We explore the potential of deriving confidence from the distribution of multiple randomly sampled model generations, via three measures of consistency.
Results show that consistency-based calibration methods outperform existing post-hoc approaches.
We offer practical guidance on choosing suitable consistency metrics for calibration, tailored to the characteristics of various LMs.
arXiv Detail & Related papers (2024-02-21T16:15:20Z)
- Tailoring Mixup to Data for Calibration [12.050401897136501]
Mixup is a technique for improving calibration and predictive uncertainty.
In this work, we argue that the likelihood of manifold intrusion increases with the distance between data to mix.
We propose to dynamically change the underlying distributions of coefficients depending on the similarity between samples to mix.
arXiv Detail & Related papers (2023-11-02T17:48:28Z)
- Calibration of Neural Networks [77.34726150561087]
This paper presents a survey of confidence calibration problems in the context of neural networks.
We analyze problem statement, calibration definitions, and different approaches to evaluation.
Empirical experiments cover various datasets and models, comparing calibration methods according to different criteria.
arXiv Detail & Related papers (2023-03-19T20:27:51Z)
- A Close Look into the Calibration of Pre-trained Language Models [56.998539510508515]
Pre-trained language models (PLMs) may fail in giving reliable estimates of their predictive uncertainty.
We study the dynamic change in PLMs' calibration performance during training.
We extend two recently proposed learnable methods that directly collect data to train models to produce reasonable confidence estimates.
arXiv Detail & Related papers (2022-10-31T21:31:07Z)
- On the Calibration of Pre-trained Language Models using Mixup Guided by Area Under the Margin and Saliency [47.90235939359225]
We propose a novel mixup strategy for pre-trained language models that improves model calibration further.
Our method achieves the lowest expected calibration error compared to strong baselines on both in-domain and out-of-domain test samples.
arXiv Detail & Related papers (2022-03-14T23:45:08Z)
- On the Dark Side of Calibration for Modern Neural Networks [65.83956184145477]
We show the breakdown of expected calibration error (ECE) into predicted confidence and refinement.
We highlight that regularisation-based calibration only focuses on naively reducing a model's confidence.
We find that many calibration approaches, such as label smoothing and mixup, lower the utility of a DNN by degrading its refinement (a sketch of the standard ECE computation appears after this list).
arXiv Detail & Related papers (2021-06-17T11:04:14Z)
- Combining Ensembles and Data Augmentation can Harm your Calibration [33.94335246681807]
We show a surprising pathology: combining ensembles and data augmentation can harm model calibration.
We propose a simple correction, achieving the best of both worlds with significant accuracy and calibration gains over using only ensembles or data augmentation individually.
arXiv Detail & Related papers (2020-10-19T21:25:22Z)
- Uncertainty Quantification and Deep Ensembles [79.4957965474334]
We show that deep-ensembles do not necessarily lead to improved calibration properties.
We show that standard ensembling methods, when used in conjunction with modern techniques such as mixup regularization, can lead to less calibrated models.
This work examines the interplay between three of the simplest and most commonly used approaches for leveraging deep learning when data is scarce.
arXiv Detail & Related papers (2020-07-17T07:32:24Z)
- Diverse Ensembles Improve Calibration [14.678791405731486]
We propose a simple technique to improve calibration, using a different data augmentation for each ensemble member.
We additionally use the idea of 'mixing' un-augmented and augmented inputs to improve calibration when test and training distributions are the same.
arXiv Detail & Related papers (2020-07-08T15:48:12Z)
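Several of the entries above report expected calibration error (ECE). As a point of reference, here is a minimal sketch of the standard binned ECE computation; the bin count and the toy inputs are illustrative assumptions and are not taken from any of the papers listed.

```python
# Minimal sketch of the standard binned expected calibration error (ECE):
# group predictions into equal-width confidence bins and average the
# |confidence - accuracy| gap, weighted by the fraction of samples per bin.
# The bin count and the toy data below are illustrative assumptions.
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=15):
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(confidences[in_bin].mean() - correct[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece

# Toy example: a model that is ~70% accurate but reports ~90% confidence,
# i.e. overconfident, so ECE comes out near |0.9 - 0.7| = 0.2.
rng = np.random.default_rng(0)
conf = rng.uniform(0.85, 0.95, size=1000)
acc = rng.uniform(0.0, 1.0, size=1000) < 0.7
print(expected_calibration_error(conf, acc))
```

Equal-width binning with 10 to 15 bins is the common convention; adaptive (equal-mass) binning is a frequently used alternative that this sketch does not cover.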