Set Learning for Accurate and Calibrated Models
- URL: http://arxiv.org/abs/2307.02245v4
- Date: Mon, 12 Feb 2024 13:40:23 GMT
- Title: Set Learning for Accurate and Calibrated Models
- Authors: Lukas Muttenthaler and Robert A. Vandermeulen and Qiuyi Zhang and
Thomas Unterthiner and Klaus-Robert Müller
- Abstract summary: Odd-$k$-out learning minimizes the cross-entropy error for sets rather than for single examples.
OKO often yields better calibration even when training with hard labels and without any additional calibration parameter tuning.
- Score: 17.187117466317265
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Model overconfidence and poor calibration are common in machine learning and
difficult to account for when applying standard empirical risk minimization. In
this work, we propose a novel method to alleviate these problems that we call
odd-$k$-out learning (OKO), which minimizes the cross-entropy error for sets
rather than for single examples. This naturally allows the model to capture
correlations across data examples and achieves both better accuracy and
calibration, especially in limited training data and class-imbalanced regimes.
Perhaps surprisingly, OKO often yields better calibration even when training
with hard labels and forgoing any additional calibration parameter tuning, such
as temperature scaling. We demonstrate this in extensive experimental analyses
and provide a mathematical theory to interpret our findings. We emphasize that
OKO is a general framework that can be easily adapted to many settings, and a
trained model can be applied to single examples at inference time, without
significant run-time overhead or architecture changes.
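For intuition, a minimal sketch of set-level cross-entropy in the spirit of OKO follows. The set construction (two examples of a "pair" class plus k odd examples from other classes) and the aggregation by summing logits follow the abstract's description, but the exact details are assumptions rather than the authors' implementation:

```python
import torch
import torch.nn.functional as F

def oko_style_loss(model, x_set, pair_label):
    """Cross-entropy on an aggregated set prediction.

    x_set: (set_size, ...) tensor assumed to hold two examples of the
           pair class plus k odd-one-out examples from other classes.
    pair_label: 0-dim LongTensor with the pair class index.
    """
    logits = model(x_set)            # (set_size, num_classes)
    set_logits = logits.sum(dim=0)   # aggregate predictions over the set
    return F.cross_entropy(set_logits.unsqueeze(0), pair_label.view(1))
```

As the abstract notes, the set structure only affects training; at inference time the model is applied to single examples as usual.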
Related papers
- Subtle Errors Matter: Preference Learning via Error-injected Self-editing [59.405145971637204]
We propose a novel preference learning framework called eRror-Injected Self-Editing (RISE).
RISE injects predefined subtle errors into partial tokens of correct solutions to construct hard pairs for error mitigation.
Experiments validate the effectiveness of RISE, with preference learning on Qwen2-7B-Instruct yielding notable improvements of 3.0% on GSM8K and 7.9% on MATH.
arXiv Detail & Related papers (2024-10-09T07:43:38Z)
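As a toy illustration of the pair-construction idea above, the sketch below perturbs one number in a correct solution to build a hard negative for preference learning. The off-by-one perturbation is a hypothetical stand-in for RISE's predefined subtle error injection, not the paper's procedure:

```python
import random
import re

def make_preference_pair(correct_solution):
    """Build a (chosen, rejected) pair by injecting a subtle numeric error."""
    numbers = re.findall(r"\d+", correct_solution)
    if not numbers:
        return None  # nothing to perturb
    target = random.choice(numbers)
    wrong = str(int(target) + random.choice([-1, 1]))  # subtle off-by-one error
    rejected = correct_solution.replace(target, wrong, 1)
    return {"chosen": correct_solution, "rejected": rejected}
```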
- Reassessing How to Compare and Improve the Calibration of Machine Learning Models [7.183341902583164]
A machine learning model is calibrated if its predicted probability for an outcome matches the observed frequency for that outcome conditional on the model prediction.
We show that there exist trivial recalibration approaches that can appear seemingly state-of-the-art unless calibration and prediction metrics are accompanied by additional generalization metrics.
arXiv Detail & Related papers (2024-06-06T13:33:45Z)
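A standard way to operationalize the calibration definition above is the binned expected calibration error (ECE). The sketch below is the generic estimator, not the additional generalization metrics the paper argues for:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=15):
    """confidences: (n,) max predicted probabilities; correct: (n,) 0/1 array."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap  # bin weight times |accuracy - confidence|
    return ece
```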
- On the Limitations of Temperature Scaling for Distributions with Overlaps [8.486166869140929]
We show that, for empirical risk minimizers over a general class of distributions, the performance of temperature scaling degrades with the amount of overlap between classes.
We prove that optimizing a modified form of the empirical risk induced by the Mixup data augmentation technique can in fact lead to reasonably good calibration performance.
arXiv Detail & Related papers (2023-06-01T14:35:28Z)
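For reference, the post-hoc temperature scaling analyzed above fits a single scalar T on held-out logits by minimizing the negative log-likelihood; new logits are then divided by T before the softmax. The optimizer and step count below are assumptions:

```python
import torch
import torch.nn.functional as F

def fit_temperature(logits, labels, steps=200, lr=0.05):
    """logits: (n, c) held-out logits; labels: (n,) integer targets."""
    log_t = torch.zeros(1, requires_grad=True)  # optimize log T so T stays positive
    optimizer = torch.optim.Adam([log_t], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = F.cross_entropy(logits / log_t.exp(), labels)
        loss.backward()
        optimizer.step()
    return log_t.exp().item()
```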
- Enabling Calibration In The Zero-Shot Inference of Large Vision-Language Models [58.720142291102135]
We measure calibration across relevant variables like prompt, dataset, and architecture, and find that zero-shot inference with CLIP is miscalibrated.
A single learned temperature generalizes, for each specific CLIP model, across inference datasets and prompt choices.
arXiv Detail & Related papers (2023-03-11T17:14:04Z)
- Variable-Based Calibration for Machine Learning Classifiers [11.9995808096481]
We introduce the notion of variable-based calibration to characterize calibration properties of a model.
We find that models with near-perfect expected calibration error can exhibit significant miscalibration as a function of features of the data.
arXiv Detail & Related papers (2022-09-30T00:49:31Z)
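The sketch below illustrates the idea: instead of a single global calibration gap, compute the gap within bins of a chosen feature. The quantile binning is illustrative, not the paper's estimator:

```python
import numpy as np

def calibration_gap_by_feature(feature, confidences, correct, n_bins=10):
    """Average confidence minus accuracy within quantile bins of one feature."""
    edges = np.quantile(feature, np.linspace(0.0, 1.0, n_bins + 1))
    centers, gaps = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (feature >= lo) & (feature <= hi)
        if in_bin.any():
            centers.append((lo + hi) / 2.0)
            gaps.append(confidences[in_bin].mean() - correct[in_bin].mean())
    return np.array(centers), np.array(gaps)
```

A model can show near-zero global ECE while these per-bin gaps are large, which is the failure mode the paper highlights.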
- Modular Conformal Calibration [80.33410096908872]
We introduce a versatile class of algorithms for recalibration in regression.
This framework allows one to transform any regression model into a calibrated probabilistic model.
We conduct an empirical study of MCC on 17 regression datasets.
arXiv Detail & Related papers (2022-06-23T03:25:23Z)
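As a generic example of recalibration in this spirit (a sketch of quantile recalibration under stated assumptions, not the MCC algorithms themselves), one can fit a monotone map from the model's predictive CDF values at held-out targets to their empirical frequencies:

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

def fit_quantile_recalibrator(cdf_at_targets):
    """cdf_at_targets: (n,) predictive CDF evaluated at held-out true targets."""
    p = np.sort(cdf_at_targets)
    empirical = np.arange(1, len(p) + 1) / len(p)  # empirical frequency per level
    recalibrator = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
    recalibrator.fit(p, empirical)
    return recalibrator  # apply .predict() to new predictive CDF values
```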
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with distributionally robust optimization (DRO) using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
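A minimal alternating-update sketch of the idea, assuming a small adversary network that produces likelihood-ratio-like weights over a batch; the architectures and update schedule are illustrative, not the paper's recipe:

```python
import torch
import torch.nn.functional as F

def dro_step(model, adversary, opt_model, opt_adv, x, y):
    # Adversary ascends on the weighted loss (model held fixed).
    losses = F.cross_entropy(model(x), y, reduction="none").detach()
    weights = torch.softmax(adversary(x).squeeze(-1), dim=0)  # batch weights sum to 1
    opt_adv.zero_grad()
    (-(weights * losses).sum()).backward()
    opt_adv.step()
    # Model descends on the weighted loss (adversary held fixed).
    losses = F.cross_entropy(model(x), y, reduction="none")
    weights = torch.softmax(adversary(x).squeeze(-1), dim=0).detach()
    opt_model.zero_grad()
    (weights * losses).sum().backward()
    opt_model.step()
```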
- CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data with corrupted labels or class imbalance.
Sample re-weighting methods are popularly used to alleviate this data bias issue.
We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z)
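The core mechanism, stripped of CMW-Net's bilevel meta-learning loop, is a small net that maps each sample's loss to a weight, with training minimizing the weighted loss. The architecture below is a simplified stand-in; CMW-Net additionally conditions on class information and fits the weighting net on clean meta-data:

```python
import torch.nn as nn
import torch.nn.functional as F

# Tiny weighting net: per-sample loss in, weight in (0, 1) out.
weight_net = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

def reweighted_loss(model, x, y):
    losses = F.cross_entropy(model(x), y, reduction="none")
    weights = weight_net(losses.detach().unsqueeze(-1)).squeeze(-1)
    return (weights * losses).mean()  # corrupted or rare samples can be down-weighted
```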
- Uncertainty Quantification and Deep Ensembles [79.4957965474334]
We show that deep ensembles do not necessarily lead to improved calibration properties.
We show that standard ensembling methods, when used in conjunction with modern techniques such as mixup regularization, can lead to less calibrated models.
This work examines the interplay between three of the simplest and most commonly used approaches for leveraging deep learning when data is scarce.
arXiv Detail & Related papers (2020-07-17T07:32:24Z)
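For context, the standard deep-ensemble prediction under scrutiny above is simply the average of the member models' softmax outputs:

```python
import torch

def ensemble_predict(models, x):
    """Average class probabilities across independently trained models."""
    probs = torch.stack([m(x).softmax(dim=-1) for m in models])
    return probs.mean(dim=0)  # (batch, num_classes)
```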
- Quantile Regularization: Towards Implicit Calibration of Regression Models [30.872605139672086]
We present a method for calibrating regression models based on a novel quantile regularizer defined as the cumulative KL divergence between two CDFs.
We show that the proposed quantile regularizer significantly improves calibration for regression models trained using approaches such as Dropout VI and Deep Ensembles.
arXiv Detail & Related papers (2020-02-28T16:53:41Z)
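A simple diagnostic related to the calibration notion targeted above: for each nominal level p, the fraction of held-out targets falling at or below the model's predicted p-quantile should be close to p. This check is illustrative; the paper's regularizer instead penalizes a cumulative KL divergence between the two CDFs during training:

```python
import numpy as np

def quantile_coverage(predict_quantile, y, levels=np.linspace(0.05, 0.95, 19)):
    """predict_quantile(p) -> (n,) predicted p-quantiles aligned with targets y."""
    return np.array([(y <= predict_quantile(p)).mean() for p in levels])
```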
This list is automatically generated from the titles and abstracts of the papers in this site.