Individual Calibration with Randomized Forecasting
- URL: http://arxiv.org/abs/2006.10288v3
- Date: Wed, 9 Sep 2020 07:49:01 GMT
- Title: Individual Calibration with Randomized Forecasting
- Authors: Shengjia Zhao, Tengyu Ma, Stefano Ermon
- Abstract summary: We show that calibration for individual samples is possible in the regression setup if the predictions are randomized.
We design a training objective to enforce individual calibration and use it to train randomized regression functions.
- Score: 116.2086707626651
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning applications often require calibrated predictions, e.g. a
90% credible interval should contain the true outcome 90% of the time.
However, typical definitions of calibration only require this to hold on
average, and offer no guarantees on predictions made on individual samples.
Thus, predictions can be systematically over- or under-confident on certain
subgroups, leading to issues of fairness and potential vulnerabilities. We show
that calibration for individual samples is possible in the regression setup if
the predictions are randomized, i.e. outputting randomized credible intervals.
Randomization removes systematic bias by trading off bias with variance. We
design a training objective to enforce individual calibration and use it to
train randomized regression functions. The resulting models are more calibrated
for arbitrarily chosen subgroups of the data, and can achieve higher utility in
decision making against adversaries that exploit miscalibrated predictions.
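As a rough illustration of the idea, the sketch below (PyTorch) trains a model h(x, r) that takes the input together with a uniform random value r and predicts the r-th quantile of y given x, using a standard pinball loss averaged over random r. This is a generic randomized quantile-regression stand-in, not the authors' exact training objective; the class and function names are invented for illustration.
```python
import torch
import torch.nn as nn

class RandomizedForecaster(nn.Module):
    """Maps (x, r) with r ~ U(0, 1) to a predicted r-quantile of y | x."""
    def __init__(self, x_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x, r):
        # r plays the role of the requested quantile level.
        return self.net(torch.cat([x, r], dim=-1)).squeeze(-1)

def pinball_loss(y, y_hat, r):
    # Standard quantile (pinball) loss at level r, averaged over the batch.
    diff = y - y_hat
    return torch.mean(torch.maximum(r * diff, (r - 1.0) * diff))

def train_step(model, opt, x, y):
    r = torch.rand(x.shape[0])          # fresh randomness for each sample
    y_hat = model(x, r.unsqueeze(-1))   # predicted r-quantile of y | x
    loss = pinball_loss(y, y_hat, r)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```
At prediction time, sampling fresh values of r rather than fixing a single quantile level is what makes the reported credible intervals randomized, which is the property the abstract argues enables individual-level calibration guarantees.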
Related papers
- Provably Reliable Conformal Prediction Sets in the Presence of Data Poisoning [53.42244686183879]
Conformal prediction provides model-agnostic and distribution-free uncertainty quantification.
Yet, conformal prediction is not reliable under poisoning attacks where adversaries manipulate both training and calibration data.
We propose reliable prediction sets (RPS): the first efficient method for constructing conformal prediction sets with provable reliability guarantees under poisoning.
arXiv Detail & Related papers (2024-10-13T15:37:11Z)
- Domain-adaptive and Subgroup-specific Cascaded Temperature Regression for Out-of-distribution Calibration [16.930766717110053]
We propose a novel meta-set-based cascaded temperature regression method for post-hoc calibration.
We partition each meta-set into subgroups based on predicted category and confidence level, capturing diverse uncertainties.
A regression network is then trained to derive category-specific and confidence-level-specific scaling, achieving calibration across meta-sets.
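For context, the snippet below shows ordinary single-temperature scaling fitted by minimizing held-out negative log-likelihood; the paper's contribution is the cascaded, subgroup-specific regression of such scaling factors, which is not reproduced here. The function names and the search bounds are illustrative assumptions.
```python
import numpy as np
from scipy.optimize import minimize_scalar

def scaled_nll(T, logits, labels):
    # Negative log-likelihood of softmax(logits / T) on held-out data.
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def fit_temperature(logits, labels):
    # Fit a single scalar temperature T > 0 on a calibration set.
    res = minimize_scalar(scaled_nll, bounds=(0.05, 10.0), method="bounded",
                          args=(logits, labels))
    return res.x
```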
arXiv Detail & Related papers (2024-02-14T14:35:57Z)
- Calibration by Distribution Matching: Trainable Kernel Calibration Metrics [56.629245030893685]
We introduce kernel-based calibration metrics that unify and generalize popular forms of calibration for both classification and regression.
These metrics admit differentiable sample estimates, making it easy to incorporate a calibration objective into empirical risk minimization.
We provide intuitive mechanisms to tailor calibration metrics to a decision task, and enforce accurate loss estimation and no-regret decisions.
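A minimal example of a differentiable, kernel-based calibration penalty is sketched below. It follows the MMCE-style construction of Kumar et al. (2018) rather than the specific metrics proposed in the paper above, and the Laplacian kernel with this bandwidth is an arbitrary assumption.
```python
import torch

def kernel_calibration_penalty(confidences, correct, bandwidth=0.2):
    # confidences: (N,) predicted probability of the predicted class
    # correct:     (N,) 1.0 where the prediction was right, else 0.0
    err = correct - confidences                       # per-sample calibration residual
    diff = confidences.unsqueeze(0) - confidences.unsqueeze(1)
    k = torch.exp(-diff.abs() / bandwidth)            # Laplacian kernel on confidences
    n = confidences.shape[0]
    # Differentiable, so it can be added to the training loss as a regularizer.
    return (err.unsqueeze(0) * err.unsqueeze(1) * k).sum() / (n * n)
```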
arXiv Detail & Related papers (2023-10-31T06:19:40Z)
- Test-time Recalibration of Conformal Predictors Under Distribution Shift Based on Unlabeled Examples [30.61588337557343]
Conformal predictors provide uncertainty estimates by computing a set of classes with a user-specified probability.
We propose a method that provides excellent uncertainty estimates under natural distribution shifts.
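For reference, the baseline that such test-time recalibration methods start from is plain split conformal prediction, sketched below with a simple 1 - p(true class) nonconformity score; the shift-adaptive procedure of the paper itself is not shown, and the score choice is an assumption.
```python
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    # Nonconformity score: one minus the probability assigned to the true class.
    scores = 1.0 - cal_probs[np.arange(len(cal_labels)), cal_labels]
    n = len(scores)
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return np.quantile(scores, level, method="higher")

def prediction_sets(test_probs, q):
    # A class enters the set when its nonconformity score is within the threshold.
    return [np.where(1.0 - p <= q)[0] for p in test_probs]
```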
arXiv Detail & Related papers (2022-10-09T04:46:00Z)
- Calibrated Selective Classification [34.08454890436067]
We develop a new approach to selective classification in which we propose a method for rejecting examples with "uncertain" uncertainties.
We present a framework for learning selectively calibrated models, where a separate selector network is trained to improve the selective calibration error of a given base model.
We demonstrate the empirical effectiveness of our approach on multiple image classification and lung cancer risk assessment tasks.
arXiv Detail & Related papers (2022-08-25T13:31:09Z)
- Posterior Probability Matters: Doubly-Adaptive Calibration for Neural Predictions in Online Advertising [29.80454356173723]
Field-level calibration is fine-grained and more practical.
AdaCalib learns an isotonic function family to calibrate model predictions.
Experiments verify that AdaCalib achieves significant improvement on calibration performance.
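As a simplified point of comparison, the snippet below fits a single isotonic-regression calibrator with scikit-learn; AdaCalib's field-level, posterior-guided family of isotonic functions is not reproduced, and the variable names are hypothetical.
```python
from sklearn.isotonic import IsotonicRegression

def fit_isotonic_calibrator(scores, outcomes):
    # Learn a monotone map from raw model scores to calibrated probabilities.
    iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
    iso.fit(scores, outcomes)
    return iso

# Example usage (hypothetical data):
# calibrated = fit_isotonic_calibrator(val_scores, val_clicks).predict(test_scores)
```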
arXiv Detail & Related papers (2022-05-15T14:27:19Z)
- T-Cal: An optimal test for the calibration of predictive models [49.11538724574202]
We consider detecting mis-calibration of predictive models using a finite validation dataset as a hypothesis testing problem.
Detecting mis-calibration is only possible when the conditional probabilities of the classes are sufficiently smooth functions of the predictions.
We propose T-Cal, a minimax test for calibration based on a de-biased plug-in estimator of the $\ell_2$-Expected Calibration Error (ECE).
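For orientation, the quantity being estimated is sketched below as the standard binned plug-in ECE; T-Cal's de-biasing and the minimax test construction are not reproduced, and the bin count is an arbitrary choice.
```python
import numpy as np

def binned_ece(confidences, correct, n_bins=15):
    # Plug-in estimate: weighted average gap between accuracy and confidence per bin.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - confidences[mask].mean())
    return ece
```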
arXiv Detail & Related papers (2022-03-03T16:58:54Z)
- Conformal prediction for the design problem [72.14982816083297]
In many real-world deployments of machine learning, we use a prediction algorithm to choose what data to test next.
In such settings, there is a distinct type of distribution shift between the training and test data.
We introduce a method to quantify predictive uncertainty in such settings.
arXiv Detail & Related papers (2022-02-08T02:59:12Z)
- Private Prediction Sets [72.75711776601973]
Machine learning systems need reliable uncertainty quantification and protection of individuals' privacy.
We present a framework that treats these two desiderata jointly.
We evaluate the method on large-scale computer vision datasets.
arXiv Detail & Related papers (2021-02-11T18:59:11Z)
- Prediction Confidence from Neighbors [0.0]
The inability of Machine Learning (ML) models to successfully extrapolate correct predictions from out-of-distribution (OoD) samples is a major hindrance to the application of ML in critical applications.
We show that feature space distance is a meaningful measure that can provide confidence in predictions.
This enables earlier and safer deployment of models in critical applications and is vital for deploying models under ever-changing conditions.
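A minimal version of this idea is sketched below: score test points by their average feature-space distance to the k nearest training points, assuming feature vectors have already been extracted; the value of k and the Euclidean metric are arbitrary choices, not necessarily the paper's.
```python
from sklearn.neighbors import NearestNeighbors

def fit_neighbor_scorer(train_features, k=5):
    # Index the training set's feature representations.
    return NearestNeighbors(n_neighbors=k).fit(train_features)

def confidence_scores(scorer, test_features):
    # Smaller average distance to the k nearest training points -> higher confidence.
    dists, _ = scorer.kneighbors(test_features)
    return -dists.mean(axis=1)
```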
arXiv Detail & Related papers (2020-03-31T09:26:09Z)