Conformal Prediction with Missing Values
- URL: http://arxiv.org/abs/2306.02732v1
- Date: Mon, 5 Jun 2023 09:28:03 GMT
- Title: Conformal Prediction with Missing Values
- Authors: Margaux Zaffran, Aymeric Dieuleveut, Julie Josse, Yaniv Romano
- Abstract summary: We first show that the marginal coverage guarantee of conformal prediction holds on imputed data for any missingness distribution.
We then show that a universally consistent quantile regression algorithm trained on the imputed data is Bayes optimal for the pinball risk.
- Score: 19.18178194789968
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conformal prediction is a theoretically grounded framework for constructing
predictive intervals. We study conformal prediction with missing values in the
covariates -- a setting that brings new challenges to uncertainty
quantification. We first show that the marginal coverage guarantee of conformal
prediction holds on imputed data for any missingness distribution and almost
all imputation functions. However, we emphasize that the average coverage
varies depending on the pattern of missing values: conformal methods tend to
construct prediction intervals that under-cover the response conditionally to
some missing patterns. This motivates our novel generalized conformalized
quantile regression framework, missing data augmentation, which yields
prediction intervals that are valid conditionally to the patterns of missing
values, despite their exponential number. We then show that a universally
consistent quantile regression algorithm trained on the imputed data is Bayes
optimal for the pinball risk, thus achieving valid coverage conditionally to
any given data point. Moreover, we examine the case of a linear model, which
demonstrates the importance of our proposal in overcoming the
heteroskedasticity induced by missing values. Using synthetic data and data from
critical care, we corroborate our theory and report improved performance of our
methods.
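
The marginal coverage claim in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it uses plain split conformal prediction with absolute-residual scores rather than conformalized quantile regression or missing data augmentation, and it assumes mean imputation, a Gaussian linear model, and values missing completely at random. All function names (`make_data`, `impute_mean`, `fit_predictor`, `conformal_quantile`, `run_demo`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def make_data(n, rng, p_missing=0.2):
    """Hypothetical generator: linear model, covariates masked completely at random."""
    beta = np.array([1.0, 2.0, -1.0])
    X = rng.normal(size=(n, 3))
    y = X @ beta + rng.normal(size=n)
    mask = rng.random(size=X.shape) < p_missing
    return np.where(mask, np.nan, X), y


def impute_mean(X, col_means):
    """Replace each NaN with the corresponding column mean (assumed imputation)."""
    return np.where(np.isnan(X), col_means, X)


def fit_predictor(X_imp, y):
    """Least-squares fit with an intercept on imputed covariates."""
    A = np.column_stack([np.ones(len(X_imp)), X_imp])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda X_new: np.column_stack([np.ones(len(X_new)), X_new]) @ coef


def conformal_quantile(scores, alpha):
    """The ceil((1 - alpha)(n + 1))-th smallest score, or +inf if it does not exist."""
    n = len(scores)
    k = int(np.ceil((1 - alpha) * (n + 1)))
    return np.inf if k > n else np.sort(scores)[k - 1]


def run_demo(seed=0, alpha=0.1):
    rng = np.random.default_rng(seed)
    X_tr, y_tr = make_data(1000, rng)
    X_cal, y_cal = make_data(1000, rng)
    X_te, y_te = make_data(5000, rng)

    # The imputation is learned on the training split only, so calibration
    # and test points are treated identically and stay exchangeable.
    col_means = np.nanmean(X_tr, axis=0)
    predict = fit_predictor(impute_mean(X_tr, col_means), y_tr)

    # Calibrate on absolute residuals of the imputed calibration split.
    scores = np.abs(y_cal - predict(impute_mean(X_cal, col_means)))
    q = conformal_quantile(scores, alpha)

    # The interval is [prediction - q, prediction + q]; check coverage on test data.
    pred = predict(impute_mean(X_te, col_means))
    covered = np.abs(y_te - pred) <= q

    # Coverage grouped by the number of missing covariates in each test point.
    n_missing = np.isnan(X_te).sum(axis=1)
    by_pattern = {
        int(m): float(covered[n_missing == m].mean()) for m in np.unique(n_missing)
    }
    return float(covered.mean()), by_pattern


if __name__ == "__main__":
    marginal, by_pattern = run_demo()
    print("marginal coverage:", marginal)
    print("coverage by number of missing covariates:", by_pattern)
```

Grouping test points by the number of missing covariates gives a coarse view of the pattern-dependent coverage that the abstract describes. The constant-width intervals used here cannot adapt to the missingness pattern, which is the limitation the paper's framework is designed to address.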
Related papers
- Progression: an extrapolation principle for regression [0.0]
We propose a novel statistical extrapolation principle.
It assumes a simple relationship between predictors and the response at the boundary of the training predictor samples.
Our semi-parametric method, progression, leverages this extrapolation principle and offers guarantees on the approximation error beyond the training data range.
arXiv Detail & Related papers (2024-10-30T17:29:51Z)
- Conformal Prediction for Dose-Response Models with Continuous Treatments [0.23213238782019321]
We present a novel methodology for generating prediction intervals for dose-response models.
Our method approximates local coverage for every treatment value by applying kernel functions as weights in weighted conformal prediction.
arXiv Detail & Related papers (2024-09-30T15:40:54Z)
- Risk and cross validation in ridge regression with correlated samples [72.59731158970894]
We provide sharp asymptotics for the in- and out-of-sample risks of ridge regression when the data points have arbitrary correlations.
We further extend our analysis to the case where the test point has non-trivial correlations with the training set, a setting often encountered in time series forecasting.
We validate our theory across a variety of high dimensional data.
arXiv Detail & Related papers (2024-08-08T17:27:29Z)
- Probabilistic Conformal Prediction with Approximate Conditional Validity [81.30551968980143]
We develop a new method for generating prediction sets that combines the flexibility of conformal methods with an estimate of the conditional distribution.
Our method consistently outperforms existing approaches in terms of conditional coverage.
arXiv Detail & Related papers (2024-07-01T20:44:48Z)
- Regression Trees for Fast and Adaptive Prediction Intervals [2.6763498831034043]
We present a family of methods to calibrate prediction intervals for regression problems with local coverage guarantees.
We create a partition by training regression trees and Random Forests on conformity scores.
Our proposal is versatile, as it applies to various conformity scores and prediction settings.
arXiv Detail & Related papers (2024-02-12T01:17:09Z)
- Selective Nonparametric Regression via Testing [54.20569354303575]
We develop an abstention procedure via testing the hypothesis on the value of the conditional variance at a given point.
Unlike existing methods, the proposed one accounts not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor.
arXiv Detail & Related papers (2023-09-28T13:04:11Z)
- Improving Adaptive Conformal Prediction Using Self-Supervised Learning [72.2614468437919]
We train an auxiliary model with a self-supervised pretext task on top of an existing predictive model and use the self-supervised error as an additional feature to estimate nonconformity scores.
We empirically demonstrate the benefit of the additional information using both synthetic and real data on the efficiency (width), deficit, and excess of conformal prediction intervals.
arXiv Detail & Related papers (2023-02-23T18:57:14Z)
- The Implicit Delta Method [61.36121543728134]
In this paper, we propose an alternative, the implicit delta method, which works by infinitesimally regularizing the training loss of the predictive model to assess downstream uncertainty.
We show that the change in the evaluation due to regularization is consistent for the variance of the evaluation estimator, even when the infinitesimal change is approximated by a finite difference.
arXiv Detail & Related papers (2022-11-11T19:34:17Z)
- Distribution-Free Finite-Sample Guarantees and Split Conformal Prediction [0.0]
Split conformal prediction represents a promising avenue to obtain finite-sample guarantees under minimal distribution-free assumptions.
We highlight the connection between split conformal prediction and classical tolerance predictors developed in the 1940s.
arXiv Detail & Related papers (2022-10-26T14:12:24Z)
- Selective Regression Under Fairness Criteria [30.672082160544996]
In some cases, the performance of a minority group can decrease as we reduce the coverage.
We show that such unwanted behavior can be avoided if we can construct features satisfying the sufficiency criterion.
arXiv Detail & Related papers (2021-10-28T19:05:12Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.