Conformal Prediction with Cellwise Outliers: A Detect-then-Impute Approach
- URL: http://arxiv.org/abs/2505.04986v1
- Date: Thu, 08 May 2025 06:43:18 GMT
- Title: Conformal Prediction with Cellwise Outliers: A Detect-then-Impute Approach
- Authors: Qian Peng, Yajie Bao, Haojie Ren, Zhaojun Wang, Changliang Zou,
- Abstract summary: Conformal prediction is a powerful tool for constructing prediction intervals for black-box models.<n>This paper introduces a novel framework called detect-then-impute conformal prediction.<n>We develop two practical algorithms, PDI-CP and JDI-CP, and provide a distribution-free coverage analysis.
- Score: 4.328280329592151
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conformal prediction is a powerful tool for constructing prediction intervals for black-box models, providing a finite sample coverage guarantee for exchangeable data. However, this exchangeability is compromised when some entries of the test feature are contaminated, such as in the case of cellwise outliers. To address this issue, this paper introduces a novel framework called detect-then-impute conformal prediction. This framework first employs an outlier detection procedure on the test feature and then utilizes an imputation method to fill in those cells identified as outliers. To quantify the uncertainty in the processed test feature, we adaptively apply the detection and imputation procedures to the calibration set, thereby constructing exchangeable features for the conformal prediction interval of the test label. We develop two practical algorithms, PDI-CP and JDI-CP, and provide a distribution-free coverage analysis under some commonly used detection and imputation procedures. Notably, JDI-CP achieves a finite sample $1-2\alpha$ coverage guarantee. Numerical experiments on both synthetic and real datasets demonstrate that our proposed algorithms exhibit robust coverage properties and comparable efficiency to the oracle baseline.
Related papers
- Calibrated Prediction Set in Fault Detection with Risk Guarantees via Significance Tests [3.500936878570599]
This paper proposes a novel fault detection method that integrates significance testing with the conformal prediction framework to provide formal risk guarantees.<n>The proposed method consistently achieves an empirical coverage rate at or above the nominal level ($1-alpha$)<n>The results reveal a controllable trade-off between the user-defined risk level ($alpha$) and efficiency, where higher risk tolerance leads to smaller average prediction set sizes.
arXiv Detail & Related papers (2025-08-02T05:49:02Z) - COIN: Uncertainty-Guarding Selective Question Answering for Foundation Models with Provable Risk Guarantees [51.5976496056012]
COIN is an uncertainty-guarding selection framework that calibrates statistically valid thresholds to filter a single generated answer per question.<n>COIN estimates the empirical error rate on a calibration set and applies confidence interval methods to establish a high-probability upper bound on the true error rate.<n>We demonstrate COIN's robustness in risk control, strong test-time power in retaining admissible answers, and predictive efficiency under limited calibration data.
arXiv Detail & Related papers (2025-06-25T07:04:49Z) - Conformal Prediction Sets with Improved Conditional Coverage using Trust Scores [52.92618442300405]
It is impossible to achieve exact, distribution-free conditional coverage in finite samples.<n>We propose an alternative conformal prediction algorithm that targets coverage where it matters most.
arXiv Detail & Related papers (2025-01-17T12:01:56Z) - Provably Reliable Conformal Prediction Sets in the Presence of Data Poisoning [53.42244686183879]
Conformal prediction provides model-agnostic and distribution-free uncertainty quantification.<n>Yet, conformal prediction is not reliable under poisoning attacks where adversaries manipulate both training and calibration data.<n>We propose reliable prediction sets (RPS): the first efficient method for constructing conformal prediction sets with provable reliability guarantees under poisoning.
arXiv Detail & Related papers (2024-10-13T15:37:11Z) - Online scalable Gaussian processes with conformal prediction for guaranteed coverage [32.21093722162573]
The consistency of the resulting uncertainty values hinges on the premise that the learning function conforms to the properties specified by the GP model.
We propose to wed the GP with the prevailing conformal prediction (CP), a distribution-free post-processing framework that produces it prediction sets with a provably valid coverage.
arXiv Detail & Related papers (2024-10-07T19:22:15Z) - Conformal Generative Modeling with Improved Sample Efficiency through Sequential Greedy Filtering [55.15192437680943]
Generative models lack rigorous statistical guarantees for their outputs.<n>We propose a sequential conformal prediction method producing prediction sets that satisfy a rigorous statistical guarantee.<n>This guarantee states that with high probability, the prediction sets contain at least one admissible (or valid) example.
arXiv Detail & Related papers (2024-10-02T15:26:52Z) - Adapting Conformal Prediction to Distribution Shifts Without Labels [16.478151550456804]
Conformal prediction (CP) enables machine learning models to output prediction sets with guaranteed coverage rate.
Our goal is to improve the quality of CP-generated prediction sets using only unlabeled data from the test domain.
This is achieved by two new methods called ECP and EACP, that adjust the score function in CP according to the base model's uncertainty on the unlabeled test data.
arXiv Detail & Related papers (2024-06-03T15:16:02Z) - Uncertainty-Calibrated Test-Time Model Adaptation without Forgetting [55.17761802332469]
Test-time adaptation (TTA) seeks to tackle potential distribution shifts between training and test data by adapting a given model w.r.t. any test sample.
Prior methods perform backpropagation for each test sample, resulting in unbearable optimization costs to many applications.
We propose an Efficient Anti-Forgetting Test-Time Adaptation (EATA) method which develops an active sample selection criterion to identify reliable and non-redundant samples.
arXiv Detail & Related papers (2024-03-18T05:49:45Z) - Confidence on the Focal: Conformal Prediction with Selection-Conditional Coverage [6.010965256037659]
Conformal prediction builds marginally valid prediction intervals that cover the unknown outcome of a randomly drawn test point.<n>In practice, data-driven methods are often used to identify specific test unit(s) of interest.<n>This paper presents a general framework for constructing a prediction set with finite-sample exact coverage.
arXiv Detail & Related papers (2024-03-06T17:18:24Z) - Adaptive novelty detection with false discovery rate guarantee [1.8249324194382757]
We propose a flexible method to control the false discovery rate (FDR) on detected novelties in finite samples.
Inspired by the multiple testing literature, we propose variants of AdaDetect that are adaptive to the proportion of nulls.
The methods are illustrated on synthetic datasets and real-world datasets, including an application in astrophysics.
arXiv Detail & Related papers (2022-08-13T17:14:55Z) - Approximate Conditional Coverage via Neural Model Approximations [0.030458514384586396]
We analyze a data-driven procedure for obtaining empirically reliable approximate conditional coverage.
We demonstrate the potential for substantial (and otherwise unknowable) under-coverage with split-conformal alternatives with marginal coverage guarantees.
arXiv Detail & Related papers (2022-05-28T02:59:05Z) - Certified Robustness to Label-Flipping Attacks via Randomized Smoothing [105.91827623768724]
Machine learning algorithms are susceptible to data poisoning attacks.
We present a unifying view of randomized smoothing over arbitrary functions.
We propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks.
arXiv Detail & Related papers (2020-02-07T21:28:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.