Empirically Validating Conformal Prediction on Modern Vision
Architectures Under Distribution Shift and Long-tailed Data
- URL: http://arxiv.org/abs/2307.01088v1
- Date: Mon, 3 Jul 2023 15:08:28 GMT
- Title: Empirically Validating Conformal Prediction on Modern Vision
Architectures Under Distribution Shift and Long-tailed Data
- Authors: Kevin Kasa and Graham W. Taylor
- Abstract summary: Conformal prediction has emerged as a rigorous means of providing deep learning models with reliable uncertainty estimates and safety guarantees.
Here, we characterize the performance of several post-hoc and training-based conformal prediction methods under distribution shifts and long-tailed class distributions.
We show that across numerous conformal methods and neural network families, performance greatly degrades under distribution shifts, violating safety guarantees.
- Score: 18.19171031755595
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Conformal prediction has emerged as a rigorous means of providing deep
learning models with reliable uncertainty estimates and safety guarantees. Yet,
its performance is known to degrade under distribution shift and long-tailed
class distributions, which are often present in real-world applications. Here,
we characterize the performance of several post-hoc and training-based
conformal prediction methods under these settings, providing the first
empirical evaluation on large-scale datasets and models. We show that across
numerous conformal methods and neural network families, performance greatly
degrades under distribution shifts, violating safety guarantees. Similarly, we
show that in long-tailed settings the guarantees are frequently violated on
many classes. Understanding the limitations of these methods is necessary for
deployment in real-world and safety-critical applications.
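For context on the guarantee being tested: split conformal prediction uses a held-out calibration set to pick a score threshold so that, when calibration and test data are exchangeable, the prediction sets satisfy $P(Y \in C(X)) \ge 1 - \alpha$. Below is a minimal sketch of that procedure together with the empirical-coverage check the findings above refer to; it uses the simple thresholded-softmax score rather than any particular method evaluated in the paper, and the array names (`cal_probs`, `cal_labels`, `test_probs`, `test_labels`) are illustrative placeholders, not artifacts of the paper.

```python
# Minimal sketch (not the paper's exact protocol): split conformal prediction
# with the thresholded-softmax score, plus an empirical coverage check.
import numpy as np

def calibrate_threshold(cal_probs, cal_labels, alpha=0.1):
    """Conformal quantile q_hat computed on a held-out calibration set."""
    n = len(cal_labels)
    # Nonconformity score: one minus the softmax probability of the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample-corrected quantile level, clipped to 1 for small n.
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return np.quantile(scores, level, method="higher")

def prediction_sets(probs, q_hat):
    """Boolean matrix (n_examples, n_classes); True means the class is kept."""
    return (1.0 - probs) <= q_hat

def empirical_coverage(sets, labels):
    """Fraction of examples whose true label lies inside its prediction set."""
    return float(sets[np.arange(len(labels)), labels].mean())

# Hypothetical usage with softmax outputs from any pretrained classifier:
# q_hat = calibrate_threshold(cal_probs, cal_labels, alpha=0.1)
# cov = empirical_coverage(prediction_sets(test_probs, q_hat), test_labels)
# Under exchangeability, cov should be at least 1 - alpha on average.
```

Running the same coverage check with softmax outputs from a corrupted or long-tailed test split is, in spirit, the kind of evaluation under which the paper reports coverage falling below the nominal $1 - \alpha$ level.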
Related papers
- Probabilistic Conformal Prediction with Approximate Conditional Validity [81.30551968980143]
We develop a new method for generating prediction sets that combines the flexibility of conformal methods with an estimate of the conditional distribution.
Our method consistently outperforms existing approaches in terms of conditional coverage.
arXiv Detail & Related papers (2024-07-01T20:44:48Z)
- SURE: SUrvey REcipes for building reliable and robust deep networks [12.268921703825258]
In this paper, we revisit techniques for uncertainty estimation within deep neural networks and consolidate a suite of techniques to enhance their reliability.
We rigorously evaluate SURE against the benchmark of failure prediction, a critical testbed for uncertainty estimation efficacy.
When applied to real-world challenges, such as data corruption, label noise, and long-tailed class distribution, SURE exhibits remarkable robustness, delivering results that are superior or on par with current state-of-the-art specialized methods.
arXiv Detail & Related papers (2024-03-01T13:58:19Z)
- Multiclass Alignment of Confidence and Certainty for Network Calibration [10.15706847741555]
Recent studies reveal that deep neural networks (DNNs) are prone to making overconfident predictions.
We propose a new train-time calibration method, which features a simple, plug-and-play auxiliary loss known as multi-class alignment of predictive mean confidence and predictive certainty (MACC).
Our method achieves state-of-the-art calibration performance for both in-domain and out-domain predictions.
arXiv Detail & Related papers (2023-09-06T00:56:24Z)
- Certified Adversarial Defenses Meet Out-of-Distribution Corruptions: Benchmarking Robustness and Simple Baselines [65.0803400763215]
This work critically examines how adversarial robustness guarantees change when state-of-the-art certifiably robust models encounter out-of-distribution data.
We propose a novel data augmentation scheme, FourierMix, that produces augmentations to improve the spectral coverage of the training data.
We find that FourierMix augmentations help eliminate the spectral bias of certifiably robust models, enabling them to achieve significantly better robustness guarantees on a range of OOD benchmarks.
arXiv Detail & Related papers (2021-12-01T17:11:22Z)
- Locally Valid and Discriminative Confidence Intervals for Deep Learning Models [37.57296694423751]
Uncertainty information should be valid (guaranteeing coverage) and discriminative (more uncertain when the expected risk is high).
Most existing Bayesian methods lack frequentist coverage guarantees and usually affect model performance.
We propose Locally Valid and Discriminative confidence intervals (LVD), a simple, efficient and lightweight method to construct discriminative confidence intervals (CIs) for almost any deep learning model.
arXiv Detail & Related papers (2021-06-01T04:39:56Z)
- Trust but Verify: Assigning Prediction Credibility by Counterfactual Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning.
These measures should account for the wide variety of models used in practice.
The framework developed in this work expresses the credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z)
- Empirical Frequentist Coverage of Deep Learning Uncertainty Quantification Procedures [13.890139530120164]
We provide the first large scale evaluation of the empirical frequentist coverage properties of uncertainty quantification techniques.
We find that, in general, some methods do achieve desirable coverage properties on in-distribution samples, but that coverage is not maintained on out-of-distribution data.
arXiv Detail & Related papers (2020-10-06T21:22:46Z)
- Accurate and Robust Feature Importance Estimation under Distribution Shifts [49.58991359544005]
PRoFILE is a novel feature importance estimation method.
We show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z)
- Robust Validation: Confident Predictions Even When Distributions Shift [19.327409270934474]
We describe procedures for robust predictive inference, where a model provides uncertainty estimates on its predictions rather than point predictions.
We present a method that produces prediction sets (almost exactly) giving the right coverage level for any test distribution in an $f$-divergence ball around the training population (this coverage condition is sketched after this list).
An essential component of our methodology is to estimate the amount of expected future data shift and build robustness to it.
arXiv Detail & Related papers (2020-08-10T17:09:16Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
- Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
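As flagged in the "Robust Validation: Confident Predictions Even When Distributions Shift" entry above, the coverage target described in that abstract can be written out explicitly (the notation here is a sketch, not the authors' own): with training population $P_0$, miscoverage level $\alpha$, and an $f$-divergence ball of radius $\rho$,

\[
\inf_{Q \,:\, D_f(Q \,\|\, P_0) \le \rho} \; Q\bigl(Y \in C(X)\bigr) \;\ge\; 1 - \alpha ,
\]

i.e. the prediction sets $C$ are required to cover not only under $P_0$ itself but under every test distribution inside the ball.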