Locally Valid and Discriminative Confidence Intervals for Deep Learning Models
- URL: http://arxiv.org/abs/2106.00225v1
- Date: Tue, 1 Jun 2021 04:39:56 GMT
- Title: Locally Valid and Discriminative Confidence Intervals for Deep Learning Models
- Authors: Zhen Lin, Shubhendu Trivedi, Jimeng Sun
- Abstract summary: Uncertainty information should be valid (guaranteeing coverage) and discriminative (more uncertain when the expected risk is high).
Most existing Bayesian methods lack frequentist coverage guarantees and usually affect model performance.
We propose Locally Valid and Discriminative confidence intervals (LVD), a simple, efficient and lightweight method to construct discriminative confidence intervals (CIs) for almost any deep learning model.
- Score: 37.57296694423751
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Efficient and theoretically sound uncertainty quantification is crucial for
building trust in deep learning models for critical real-world applications, yet it
remains challenging. Useful uncertainty information is expected to have two key
properties: it should be valid (guaranteeing coverage) and discriminative (more
uncertain when the expected risk is high). Moreover, when combined with deep learning
(DL) methods, it should be scalable and affect DL model performance minimally. Most
existing Bayesian methods lack frequentist coverage guarantees and usually affect model
performance. The few available frequentist methods are rarely discriminative and/or
violate coverage guarantees due to unrealistic assumptions. Moreover, many methods are
expensive or require substantial modifications to the base neural network. Building upon
recent advances in conformal prediction and leveraging the classical idea of kernel
regression, we propose Locally Valid and Discriminative confidence intervals (LVD), a
simple, efficient and lightweight method to construct discriminative confidence
intervals (CIs) for almost any DL model. With no assumptions on the data distribution,
such CIs also offer finite-sample local coverage guarantees (in contrast to the simpler
marginal coverage). Using a diverse set of datasets, we empirically verify that, besides
being the only locally valid method, LVD exceeds or matches the performance (including
coverage rate and prediction accuracy) of existing uncertainty quantification methods,
while offering additional benefits in scalability and flexibility.
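
To make the construction concrete, below is a minimal sketch of the general recipe the
abstract describes: weight held-out calibration residuals with a kernel over the model's
feature embeddings and take a weighted quantile to obtain a locally adaptive interval.
The function name, the Gaussian kernel, the fixed bandwidth, and the symmetric interval
are illustrative assumptions, not the authors' exact algorithm.

    import numpy as np

    def kernel_weighted_conformal_interval(x_embed, y_pred, calib_embed,
                                           calib_resid, alpha=0.1, bandwidth=1.0):
        # x_embed: feature embedding of the test input (e.g. penultimate-layer features)
        # y_pred: the DL model's point prediction for the test input
        # calib_embed: (n, d) embeddings of a held-out calibration set
        # calib_resid: (n,) absolute residuals |y_i - f(x_i)| on the calibration set

        # Gaussian kernel weights: nearby calibration points count more, which is
        # what makes the resulting interval width locally adaptive.
        d2 = np.sum((calib_embed - x_embed) ** 2, axis=1)
        w = np.exp(-d2 / (2.0 * bandwidth ** 2))
        w = w / (w.sum() + 1e-12)

        # Weighted (1 - alpha) quantile of the calibration residuals.
        order = np.argsort(calib_resid)
        cum_w = np.cumsum(w[order])
        idx = min(np.searchsorted(cum_w, 1.0 - alpha), len(order) - 1)
        q = calib_resid[order][idx]

        # Symmetric interval around the point prediction.
        return y_pred - q, y_pred + q
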
Related papers
- SURE: SUrvey REcipes for building reliable and robust deep networks [12.268921703825258] (arXiv: 2024-03-01T13:58:19Z)
In this paper, we revisit techniques for uncertainty estimation within deep neural networks and consolidate a suite of techniques to enhance their reliability.
We rigorously evaluate SURE against the benchmark of failure prediction, a critical testbed for uncertainty estimation efficacy.
When applied to real-world challenges, such as data corruption, label noise, and long-tailed class distribution, SURE exhibits remarkable robustness, delivering results that are superior or on par with current state-of-the-art specialized methods.
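
As a concrete illustration of the failure-prediction testbed mentioned above (not code
from the SURE paper), the area under the risk-coverage curve is a standard metric for
how well a confidence score separates correct from incorrect predictions; the sketch
below assumes a simple array-based interface.

    import numpy as np

    def area_under_risk_coverage(confidence, correct):
        # confidence: (n,) model confidence scores; correct: (n,) 0/1 correctness.
        # Sort predictions from most to least confident and average the error
        # rate ("risk") over all coverage levels; lower is better.
        order = np.argsort(-confidence)
        errors = 1.0 - correct[order].astype(float)
        risks = np.cumsum(errors) / np.arange(1, len(errors) + 1)
        return float(risks.mean())
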
- Empirically Validating Conformal Prediction on Modern Vision Architectures Under Distribution Shift and Long-tailed Data [18.19171031755595] (arXiv: 2023-07-03T15:08:28Z)
Conformal prediction has emerged as a rigorous means of providing deep learning models with reliable uncertainty estimates and safety guarantees.
Here, we characterize the performance of several post-hoc and training-based conformal prediction methods under distribution shifts and long-tailed class distributions.
We show that, across numerous conformal methods and neural network families, performance degrades greatly under distribution shift, violating the safety guarantees.
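
For intuition, here is a minimal sketch of the kind of check such a study performs:
calibrate a split-conformal score threshold on in-distribution data and measure
empirical coverage on shifted data. The function names and the 1 - softmax
nonconformity score are illustrative assumptions, not the paper's exact protocol.

    import numpy as np

    def split_conformal_qhat(calib_probs, calib_labels, alpha=0.1):
        # Nonconformity score: 1 - softmax probability of the true class.
        s = 1.0 - calib_probs[np.arange(len(calib_labels)), calib_labels]
        n = len(s)
        level = min(np.ceil((1 - alpha) * (n + 1)) / n, 1.0)
        return np.quantile(s, level, method="higher")

    def empirical_coverage(test_probs, test_labels, qhat):
        # Fraction of test points whose true label lands in the prediction set
        # {y : 1 - p(y|x) <= qhat}; under distribution shift this often drops
        # below the nominal 1 - alpha, which is the violation reported above.
        s = 1.0 - test_probs[np.arange(len(test_labels)), test_labels]
        return float(np.mean(s <= qhat))
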
- How Reliable is Your Regression Model's Uncertainty Under Real-World Distribution Shifts? [46.05502630457458] (arXiv: 2023-02-07T18:54:39Z)
We propose a benchmark of 8 image-based regression datasets with different types of challenging distribution shifts.
We find that while methods are well calibrated when there is no distribution shift, they all become highly overconfident on many of the benchmark datasets.
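
A minimal sketch of how such calibration can be measured, assuming the regression model
outputs a Gaussian mean and standard deviation (an assumption here; the benchmark itself
spans several uncertainty methods): compare the empirical coverage of nominal prediction
intervals against the target level.

    import numpy as np
    from scipy.stats import norm

    def gaussian_interval_coverage(mu, sigma, y, alpha=0.1):
        # Empirical coverage of nominal (1 - alpha) Gaussian prediction intervals.
        # "Well calibrated" means this stays close to 1 - alpha; "overconfident"
        # under shift means it falls well below that level.
        z = norm.ppf(1.0 - alpha / 2.0)
        covered = (y >= mu - z * sigma) & (y <= mu + z * sigma)
        return float(covered.mean())
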
- Quantifying Model Uncertainty for Semantic Segmentation using Operators in the RKHS [20.348825818435767] (arXiv: 2022-11-03T17:10:49Z)
We present a framework for high-resolution predictive uncertainty quantification of semantic segmentation models.
We use a multi-moment functional definition of uncertainty associated with the model's feature space in the reproducing kernel Hilbert space (RKHS).
This leads to a significantly more accurate view of model uncertainty than conventional Bayesian methods.
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306] (arXiv: 2022-01-11T23:01:12Z)
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold.
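
The thresholding step is simple enough to sketch; the choice of confidence score (e.g.
maximum softmax probability) and the function names below are illustrative assumptions
rather than the paper's exact implementation.

    import numpy as np

    def atc_threshold(source_conf, source_correct):
        # Choose t so that the fraction of labeled source examples with
        # confidence >= t matches the source accuracy.
        acc = source_correct.mean()
        return np.quantile(source_conf, 1.0 - acc)

    def atc_predict_accuracy(target_conf, t):
        # Predicted target accuracy: fraction of unlabeled target examples
        # whose confidence exceeds the learned threshold.
        return float(np.mean(target_conf >= t))
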
- On the Practicality of Deterministic Epistemic Uncertainty [106.06571981780591] (arXiv: 2021-07-01T17:59:07Z)
Deterministic uncertainty methods (DUMs) achieve strong performance on detecting out-of-distribution data.
It remains unclear whether DUMs are well calibrated and can seamlessly scale to real-world applications.
- Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256] (arXiv: 2021-04-11T09:50:24Z)
Marginal-likelihood-based model selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
- Discriminative Jackknife: Quantifying Uncertainty in Deep Learning via Higher-Order Influence Functions [121.10450359856242] (arXiv: 2020-06-29T13:36:52Z)
We develop a frequentist procedure that utilizes influence functions of a model's loss functional to construct a jackknife (or leave-one-out) estimator of predictive confidence intervals.
The Discriminative Jackknife (DJ) satisfies both desiderata (coverage and discrimination), is applicable to a wide range of deep learning models, is easy to implement, and can be applied in a post-hoc fashion without interfering with model training or compromising its accuracy.
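
For reference, the estimator that the DJ approximates can be written down directly. The
naive version below refits the model once per left-out point (the expensive step that
the paper's influence-function machinery avoids); the fit_predict helper is a
hypothetical placeholder for training and querying the base model.

    import numpy as np

    def jackknife_plus_interval(x_test, X, y, fit_predict, alpha=0.1):
        # fit_predict(X_train, y_train, X_query) -> predictions on X_query.
        n = len(y)
        lo, hi = [], []
        for i in range(n):
            keep = np.arange(n) != i
            # Leave-one-out model: predict at the held-out point and at x_test.
            preds = fit_predict(X[keep], y[keep], np.vstack([X[i], x_test]))
            r_i = abs(y[i] - preds[0])          # leave-one-out residual
            lo.append(preds[1] - r_i)
            hi.append(preds[1] + r_i)
        k = int(np.ceil((1 - alpha) * (n + 1)))
        lower = np.sort(lo)[max(n - k, 0)]
        upper = np.sort(hi)[min(k - 1, n - 1)]
        return lower, upper
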
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862] (arXiv: 2020-06-26T13:50:19Z)
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.