A Rate-Distortion View of Uncertainty Quantification
- URL: http://arxiv.org/abs/2406.10775v2
- Date: Tue, 18 Jun 2024 12:41:43 GMT
- Title: A Rate-Distortion View of Uncertainty Quantification
- Authors: Ifigeneia Apostolopoulou, Benjamin Eysenbach, Frank Nielsen, Artur Dubrawski,
- Abstract summary: In supervised learning, understanding an input's proximity to the training data can help a model decide whether it has sufficient evidence for reaching a reliable prediction.
We introduce Distance Aware Bottleneck (DAB), a new method for enriching deep neural networks with this property.
- Score: 36.85921945174863
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In supervised learning, understanding an input's proximity to the training data can help a model decide whether it has sufficient evidence for reaching a reliable prediction. While powerful probabilistic models such as Gaussian Processes naturally have this property, deep neural networks often lack it. In this paper, we introduce Distance Aware Bottleneck (DAB), i.e., a new method for enriching deep neural networks with this property. Building on prior information bottleneck approaches, our method learns a codebook that stores a compressed representation of all inputs seen during training. The distance of a new example from this codebook can serve as an uncertainty estimate for the example. The resulting model is simple to train and provides deterministic uncertainty estimates by a single forward pass. Finally, our method achieves better out-of-distribution (OOD) detection and misclassification prediction than prior methods, including expensive ensemble methods, deep kernel Gaussian Processes, and approaches based on the standard information bottleneck.
Related papers
- Optimal Parameter and Neuron Pruning for Out-of-Distribution Detection [36.4610463573214]
We propose an textbfOptimal textbfParameter and textbfNeuron textbfPruning (textbfOPNP) approach to detect out-of-distribution (OOD) samples.
Our proposal is training-free, compatible with other post-hoc methods, and exploring the information from all training data.
arXiv Detail & Related papers (2024-02-04T07:31:06Z) - Confidence estimation of classification based on the distribution of the
neural network output layer [4.529188601556233]
One of the most common problems preventing the application of prediction models in the real world is lack of generalization.
We propose novel methods that estimate uncertainty of particular predictions generated by a neural network classification model.
The proposed methods infer the confidence of a particular prediction based on the distribution of the logit values corresponding to this prediction.
arXiv Detail & Related papers (2022-10-14T12:32:50Z) - Look beyond labels: Incorporating functional summary information in
Bayesian neural networks [11.874130244353253]
We present a simple approach to incorporate summary information about the predicted probability.
The available summary information is incorporated as augmented data and modeled with a Dirichlet process.
We show how the method can inform the model about task difficulty or class imbalance.
arXiv Detail & Related papers (2022-07-04T07:06:45Z) - Rethinking Bayesian Learning for Data Analysis: The Art of Prior and
Inference in Sparsity-Aware Modeling [20.296566563098057]
Sparse modeling for signal processing and machine learning has been at the focus of scientific research for over two decades.
This article reviews some recent advances in incorporating sparsity-promoting priors into three popular data modeling tools.
arXiv Detail & Related papers (2022-05-28T00:43:52Z) - NUQ: Nonparametric Uncertainty Quantification for Deterministic Neural
Networks [151.03112356092575]
We show the principled way to measure the uncertainty of predictions for a classifier based on Nadaraya-Watson's nonparametric estimate of the conditional label distribution.
We demonstrate the strong performance of the method in uncertainty estimation tasks on a variety of real-world image datasets.
arXiv Detail & Related papers (2022-02-07T12:30:45Z) - Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples.
arXiv Detail & Related papers (2022-01-11T23:01:12Z) - Transformers Can Do Bayesian Inference [56.99390658880008]
We present Prior-Data Fitted Networks (PFNs)
PFNs leverage in-context learning in large-scale machine learning techniques to approximate a large set of posteriors.
We demonstrate that PFNs can near-perfectly mimic Gaussian processes and also enable efficient Bayesian inference for intractable problems.
arXiv Detail & Related papers (2021-12-20T13:07:39Z) - Scalable Marginal Likelihood Estimation for Model Selection in Deep
Learning [78.83598532168256]
Marginal-likelihood based model-selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z) - Deep learning: a statistical viewpoint [120.94133818355645]
Deep learning has revealed some major surprises from a theoretical perspective.
In particular, simple gradient methods easily find near-perfect solutions to non-optimal training problems.
We conjecture that specific principles underlie these phenomena.
arXiv Detail & Related papers (2021-03-16T16:26:36Z) - Uncertainty-Aware Deep Classifiers using Generative Models [7.486679152591502]
Deep neural networks are often ignorant about what they do not know and overconfident when they make uninformed predictions.
Some recent approaches quantify uncertainty directly by training the model to output high uncertainty for the data samples close to class boundaries or from the outside of the training distribution.
We develop a novel neural network model that is able to express both aleatoric and epistemic uncertainty to distinguish decision boundary and out-of-distribution regions.
arXiv Detail & Related papers (2020-06-07T15:38:35Z) - Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out of distribution data points at test time with a single forward pass.
We scale training in these with a novel loss function and centroid updating scheme and match the accuracy of softmax models.
arXiv Detail & Related papers (2020-03-04T12:27:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.