What is Flagged in Uncertainty Quantification? Latent Density Models for
Uncertainty Categorization
- URL: http://arxiv.org/abs/2207.05161v2
- Date: Fri, 27 Oct 2023 19:09:59 GMT
- Title: What is Flagged in Uncertainty Quantification? Latent Density Models for
Uncertainty Categorization
- Authors: Hao Sun, Boris van Breugel, Jonathan Crabbe, Nabeel Seedat, Mihaela
van der Schaar
- Abstract summary: Uncertainty Quantification (UQ) is essential for creating trustworthy machine learning models.
Recent years have seen a steep rise in UQ methods that can flag suspicious examples.
We propose a framework for categorizing uncertain examples flagged by UQ methods in classification tasks.
- Score: 68.15353480798244
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Uncertainty Quantification (UQ) is essential for creating trustworthy machine
learning models. Recent years have seen a steep rise in UQ methods that can
flag suspicious examples; however, it is often unclear what exactly these
methods identify. In this work, we propose a framework for categorizing
uncertain examples flagged by UQ methods in classification tasks. We introduce
the confusion density matrix -- a kernel-based approximation of the
misclassification density -- and use this to categorize suspicious examples
identified by a given uncertainty method into three classes:
out-of-distribution (OOD) examples, boundary (Bnd) examples, and examples in
regions of high in-distribution misclassification (IDM). Through extensive
experiments, we show that our framework provides a new and distinct perspective
for assessing differences between uncertainty quantification methods, thereby
forming a valuable assessment benchmark.
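A minimal sketch of the core idea, assuming latent representations and an RBF kernel; the bandwidth, thresholds, and decision rules below are illustrative assumptions rather than the paper's exact formulation:

```python
# Hedged sketch (not the authors' code): kernel estimate of a confusion-density-style
# matrix in a latent space, used to bucket a flagged example into OOD / Bnd / IDM.
import numpy as np

def rbf_density(z, Z, h=1.0):
    """Mean RBF kernel value between a query latent z and a set of latents Z."""
    d2 = ((Z - z) ** 2).sum(axis=1)
    return np.exp(-d2 / (2 * h ** 2)).mean() if len(Z) else 0.0

def categorize(z, Z_train, y_true, y_pred, n_classes,
               ood_thresh=1e-3, bnd_margin=0.2, idm_thresh=0.5):
    # Entry (c, k): kernel density of training points with true class c
    # that the model predicts as class k.
    D = np.zeros((n_classes, n_classes))
    for c in range(n_classes):
        for k in range(n_classes):
            mask = (y_true == c) & (y_pred == k)
            D[c, k] = rbf_density(z, Z_train[mask])
    total = D.sum()
    if total < ood_thresh:                       # far from all training data
        return "OOD"
    class_mass = D.sum(axis=0) / total           # density mass per predicted class
    top2 = np.sort(class_mass)[-2:]
    if top2[0] > top2[1] - bnd_margin:           # two classes nearly tied -> boundary
        return "Bnd"
    if D[~np.eye(n_classes, dtype=bool)].sum() / total > idm_thresh:
        return "IDM"                             # dense in-distribution misclassification
    return "none"
```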
Related papers
- Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode.
We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
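A hedged illustration of one way to turn sampled explanations into an uncertainty score, not the paper's exact estimator; `ask_llm` is a hypothetical caller-supplied function:

```python
# Sample several explanation/answer pairs and use the entropy of the induced
# answer distribution as an uncertainty score (illustrative proxy only).
import math
from collections import Counter

def answer_entropy(ask_llm, question, n_samples=10):
    """ask_llm(question) -> (explanation, final_answer); hypothetical stub."""
    answers = [ask_llm(question)[1] for _ in range(n_samples)]
    probs = [c / n_samples for c in Counter(answers).values()]
    return -sum(p * math.log(p) for p in probs)  # high entropy = unstable answers
```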
arXiv Detail & Related papers (2024-06-05T16:35:30Z)
- Revisiting Confidence Estimation: Towards Reliable Failure Prediction [53.79160907725975]
We identify a widespread but largely overlooked phenomenon: most confidence estimation methods are harmful for detecting misclassification errors.
We propose to enlarge the confidence gap by finding flat minima, which yields state-of-the-art failure prediction performance.
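A sketch, under my reading of the summary, of the quantities involved: the confidence gap between correct and incorrect predictions (which the method aims to enlarge) and a rank-based AUROC for failure prediction:

```python
import numpy as np

def confidence_gap(conf, correct):
    # Mean confidence on correct predictions minus mean confidence on errors.
    conf, correct = np.asarray(conf), np.asarray(correct, dtype=bool)
    return conf[correct].mean() - conf[~correct].mean()

def failure_auroc(conf, correct):
    # Probability that a random correct prediction is more confident than a
    # random incorrect one (ties count as 0.5).
    conf, correct = np.asarray(conf), np.asarray(correct, dtype=bool)
    diff = conf[correct][:, None] - conf[~correct][None, :]
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()
```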
arXiv Detail & Related papers (2024-03-05T11:44:14Z)
- One step closer to unbiased aleatoric uncertainty estimation [71.55174353766289]
We propose a new estimation method by actively de-noising the observed data.
By conducting a broad range of experiments, we demonstrate that our proposed approach provides a much closer approximation to the actual data uncertainty than the standard method.
arXiv Detail & Related papers (2023-12-16T14:59:11Z)
- Uncertainty in Additive Feature Attribution methods [34.80932512496311]
We focus on the class of additive feature attribution explanation methods.
We study the relationship between a feature's attribution and its uncertainty and observe little correlation.
We coin the term "stable instances" for such instances and diagnose factors that make an instance stable.
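A hedged sketch of this kind of analysis, assuming a stochastic attribution method that can be re-run; `attribute` is a placeholder for the explainer under study and the stability threshold is an illustrative choice:

```python
import numpy as np

def attribution_stability(attribute, x, n_runs=20, stable_tol=0.05):
    # Re-run the explainer and compare each feature's mean attribution to its spread.
    runs = np.stack([attribute(x) for _ in range(n_runs)])    # (n_runs, n_features)
    mean_attr, spread = runs.mean(axis=0), runs.std(axis=0)
    corr = np.corrcoef(np.abs(mean_attr), spread)[0, 1]       # attribution vs. uncertainty
    is_stable = bool((spread < stable_tol).all())             # candidate "stable instance"
    return mean_attr, spread, corr, is_stable
```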
arXiv Detail & Related papers (2023-11-29T08:40:46Z)
- A Data-Driven Measure of Relative Uncertainty for Misclassification Detection [25.947610541430013]
We introduce a data-driven measure of uncertainty relative to an observer for misclassification detection.
By learning patterns in the distribution of soft-predictions, our uncertainty measure can identify misclassified samples.
We demonstrate empirical improvements over multiple image classification tasks, outperforming state-of-the-art misclassification detection methods.
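A simplified stand-in for a data-driven detector of this kind, not the paper's exact measure: fit a small model on held-out soft-prediction vectors labeled by correctness and use its output as an uncertainty score:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_detector(softmax_val, y_val):
    # 1 = the base classifier was wrong on this validation sample.
    wrong = (softmax_val.argmax(axis=1) != y_val).astype(int)
    det = LogisticRegression(max_iter=1000)
    det.fit(np.sort(softmax_val, axis=1), wrong)   # sorted probs ignore class identity
    return det

def uncertainty_score(det, softmax_test):
    return det.predict_proba(np.sort(softmax_test, axis=1))[:, 1]
```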
arXiv Detail & Related papers (2023-06-02T17:32:03Z)
- Benchmarking common uncertainty estimation methods with histopathological images under domain shift and label noise [62.997667081978825]
In high-risk environments, deep learning models need to be able to judge their uncertainty and reject inputs when there is a significant chance of misclassification.
We conduct a rigorous evaluation of the most commonly used uncertainty and robustness methods for the classification of Whole Slide Images.
We observe that ensembles of methods generally lead to better uncertainty estimates as well as an increased robustness towards domain shifts and label noise.
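An illustrative sketch of ensemble-based uncertainty with rejection (not the benchmark's code): average the members' softmax outputs and abstain when the predictive entropy is above a threshold:

```python
import numpy as np

def ensemble_predict(member_probs, reject_entropy=1.0):
    """member_probs: array of shape (n_members, n_samples, n_classes)."""
    p = np.asarray(member_probs).mean(axis=0)          # average the members' softmax outputs
    entropy = -(p * np.log(p + 1e-12)).sum(axis=1)     # predictive entropy per sample
    return p.argmax(axis=1), entropy, entropy > reject_entropy   # True = abstain
```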
arXiv Detail & Related papers (2023-01-03T11:34:36Z)
- Uncertainty-Aware Reliable Text Classification [21.517852608625127]
Deep neural networks have achieved strong predictive accuracy on classification tasks.
However, they tend to make over-confident predictions in real-world settings where domain shift and out-of-distribution examples exist.
We propose an inexpensive framework that adopts both auxiliary outliers and pseudo off-manifold samples to train the model with prior knowledge of a certain class.
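One common way to combine standard training with auxiliary outliers is an outlier-exposure-style regularizer; the loss below is a hedged stand-in, not necessarily the paper's exact objective:

```python
import torch
import torch.nn.functional as F

def uncertainty_aware_loss(logits_in, labels_in, logits_out, lam=0.5):
    # Cross-entropy on in-distribution text plus a term pushing predictions on
    # outlier / off-manifold samples toward the uniform (maximally uncertain) output.
    ce = F.cross_entropy(logits_in, labels_in)
    uniform_penalty = -F.log_softmax(logits_out, dim=1).mean()  # KL to uniform, up to a constant
    return ce + lam * uniform_penalty
```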
arXiv Detail & Related papers (2021-07-15T04:39:55Z)
- Pitfalls of In-Domain Uncertainty Estimation and Ensembling in Deep Learning [70.72363097550483]
In this study, we focus on in-domain uncertainty for image classification.
To provide more insight, we introduce the deep ensemble equivalent score (DEE).
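A hedged sketch of a deep-ensemble-equivalent style score under my reading of the idea, not the paper's exact definition: interpolate the ensemble's metric-vs-size curve and report the ensemble size whose metric matches the method under test:

```python
import numpy as np

def dee_score(method_metric, ensemble_sizes, ensemble_metrics):
    # Assumes the metric (e.g. log-likelihood) increases with ensemble size,
    # so np.interp sees an increasing x-axis.
    sizes = np.asarray(ensemble_sizes, dtype=float)
    metrics = np.asarray(ensemble_metrics, dtype=float)
    return float(np.interp(method_metric, metrics, sizes))
```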
arXiv Detail & Related papers (2020-02-15T23:28:19Z)
- On Last-Layer Algorithms for Classification: Decoupling Representation from Uncertainty Estimation [27.077741143188867]
We propose a family of algorithms which split the classification task into two stages: representation learning and uncertainty estimation.
We evaluate their performance in terms of selective classification (risk-coverage), and their ability to detect out-of-distribution samples.
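A short sketch of the selective-classification (risk-coverage) evaluation mentioned above: sort predictions by confidence and report the error rate retained at each coverage level:

```python
import numpy as np

def risk_coverage_curve(conf, correct):
    # Most confident predictions are kept first; risk is the running error rate.
    conf = np.asarray(conf)
    errors = 1.0 - np.asarray(correct, dtype=float)
    order = np.argsort(-conf)
    kept = np.arange(1, len(conf) + 1)
    risk = np.cumsum(errors[order]) / kept
    coverage = kept / len(conf)
    return coverage, risk
```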
arXiv Detail & Related papers (2020-01-22T15:08:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.