A Cautionary Tale of Decorrelating Theory Uncertainties
- URL: http://arxiv.org/abs/2109.08159v1
- Date: Thu, 16 Sep 2021 18:00:01 GMT
- Title: A Cautionary Tale of Decorrelating Theory Uncertainties
- Authors: Aishik Ghosh and Benjamin Nachman
- Abstract summary: We will discuss techniques to train machine learning classifiers that are independent of a given feature.
We will examine theory uncertainties, which typically do not have a statistical origin.
We will provide explicit examples of two-point (fragmentation modeling) and continuous (higher-order corrections) uncertainties where decorrelating significantly reduces the apparent uncertainty.
- Score: 0.5076419064097732
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A variety of techniques have been proposed to train machine learning
classifiers that are independent of a given feature. While this can be an
essential technique for enabling background estimation, it may also be useful
for reducing uncertainties. We carefully examine theory uncertainties, which
typically do not have a statistical origin. We will provide explicit examples
of two-point (fragmentation modeling) and continuous (higher-order corrections)
uncertainties where decorrelating significantly reduces the apparent
uncertainty while the actual uncertainty is much larger. These results suggest
that caution should be taken when using decorrelation for these types of
uncertainties as long as we do not have a complete decomposition into
statistically meaningful components.
Related papers
- One step closer to unbiased aleatoric uncertainty estimation [71.55174353766289]
We propose a new estimation method by actively de-noising the observed data.
By conducting a broad range of experiments, we demonstrate that our proposed approach provides a much closer approximation to the actual data uncertainty than the standard method.
arXiv Detail & Related papers (2023-12-16T14:59:11Z) - A Data-Driven Measure of Relative Uncertainty for Misclassification
Detection [25.947610541430013]
We introduce a data-driven measure of uncertainty relative to an observer for misclassification detection.
By learning patterns in the distribution of soft-predictions, our uncertainty measure can identify misclassified samples.
We demonstrate empirical improvements over multiple image classification tasks, outperforming state-of-the-art misclassification detection methods.
arXiv Detail & Related papers (2023-06-02T17:32:03Z) - Uncertainty Estimates of Predictions via a General Bias-Variance
Decomposition [7.811916700683125]
We introduce a bias-variance decomposition for proper scores, giving rise to the Bregman Information as the variance term.
We showcase the practical relevance of this decomposition on several downstream tasks, including model ensembles and confidence regions.
arXiv Detail & Related papers (2022-10-21T21:24:37Z) - On the Difficulty of Epistemic Uncertainty Quantification in Machine
Learning: The Case of Direct Uncertainty Estimation through Loss Minimisation [8.298716599039501]
Uncertainty quantification has received increasing attention in machine learning.
The latter refers to the learner's (lack of) knowledge and appears to be especially difficult to measure and quantify.
We show that loss minimisation does not work for second-order predictors.
arXiv Detail & Related papers (2022-03-11T17:26:05Z) - Dense Uncertainty Estimation via an Ensemble-based Conditional Latent
Variable Model [68.34559610536614]
We argue that the aleatoric uncertainty is an inherent attribute of the data and can only be correctly estimated with an unbiased oracle model.
We propose a new sampling and selection strategy at train time to approximate the oracle model for aleatoric uncertainty estimation.
Our results show that our solution achieves both accurate deterministic results and reliable uncertainty estimation.
arXiv Detail & Related papers (2021-11-22T08:54:10Z) - Multi-label Chaining with Imprecise Probabilities [0.0]
We present two different strategies to extend the classical multi-label chaining approach to handle imprecise probability estimates.
The main reasons one could have for using such estimations are (1) to make cautious predictions when a high uncertainty is detected in the chaining and (2) to make better precise predictions by avoiding biases caused in early decisions in the chaining.
Our experimental results on missing labels, which investigate how reliable these predictions are in both approaches, indicate that our approaches produce relevant cautiousness on those hard-to-predict instances where the precise models fail.
arXiv Detail & Related papers (2021-07-15T16:43:31Z) - Exploring Uncertainty in Deep Learning for Construction of Prediction
Intervals [27.569681578957645]
We explore the uncertainty in deep learning to construct prediction intervals.
We design a special loss function, which enables us to learn uncertainty without uncertainty label.
Our method correlates the construction of prediction intervals with the uncertainty estimation.
arXiv Detail & Related papers (2021-04-27T02:58:20Z) - DEUP: Direct Epistemic Uncertainty Prediction [56.087230230128185]
Epistemic uncertainty is part of out-of-sample prediction error due to the lack of knowledge of the learner.
We propose a principled approach for directly estimating epistemic uncertainty by learning to predict generalization error and subtracting an estimate of aleatoric uncertainty.
arXiv Detail & Related papers (2021-02-16T23:50:35Z) - Don't Just Blame Over-parametrization for Over-confidence: Theoretical
Analysis of Calibration in Binary Classification [58.03725169462616]
We show theoretically that over-parametrization is not the only reason for over-confidence.
We prove that logistic regression is inherently over-confident, in the realizable, under-parametrized setting.
Perhaps surprisingly, we also show that over-confidence is not always the case.
arXiv Detail & Related papers (2021-02-15T21:38:09Z) - Discriminative Jackknife: Quantifying Uncertainty in Deep Learning via
Higher-Order Influence Functions [121.10450359856242]
We develop a frequentist procedure that utilizes influence functions of a model's loss functional to construct a jackknife (or leave-one-out) estimator of predictive confidence intervals.
The DJ satisfies (1) and (2), is applicable to a wide range of deep learning models, is easy to implement, and can be applied in a post-hoc fashion without interfering with model training or compromising its accuracy.
arXiv Detail & Related papers (2020-06-29T13:36:52Z) - Uncertainty quantification for nonconvex tensor completion: Confidence
intervals, heteroscedasticity and optimality [92.35257908210316]
We study the problem of estimating a low-rank tensor given incomplete and corrupted observations.
We find that it attains unimprovable rates $ell-2$ accuracy.
arXiv Detail & Related papers (2020-06-15T17:47:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.