Understanding Uncertainty Sampling
- URL: http://arxiv.org/abs/2307.02719v3
- Date: Thu, 20 Jul 2023 17:17:54 GMT
- Title: Understanding Uncertainty Sampling
- Authors: Shang Liu, Xiaocheng Li
- Abstract summary: Uncertainty sampling is a prevalent active learning algorithm that sequentially queries annotations for the data samples the current prediction model is uncertain about.
We propose a notion of equivalent loss which depends on the used uncertainty measure and the original loss function.
We provide the first generalization bound for uncertainty sampling algorithms under both stream-based and pool-based settings.
- Score: 7.32527270949303
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Uncertainty sampling is a prevalent active learning algorithm that queries
sequentially the annotations of data samples which the current prediction model
is uncertain about. However, the usage of uncertainty sampling has been largely
heuristic: (i) There is no consensus on the proper definition of "uncertainty"
for a specific task under a specific loss; (ii) There is no theoretical
guarantee that prescribes a standard protocol to implement the algorithm, for
example, how to handle the sequentially arrived annotated data under the
framework of optimization algorithms such as stochastic gradient descent. In
this work, we systematically examine uncertainty sampling algorithms under both
stream-based and pool-based active learning. We propose a notion of equivalent
loss which depends on the used uncertainty measure and the original loss
function and establish that an uncertainty sampling algorithm essentially
optimizes against such an equivalent loss. The perspective verifies the
properness of existing uncertainty measures from two aspects: surrogate
property and loss convexity. Furthermore, we propose a new notion for designing
uncertainty measures called "loss as uncertainty". The idea is to use
the conditional expected loss given the features as the uncertainty measure.
Such an uncertainty measure has nice analytical properties and generality to
cover both classification and regression problems, which enable us to provide
the first generalization bound for uncertainty sampling algorithms under both
stream-based and pool-based settings, in the full generality of the underlying
model and problem. Lastly, we establish connections between certain variants of
the uncertainty sampling algorithms with risk-sensitive objectives and
distributional robustness, which can partly explain the advantage of
uncertainty sampling algorithms when the sample size is small.
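The stream-based setting described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the logistic model, the margin-style uncertainty score `1 - |2p - 1|`, the rule of querying a label with probability equal to that score, the learning rate, and the helper names `stream_uncertainty_sampling` and `make_stream`. With this query rule, the expected update at a point is the uncertainty times the gradient of the original loss, which is one way to read the claim that the algorithm effectively optimizes a reweighted, "equivalent" loss.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(theta, x):
    """Predicted probability of label 1 under a linear logistic model."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))


def uncertainty(theta, x):
    """Assumed uncertainty score in [0, 1]: 1 when p = 0.5, 0 when p is 0 or 1."""
    p = predict_proba(theta, x)
    return 1.0 - abs(2.0 * p - 1.0)


def stream_uncertainty_sampling(stream, dim, lr=0.5, seed=0):
    """Process (x, oracle) pairs one at a time.

    The label is requested with probability equal to the uncertainty score
    (an illustrative choice); each queried sample triggers one SGD step on
    the logistic loss. Unqueried samples are discarded.
    """
    rng = random.Random(seed)
    theta = [0.0] * dim
    n_queries = 0
    for x, oracle in stream:
        if rng.random() < uncertainty(theta, x):
            y = oracle()  # pay for one annotation
            n_queries += 1
            p = predict_proba(theta, x)
            # Gradient of the logistic loss with respect to theta is (p - y) * x.
            theta = [t - lr * (p - y) * xi for t, xi in zip(theta, x)]
    return theta, n_queries


def make_stream(n, seed=1):
    """Hypothetical toy data: one feature plus a constant bias input."""
    rng = random.Random(seed)
    for _ in range(n):
        x1 = rng.uniform(-3.0, 3.0)
        label = 1 if x1 > 0 else 0
        # The oracle reveals the label only when it is called.
        yield (x1, 1.0), (lambda y=label: y)


if __name__ == "__main__":
    theta, n_queries = stream_uncertainty_sampling(make_stream(2000), dim=2)
    print("queried", n_queries, "of 2000 labels; theta =", theta)
```

As the model becomes confident on points far from the decision boundary, their query probability shrinks, so only part of the stream is annotated. A pool-based variant would instead score a fixed pool and pick the most uncertain points; that is not shown here.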
Related papers
- Calibrated Probabilistic Forecasts for Arbitrary Sequences [58.54729945445505]
Real-world data streams can change unpredictably due to distribution shifts, feedback loops and adversarial actors.
We present a forecasting framework ensuring valid uncertainty estimates regardless of how data evolves.
arXiv Detail & Related papers (2024-09-27T21:46:42Z)
- Is Epistemic Uncertainty Faithfully Represented by Evidential Deep Learning Methods? [26.344949402398917]
This paper presents novel theoretical insights into evidential deep learning.
It highlights the difficulties in optimizing second-order loss functions.
It provides novel insights into issues of identifiability and convergence in second-order loss minimization.
arXiv Detail & Related papers (2024-02-14T10:07:05Z)
- A Data-Driven Measure of Relative Uncertainty for Misclassification Detection [25.947610541430013]
We introduce a data-driven measure of uncertainty relative to an observer for misclassification detection.
By learning patterns in the distribution of soft-predictions, our uncertainty measure can identify misclassified samples.
We demonstrate empirical improvements over multiple image classification tasks, outperforming state-of-the-art misclassification detection methods.
arXiv Detail & Related papers (2023-06-02T17:32:03Z)
- Integrating Uncertainty into Neural Network-based Speech Enhancement [27.868722093985006]
Supervised masking approaches in the time-frequency domain aim to employ deep neural networks to estimate a multiplicative mask to extract clean speech.
This leads to a single estimate for each input without any guarantees or measures of reliability.
We study the benefits of modeling uncertainty in clean speech estimation.
arXiv Detail & Related papers (2023-05-15T15:55:12Z)
- The Implicit Delta Method [61.36121543728134]
In this paper, we propose an alternative, the implicit delta method, which works by infinitesimally regularizing the training loss of uncertainty.
We show that the change in the evaluation due to regularization is consistent for the variance of the evaluation estimator, even when the infinitesimal change is approximated by a finite difference.
arXiv Detail & Related papers (2022-11-11T19:34:17Z)
- Uncertainty Quantification for Traffic Forecasting: A Unified Approach [21.556559649467328]
Uncertainty is an essential consideration for time series forecasting tasks.
In this work, we focus on quantifying the uncertainty of traffic forecasting.
We develop Deep Spatio-Temporal Uncertainty Quantification (DeepSTUQ), which can estimate both aleatoric and epistemic uncertainty.
arXiv Detail & Related papers (2022-08-11T15:21:53Z)
- Dense Uncertainty Estimation via an Ensemble-based Conditional Latent Variable Model [68.34559610536614]
We argue that the aleatoric uncertainty is an inherent attribute of the data and can only be correctly estimated with an unbiased oracle model.
We propose a new sampling and selection strategy at train time to approximate the oracle model for aleatoric uncertainty estimation.
Our results show that our solution achieves both accurate deterministic results and reliable uncertainty estimation.
arXiv Detail & Related papers (2021-11-22T08:54:10Z)
- CertainNet: Sampling-free Uncertainty Estimation for Object Detection [65.28989536741658]
Estimating the uncertainty of a neural network plays a fundamental role in safety-critical settings.
In this work, we propose a novel sampling-free uncertainty estimation method for object detection.
We call it CertainNet, and it is the first to provide separate uncertainties for each output signal: objectness, class, location and size.
arXiv Detail & Related papers (2021-10-04T17:59:31Z)
- Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation [99.92568326314667]
We propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation.
Our algorithm builds on the conditional normalized maximum likelihood (CNML) coding scheme, which has minimax optimal properties according to the minimum description length principle.
We demonstrate that ACNML compares favorably to a number of prior techniques for uncertainty estimation in terms of calibration on out-of-distribution inputs.
arXiv Detail & Related papers (2020-11-05T08:04:34Z)
- The Aleatoric Uncertainty Estimation Using a Separate Formulation with Virtual Residuals [51.71066839337174]
Existing methods can quantify the error in the target estimation, but they tend to underestimate it.
We propose a new separable formulation for the estimation of a signal and of its uncertainty, avoiding the effect of overfitting.
We demonstrate that the proposed method outperforms a state-of-the-art technique for signal and uncertainty estimation.
arXiv Detail & Related papers (2020-11-03T12:11:27Z)