Efficient Nearest Neighbor based Uncertainty Estimation for Natural Language Processing Tasks
- URL: http://arxiv.org/abs/2407.02138v1
- Date: Tue, 2 Jul 2024 10:33:31 GMT
- Title: Efficient Nearest Neighbor based Uncertainty Estimation for Natural Language Processing Tasks
- Authors: Wataru Hashimoto, Hidetaka Kamigaito, Taro Watanabe
- Abstract summary: $k$-Nearest Neighbor Uncertainty Estimation ($k$NN-UE) is an uncertainty estimation method that uses both the distances to the neighbors and the label-existence ratio among them.
Our experiments show that the proposed method outperforms the baselines and recent density-based methods in confidence calibration, selective prediction, and out-of-distribution detection.
- Score: 26.336947440529713
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Trustworthy prediction in Deep Neural Networks (DNNs), including Pre-trained Language Models (PLMs), is important for safety-critical applications in the real world. However, DNNs often suffer from poor uncertainty estimation, such as miscalibration. Approaches that require multiple stochastic inferences can mitigate this problem, but their expensive inference cost makes them impractical. In this study, we propose $k$-Nearest Neighbor Uncertainty Estimation ($k$NN-UE), an uncertainty estimation method that uses both the distances to the neighbors and the label-existence ratio among them. Experiments on sentiment analysis, natural language inference, and named entity recognition show that the proposed method outperforms the baselines and recent density-based methods in confidence calibration, selective prediction, and out-of-distribution detection. Moreover, our analyses indicate that introducing dimension reduction or approximate nearest neighbor search, inspired by recent $k$NN-LM studies, reduces the inference overhead without significantly degrading estimation performance when they are combined appropriately.
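Below is a minimal sketch of the kind of nearest-neighbor confidence signal the abstract describes, combining a softmax confidence with neighbor distances and the label-existence ratio; the encoder features, weighting scheme, and hyperparameters are illustrative assumptions, not the paper's exact formulation.
```python
# Illustrative sketch (not the paper's exact formulation): scale the softmax
# confidence by (i) a term based on distances to the k nearest cached training
# representations and (ii) the fraction of those neighbors that carry the
# predicted label (the "label-existence ratio").
import numpy as np

def knn_confidence(query_feat, train_feats, train_labels,
                   predicted_label, softmax_conf, k=16, temperature=1.0):
    # Brute-force Euclidean distances; an ANN index (e.g., FAISS) and/or
    # dimension reduction would replace this step to cut inference overhead.
    dists = np.linalg.norm(train_feats - query_feat, axis=1)
    nn_idx = np.argsort(dists)[:k]
    nn_dists, nn_labels = dists[nn_idx], train_labels[nn_idx]

    distance_term = np.exp(-nn_dists / temperature).mean()      # closer -> higher
    label_ratio = float(np.mean(nn_labels == predicted_label))  # label-existence ratio
    return softmax_conf * distance_term * label_ratio

# Toy usage with random vectors standing in for PLM encoder features.
rng = np.random.default_rng(0)
train_feats = rng.normal(size=(1000, 64))
train_labels = rng.integers(0, 3, size=1000)
score = knn_confidence(rng.normal(size=64), train_feats, train_labels,
                       predicted_label=1, softmax_conf=0.9)
print(f"adjusted confidence: {score:.3f}")
```
Replacing the brute-force search with an approximate nearest neighbor index or reducing the feature dimensionality, as the abstract suggests, lowers the per-query cost of this step.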
Related papers
- Confidence Intervals and Simultaneous Confidence Bands Based on Deep Learning [0.36832029288386137]
We provide a valid non-parametric bootstrap method that correctly disentangles data uncertainty from the noise inherent in the adopted optimization algorithm.
The proposed ad-hoc method can be easily integrated into any deep neural network without interfering with the training process.
arXiv Detail & Related papers (2024-06-20T05:51:37Z) - Scalable Subsampling Inference for Deep Neural Networks [0.0]
A non-asymptotic error bound has been developed to measure the performance of the fully connected DNN estimator.
A non-random subsampling technique, scalable subsampling, is applied to construct a "subagged" DNN estimator.
The proposed confidence/prediction intervals appear to work well in finite samples.
arXiv Detail & Related papers (2024-05-14T02:11:38Z) - Uncertainty in Language Models: Assessment through Rank-Calibration [65.10149293133846]
Language Models (LMs) have shown promising performance in natural language generation.
It is crucial to correctly quantify their uncertainty in responding to given inputs.
We develop a novel and practical framework, termed $Rank$-$Calibration$, to assess uncertainty and confidence measures for LMs.
arXiv Detail & Related papers (2024-04-04T02:31:05Z) - One step closer to unbiased aleatoric uncertainty estimation [71.55174353766289]
We propose a new estimation method by actively de-noising the observed data.
By conducting a broad range of experiments, we demonstrate that our proposed approach provides a much closer approximation to the actual data uncertainty than the standard method.
arXiv Detail & Related papers (2023-12-16T14:59:11Z) - Uncertainty-Aware Lidar Place Recognition in Novel Environments [11.30020653282995]
We investigate the task of uncertainty-aware lidar place recognition.
Each predicted place must have an associated uncertainty that can be used to identify and reject incorrect predictions.
We introduce a novel evaluation protocol and present the first comprehensive benchmark for this task.
arXiv Detail & Related papers (2022-10-04T04:06:44Z) - On Attacking Out-Domain Uncertainty Estimation in Deep Neural Networks [11.929914721626849]
We show that state-of-the-art uncertainty estimation algorithms could fail catastrophically under our proposed adversarial attack.
In particular, we target out-domain uncertainty estimation.
arXiv Detail & Related papers (2022-10-03T23:33:38Z) - NUQ: Nonparametric Uncertainty Quantification for Deterministic Neural Networks [151.03112356092575]
We show a principled way to measure the uncertainty of a classifier's predictions based on the Nadaraya-Watson nonparametric estimate of the conditional label distribution.
We demonstrate the strong performance of the method in uncertainty estimation tasks on a variety of real-world image datasets.
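For context, a minimal sketch of a Nadaraya-Watson style estimate of the conditional label distribution, with predictive entropy as the uncertainty score; the RBF kernel and bandwidth are illustrative assumptions rather than NUQ's exact recipe.
```python
# Sketch of a Nadaraya-Watson (kernel-weighted) estimate of p(y | x) from
# stored training embeddings; predictive entropy serves as the uncertainty.
# The RBF kernel and bandwidth are illustrative choices, not NUQ's exact setup.
import numpy as np

def nw_label_distribution(query, feats, labels, num_classes, bandwidth=1.0):
    sq_dists = ((feats - query) ** 2).sum(axis=1)
    weights = np.exp(-sq_dists / (2.0 * bandwidth ** 2))  # RBF kernel weights
    probs = np.array([weights[labels == c].sum() for c in range(num_classes)])
    return probs / (probs.sum() + 1e-12)

def predictive_entropy(probs):
    return float(-(probs * np.log(probs + 1e-12)).sum())

rng = np.random.default_rng(0)
feats, labels = rng.normal(size=(500, 32)), rng.integers(0, 4, size=500)
p = nw_label_distribution(rng.normal(size=32), feats, labels, num_classes=4)
print(p, predictive_entropy(p))
```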
arXiv Detail & Related papers (2022-02-07T12:30:45Z) - On the Minimal Adversarial Perturbation for Deep Neural Networks with Provable Estimation Error [65.51757376525798]
The existence of adversarial perturbations has opened an interesting research line on provable robustness.
However, no provable results have been presented to estimate and bound the error incurred.
This paper proposes two lightweight strategies to find the minimal adversarial perturbation.
The obtained results show that the proposed strategies approximate the theoretical distance and robustness for samples close to the classification boundary, leading to provable guarantees against any adversarial attack.
arXiv Detail & Related papers (2022-01-04T16:40:03Z) - Dense Uncertainty Estimation via an Ensemble-based Conditional Latent Variable Model [68.34559610536614]
We argue that the aleatoric uncertainty is an inherent attribute of the data and can only be correctly estimated with an unbiased oracle model.
We propose a new sampling and selection strategy at train time to approximate the oracle model for aleatoric uncertainty estimation.
Our results show that our solution achieves both accurate deterministic results and reliable uncertainty estimation.
arXiv Detail & Related papers (2021-11-22T08:54:10Z) - Dense Uncertainty Estimation [62.23555922631451]
In this paper, we investigate neural networks and uncertainty estimation techniques to achieve both accurate deterministic prediction and reliable uncertainty estimation.
We work on two types of uncertainty estimation solutions, namely ensemble-based methods and generative-model-based methods, and explain their pros and cons when using them in fully-, semi-, and weakly-supervised frameworks.
arXiv Detail & Related papers (2021-10-13T01:23:48Z) - CertainNet: Sampling-free Uncertainty Estimation for Object Detection [65.28989536741658]
Estimating the uncertainty of a neural network plays a fundamental role in safety-critical settings.
In this work, we propose a novel sampling-free uncertainty estimation method for object detection.
We call it CertainNet, and it is the first to provide separate uncertainties for each output signal: objectness, class, location and size.
arXiv Detail & Related papers (2021-10-04T17:59:31Z) - Adversarial Attack for Uncertainty Estimation: Identifying Critical Regions in Neural Networks [0.0]
We propose a novel method to capture data points near the decision boundary of a neural network, which are often associated with a specific type of uncertainty.
Uncertainty estimates are derived from perturbations of the inputs, unlike previous studies that perturb the model's parameters.
We show that the proposed method significantly outperforms other methods and carries less risk in capturing model uncertainty in machine learning.
arXiv Detail & Related papers (2021-07-15T21:30:26Z) - The Aleatoric Uncertainty Estimation Using a Separate Formulation with Virtual Residuals [51.71066839337174]
Existing methods can quantify the error in the target estimation, but they tend to underestimate it.
We propose a new separable formulation for the estimation of a signal and of its uncertainty, avoiding the effect of overfitting.
We demonstrate that the proposed method outperforms a state-of-the-art technique for signal and uncertainty estimation.
arXiv Detail & Related papers (2020-11-03T12:11:27Z) - CoinDICE: Off-Policy Confidence Interval Estimation [107.86876722777535]
We study high-confidence behavior-agnostic off-policy evaluation in reinforcement learning.
We show in a variety of benchmarks that the confidence interval estimates are tighter and more accurate than existing methods.
arXiv Detail & Related papers (2020-10-22T12:39:11Z) - Probabilistic Neighbourhood Component Analysis: Sample Efficient
Uncertainty Estimation in Deep Learning [25.8227937350516]
We show that the uncertainty estimation capability of state-of-the-art BNNs and Deep Ensemble models degrades significantly when the amount of training data is small.
We propose a probabilistic generalization of the popular sample-efficient non-parametric kNN approach.
Our approach enables deep kNN to accurately quantify underlying uncertainties in its prediction.
arXiv Detail & Related papers (2020-07-18T21:36:31Z) - Revisiting One-vs-All Classifiers for Predictive Uncertainty and Out-of-Distribution Detection in Neural Networks [22.34227625637843]
We investigate how the parametrization of the probabilities in discriminative classifiers affects the uncertainty estimates.
We show that one-vs-all formulations can improve calibration on image classification tasks.
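A minimal sketch of the contrast between softmax and a one-vs-all parametrization, assuming independent per-class sigmoids; the specific losses and training details of the paper are not reproduced here.
```python
# Softmax forces probabilities to sum to 1, so an input far from every class
# still yields a clear "winner"; independent per-class sigmoids (one-vs-all)
# can assign low probability to every class, which helps calibration and OOD.
import numpy as np

def softmax_probs(logits):
    z = np.exp(logits - logits.max())
    return z / z.sum()

def one_vs_all_probs(logits):
    return 1.0 / (1.0 + np.exp(-logits))  # per-class sigmoid, not normalized

logits = np.array([-4.0, -3.5, -5.0])      # every class looks unlikely
print(softmax_probs(logits))               # sums to 1, masking that all classes are unlikely
print(one_vs_all_probs(logits))            # uniformly small -> usable as "none of the above"
```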
arXiv Detail & Related papers (2020-07-10T01:55:02Z) - An Empirical Evaluation on Robustness and Uncertainty of Regularization Methods [43.25086015530892]
Deep neural networks (DNNs) behave fundamentally differently from humans.
They can easily change predictions when small corruptions such as blur are applied to the input.
They also produce confident predictions on out-of-distribution samples (an improper uncertainty measure).
arXiv Detail & Related papers (2020-03-09T01:15:22Z) - Estimating Uncertainty Intervals from Collaborating Networks [15.467208581231848]
We propose a novel method to capture predictive distributions in regression by defining two neural networks with two distinct loss functions.
Specifically, one network approximates the cumulative distribution function, and the second network approximates its inverse.
We benchmark CN against several common approaches on two synthetic and six real-world datasets, including forecasting A1c values in diabetic patients from electronic health records.
arXiv Detail & Related papers (2020-02-12T20:10:27Z)
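As a final illustration, a minimal sketch of the two-network idea from the Collaborating Networks entry above: one network scores the conditional CDF and a second approximates its inverse, from which prediction intervals are read off; the layer sizes and the omitted training losses are placeholders, not the paper's specification.
```python
# Sketch of the Collaborating Networks idea: network g approximates the
# conditional CDF F(y | x); network f approximates its inverse, so prediction
# intervals come from f(x, q) at chosen quantile levels q. The losses that tie
# g and f together are omitted; shapes and layer sizes are placeholders.
import torch
import torch.nn as nn

class CDFNet(nn.Module):          # g(x, y) -> estimated P(Y <= y | x)
    def __init__(self, x_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(x_dim + 1, 64), nn.ReLU(),
                                 nn.Linear(64, 1), nn.Sigmoid())
    def forward(self, x, y):
        return self.net(torch.cat([x, y], dim=-1))

class QuantileNet(nn.Module):     # f(x, q) -> estimated y with F(y | x) = q
    def __init__(self, x_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(x_dim + 1, 64), nn.ReLU(),
                                 nn.Linear(64, 1))
    def forward(self, x, q):
        return self.net(torch.cat([x, q], dim=-1))

x = torch.randn(8, 10)                        # a batch of inputs
g, f = CDFNet(x_dim=10), QuantileNet(x_dim=10)
p0 = g(x, torch.zeros(8, 1))                  # estimated P(Y <= 0 | x)
lo = f(x, torch.full((8, 1), 0.05))           # 5th percentile
hi = f(x, torch.full((8, 1), 0.95))           # 95th percentile -> 90% interval
```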