Explaining Prediction Uncertainty of Pre-trained Language Models by
Detecting Uncertain Words in Inputs
- URL: http://arxiv.org/abs/2201.03742v1
- Date: Tue, 11 Jan 2022 02:04:50 GMT
- Title: Explaining Prediction Uncertainty of Pre-trained Language Models by
Detecting Uncertain Words in Inputs
- Authors: Hanjie Chen, Yangfeng Ji
- Abstract summary: This paper takes a step further by explaining uncertain predictions of post-calibrated pre-trained language models.
We adapt two perturbation-based post-hoc interpretation methods, Leave-one-out and Sampling Shapley, to identify words in inputs that cause the uncertainty in predictions.
- Score: 21.594361495948316
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Estimating the predictive uncertainty of pre-trained language models is
important for increasing their trustworthiness in NLP. Although many previous
works focus on quantifying prediction uncertainty, there is little work on
explaining the uncertainty. This paper takes a step further by explaining
uncertain predictions of post-calibrated pre-trained language models. We adapt
two perturbation-based post-hoc interpretation methods, Leave-one-out and
Sampling Shapley, to identify words in inputs that cause the uncertainty in
predictions. We test the proposed methods on BERT and RoBERTa with three tasks:
sentiment classification, natural language inference, and paraphrase
identification, in both in-domain and out-of-domain settings. Experiments show
that both methods consistently capture words in inputs that cause prediction
uncertainty.
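As a rough illustration of how the two attribution methods can be adapted to uncertainty, the sketch below scores each input word by its effect on predictive entropy. It is a minimal sketch, not the authors' released code: `predict_probs` is a hypothetical callable standing in for a calibrated BERT/RoBERTa classifier, predictive entropy is assumed as the uncertainty measure, and word deletion (rather than masking) is only one of several possible perturbation choices.

```python
# Hedged sketch: Leave-one-out and Sampling Shapley adapted to attribute
# predictive uncertainty (entropy) to individual input words.
# `predict_probs` is a hypothetical callable mapping a list of word strings
# to a class-probability vector from a calibrated classifier.
import math
import random
from typing import Callable, List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Predictive entropy, used here as the uncertainty measure."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def leave_one_out_uncertainty(
    tokens: List[str],
    predict_probs: Callable[[List[str]], Sequence[float]],
) -> List[float]:
    """Score each word by how much deleting it changes predictive entropy.

    A positive score suggests the word contributes to the model's uncertainty:
    removing it makes the prediction more confident.
    """
    base = entropy(predict_probs(tokens))
    scores = []
    for i in range(len(tokens)):
        reduced = tokens[:i] + tokens[i + 1:]
        scores.append(base - entropy(predict_probs(reduced)))
    return scores


def sampling_shapley_uncertainty(
    tokens: List[str],
    predict_probs: Callable[[List[str]], Sequence[float]],
    num_samples: int = 200,
    seed: int = 0,
) -> List[float]:
    """Monte-Carlo (permutation-sampling) Shapley values w.r.t. entropy.

    For each sampled permutation, a word's marginal contribution is the change
    in entropy when it is added to the words preceding it in that permutation.
    """
    rng = random.Random(seed)
    n = len(tokens)
    scores = [0.0] * n
    for _ in range(num_samples):
        order = list(range(n))
        rng.shuffle(order)
        included: List[int] = []
        # Empty-input baseline; in practice a fully masked input could be used.
        prev = entropy(predict_probs([]))
        for idx in order:
            included.append(idx)
            included.sort()  # keep the original word order within the subset
            cur = entropy(predict_probs([tokens[j] for j in included]))
            scores[idx] += cur - prev
            prev = cur
    return [s / num_samples for s in scores]
```

Words with the largest positive Leave-one-out or Shapley scores are the candidates for "uncertain words": the ones whose removal most reduces the model's predictive entropy.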
Related papers
- Efficient Normalized Conformal Prediction and Uncertainty Quantification
for Anti-Cancer Drug Sensitivity Prediction with Deep Regression Forests [0.0]
Conformal Prediction has emerged as a promising method to pair machine learning models with prediction intervals.
We propose a method to estimate the uncertainty of each sample by calculating the variance obtained from a Deep Regression Forest.
arXiv Detail & Related papers (2024-02-21T19:09:53Z)
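For context, a generic normalized split-conformal recipe looks like the sketch below: residuals on a calibration set are scaled by a per-sample difficulty estimate (here, a standard deviation such as a regression forest could supply), and the calibrated quantile then widens or narrows each prediction interval. The function names and the choice of difficulty score are illustrative assumptions, not the cited paper's exact procedure.

```python
# Generic normalized split-conformal sketch (an assumption about the cited
# method's shape, not its implementation).
import math
from typing import Sequence, Tuple


def calibrate_normalized_conformal(
    y_true: Sequence[float],
    y_pred: Sequence[float],
    y_std: Sequence[float],   # per-sample uncertainty, e.g. sqrt of forest variance
    alpha: float = 0.1,       # 1 - target coverage
) -> float:
    """Calibrated quantile q of the normalized residuals |y - yhat| / std."""
    scores = sorted(abs(t - p) / max(s, 1e-8)
                    for t, p, s in zip(y_true, y_pred, y_std))
    n = len(scores)
    # Finite-sample conformal quantile index (0-based, clamped).
    k = min(n - 1, math.ceil((n + 1) * (1 - alpha)) - 1)
    return scores[k]


def predict_interval(y_pred: float, y_std: float, q: float) -> Tuple[float, float]:
    """Per-sample interval, wider where the estimated uncertainty is larger."""
    return y_pred - q * y_std, y_pred + q * y_std
```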
- Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling [69.83976050879318]
In large language models (LLMs), identifying sources of uncertainty is an important step toward improving reliability, trustworthiness, and interpretability.
In this paper, we introduce an uncertainty decomposition framework for LLMs, called input clarification ensembling.
Our approach generates a set of clarifications for the input, feeds them into an LLM, and ensembles the corresponding predictions.
arXiv Detail & Related papers (2023-11-15T05:58:35Z)
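A hedged sketch of the ensembling step described in that entry: generate several clarifications of an ambiguous input, run the classifier on each, and split the ensemble's entropy into a part that remains after clarification and a part attributable to input ambiguity. `generate_clarifications` and `predict_probs` are hypothetical stand-ins, and the entropy-based decomposition shown is one standard choice rather than necessarily the paper's exact formulation.

```python
# Hedged sketch of input clarification ensembling with an entropy split.
import math
from typing import Callable, List, Sequence


def entropy(p: Sequence[float]) -> float:
    return -sum(x * math.log(x) for x in p if x > 0)


def clarification_uncertainty(
    text: str,
    generate_clarifications: Callable[[str], List[str]],  # e.g. prompt an LLM
    predict_probs: Callable[[str], Sequence[float]],       # class probabilities
) -> dict:
    clarifications = generate_clarifications(text)
    dists = [predict_probs(c) for c in clarifications]
    # Ensemble (mixture) distribution over classes.
    mean = [sum(d[i] for d in dists) / len(dists) for i in range(len(dists[0]))]
    total = entropy(mean)                                   # total uncertainty
    expected = sum(entropy(d) for d in dists) / len(dists)  # after clarification
    return {
        "total": total,
        "after_clarification": expected,
        "due_to_input_ambiguity": total - expected,          # mutual-information gap
    }
```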
- Quantification of Predictive Uncertainty via Inference-Time Sampling [57.749601811982096]
We propose a post-hoc sampling strategy for estimating predictive uncertainty accounting for data ambiguity.
The method can generate different plausible outputs for a given input and does not assume parametric forms of predictive distributions.
arXiv Detail & Related papers (2023-08-03T12:43:21Z)
- Conformalizing Machine Translation Evaluation [9.89901717499058]
Several uncertainty estimation methods have been recently proposed for machine translation evaluation.
We show that the majority of them tend to underestimate model uncertainty, and as a result they often produce misleading confidence intervals that do not cover the ground truth.
We propose as an alternative the use of conformal prediction, a distribution-free method to obtain confidence intervals with a theoretically established guarantee on coverage.
arXiv Detail & Related papers (2023-06-09T19:36:18Z)
- CUE: An Uncertainty Interpretation Framework for Text Classifiers Built on Pre-Trained Language Models [28.750894873827068]
We propose a novel framework, called CUE, which aims to interpret uncertainties inherent in the predictions of PLM-based models.
By comparing the difference in predictive uncertainty between the perturbed and the original text representations, we are able to identify the latent dimensions responsible for uncertainty.
arXiv Detail & Related papers (2023-06-06T11:37:46Z)
- Integrating Uncertainty into Neural Network-based Speech Enhancement [27.868722093985006]
Supervised masking approaches in the time-frequency domain aim to employ deep neural networks to estimate a multiplicative mask to extract clean speech.
This leads to a single estimate for each input without any guarantees or measures of reliability.
We study the benefits of modeling uncertainty in clean speech estimation.
arXiv Detail & Related papers (2023-05-15T15:55:12Z)
- Toward Reliable Human Pose Forecasting with Uncertainty [51.628234388046195]
We develop an open-source library for human pose forecasting, including multiple models, supporting several datasets.
We model two types of uncertainty in the problem to increase performance and convey better trust.
arXiv Detail & Related papers (2023-04-13T17:56:08Z)
- Reliability-Aware Prediction via Uncertainty Learning for Person Image Retrieval [51.83967175585896]
UAL aims at providing reliability-aware predictions by considering data uncertainty and model uncertainty simultaneously.
Data uncertainty captures the "noise" inherent in the sample, while model uncertainty depicts the model's confidence in the sample's prediction.
arXiv Detail & Related papers (2022-10-24T17:53:20Z)
- Dense Uncertainty Estimation [62.23555922631451]
In this paper, we investigate neural networks and uncertainty estimation techniques to achieve both accurate deterministic prediction and reliable uncertainty estimation.
We work on two types of uncertainty estimation solutions, namely ensemble-based methods and generative model-based methods, and explain their pros and cons when using them in fully/semi/weakly-supervised frameworks.
arXiv Detail & Related papers (2021-10-13T01:23:48Z)
- DEUP: Direct Epistemic Uncertainty Prediction [56.087230230128185]
Epistemic uncertainty is part of out-of-sample prediction error due to the lack of knowledge of the learner.
We propose a principled approach for directly estimating epistemic uncertainty by learning to predict generalization error and subtracting an estimate of aleatoric uncertainty.
arXiv Detail & Related papers (2021-02-16T23:50:35Z)
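The idea summarized in that entry can be sketched as a two-step estimate: fit a secondary model that predicts the main model's out-of-sample loss, then subtract an aleatoric estimate. The regressor choice and the aleatoric estimator below are placeholders, not DEUP's actual setup.

```python
# Minimal sketch of the "predicted generalization error minus aleatoric" idea.
from typing import Callable, Sequence
from sklearn.ensemble import GradientBoostingRegressor


def fit_error_predictor(features, observed_losses):
    """Learn to predict the main model's out-of-sample loss from input features."""
    model = GradientBoostingRegressor()
    model.fit(features, observed_losses)
    return model


def epistemic_uncertainty(
    x_features: Sequence[float],
    error_predictor,
    aleatoric_estimate: Callable[[Sequence[float]], float],
) -> float:
    """Predicted total error minus the irreducible (aleatoric) part, floored at 0."""
    total = float(error_predictor.predict([list(x_features)])[0])
    return max(0.0, total - aleatoric_estimate(x_features))
```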
- Getting a CLUE: A Method for Explaining Uncertainty Estimates [30.367995696223726]
We propose a novel method for interpreting uncertainty estimates from differentiable probabilistic models.
Our method, Counterfactual Latent Uncertainty Explanations (CLUE), indicates how to change an input, while keeping it on the data manifold.
arXiv Detail & Related papers (2020-06-11T21:53:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.