On Subjective Uncertainty Quantification and Calibration in Natural Language Generation
- URL: http://arxiv.org/abs/2406.05213v1
- Date: Fri, 7 Jun 2024 18:54:40 GMT
- Title: On Subjective Uncertainty Quantification and Calibration in Natural Language Generation
- Authors: Ziyu Wang, Chris Holmes
- Abstract summary: Applications of large language models often involve the generation of free-form responses, in which case uncertainty quantification becomes challenging.
This work addresses these challenges from a perspective of Bayesian decision theory, starting from the assumption that our utility is characterized by a similarity measure.
We demonstrate the proposed methods on question answering and machine translation tasks, where they extract broadly meaningful uncertainty estimates.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Applications of large language models often involve the generation of free-form responses, in which case uncertainty quantification becomes challenging. This is due to the need to identify task-specific uncertainties (e.g., about the semantics) which appear difficult to define in general cases. This work addresses these challenges from the perspective of Bayesian decision theory, starting from the assumption that our utility is characterized by a similarity measure that compares a generated response with a hypothetical true response. We discuss how this assumption enables principled quantification of the model's subjective uncertainty and its calibration. We further derive a measure for epistemic uncertainty, based on a missing data perspective and its characterization as an excess risk. The proposed measures can be applied to black-box language models. We demonstrate the proposed methods on question answering and machine translation tasks, where they extract broadly meaningful uncertainty estimates from GPT and Gemini models and quantify their calibration.
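The similarity-based utility described in the abstract can be sketched as follows. This is an illustrative approximation, not the authors' code: it samples several responses from a black-box model, treats the other samples as stand-ins for the hypothetical true response, and reports one minus the best achievable expected similarity. The `jaccard_similarity` function is a toy token-overlap stand-in for whatever similarity measure the utility actually uses.

```python
# Hypothetical sketch of similarity-utility subjective uncertainty.
# Assumption: responses have already been sampled from the model.

def jaccard_similarity(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two responses (toy measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

def subjective_uncertainty(samples: list[str]) -> float:
    """1 - max_i mean_{j != i} similarity(y_i, y_j): low when samples agree."""
    n = len(samples)
    if n < 2:
        return 0.0
    best_expected_utility = max(
        sum(jaccard_similarity(samples[i], samples[j])
            for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    )
    return 1.0 - best_expected_utility

agreeing = ["paris", "paris", "paris is the capital"]
disagreeing = ["paris", "london", "berlin"]
print(subjective_uncertainty(agreeing) < subjective_uncertainty(disagreeing))  # True
```

Agreement among samples drives the uncertainty toward zero; total disagreement drives it toward one, which is the behavior the decision-theoretic framing predicts.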
Related papers
- On the Calibration of Epistemic Uncertainty: Principles, Paradoxes and Conflictual Loss [3.8248583585487155]
Evidential uncertainty is produced by Deep Ensembles, Bayesian Deep Networks, or Evidential Deep Networks.
Although measurable, this form of uncertainty is difficult to calibrate on an objective basis.
We propose a regularization function for deep ensembles, called conflictual loss, that meets the above requirements.
arXiv Detail & Related papers (2024-07-16T23:21:28Z) - Certainly Uncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness [106.52630978891054]
We present a taxonomy of uncertainty specific to vision-language AI systems.
We also introduce a new metric, confidence-weighted accuracy, which correlates well with both accuracy and calibration error.
arXiv Detail & Related papers (2024-07-02T04:23:54Z) - To Believe or Not to Believe Your LLM [51.2579827761899]
We explore uncertainty quantification in large language models (LLMs).
We derive an information-theoretic metric that reliably detects when only epistemic uncertainty is large.
We conduct a series of experiments which demonstrate the advantage of our formulation.
arXiv Detail & Related papers (2024-06-04T17:58:18Z) - Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling [69.83976050879318]
In large language models (LLMs), identifying sources of uncertainty is an important step toward improving reliability, trustworthiness, and interpretability.
In this paper, we introduce an uncertainty decomposition framework for LLMs, called input clarification ensembling.
Our approach generates a set of clarifications for the input, feeds them into an LLM, and ensembles the corresponding predictions.
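The ensembling step above admits a standard entropy decomposition, sketched here as an illustration rather than the authors' implementation: each clarification yields a predictive distribution over answers, total predictive entropy splits into a mean per-clarification entropy (aleatoric part) and the remaining mutual-information term (epistemic part). The distributions below are made-up toy values.

```python
import math

def entropy(p: list[float]) -> float:
    """Shannon entropy in nats, skipping zero-probability entries."""
    return -sum(x * math.log(x) for x in p if x > 0)

def decompose(per_clarification_probs: list[list[float]]):
    """Split predictive entropy over an ensemble of clarifications.

    per_clarification_probs: one distribution over the same answer set
    per clarification of the input.
    """
    k = len(per_clarification_probs)
    n_answers = len(per_clarification_probs[0])
    mean_p = [sum(dist[i] for dist in per_clarification_probs) / k
              for i in range(n_answers)]
    total = entropy(mean_p)                         # entropy of the ensemble mean
    aleatoric = sum(entropy(d) for d in per_clarification_probs) / k
    epistemic = total - aleatoric                   # mutual information, >= 0
    return total, aleatoric, epistemic

# Clarifications disagree confidently -> epistemic uncertainty dominates.
total, aleatoric, epistemic = decompose([[0.9, 0.1], [0.1, 0.9]])
print(round(epistemic, 3))  # 0.368
```

When every clarification produces the same distribution, the epistemic term vanishes and all uncertainty is attributed to the data.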
arXiv Detail & Related papers (2023-11-15T05:58:35Z) - Improving the Reliability of Large Language Models by Leveraging Uncertainty-Aware In-Context Learning [76.98542249776257]
Large-scale language models often face the challenge of "hallucination".
We introduce an uncertainty-aware in-context learning framework to empower the model to enhance or reject its output in response to uncertainty.
arXiv Detail & Related papers (2023-10-07T12:06:53Z) - Deep Evidential Learning for Bayesian Quantile Regression [3.6294895527930504]
It is desirable to have accurate uncertainty estimation from a single deterministic forward-pass model.
This paper proposes a deep Bayesian quantile regression model that can estimate the quantiles of a continuous target distribution without the Gaussian assumption.
arXiv Detail & Related papers (2023-08-21T11:42:16Z) - Conformal Prediction with Large Language Models for Multi-Choice Question Answering [7.049780432343948]
We find that the uncertainty estimates from conformal prediction are tightly correlated with prediction accuracy.
This work contributes towards more trustworthy and reliable usage of large language models in safety-critical situations.
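The standard split-conformal recipe behind such results can be sketched as follows; this is the generic procedure, not the paper's exact implementation, and the calibration scores below are made-up toy values. Nonconformity is taken as one minus the model's probability of the correct option on a held-out calibration set.

```python
import math

def conformal_quantile(cal_scores: list[float], alpha: float) -> float:
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    return sorted(cal_scores)[min(k, n) - 1]

def prediction_set(option_probs: list[float], qhat: float) -> list[int]:
    """Keep every option whose nonconformity score 1 - p is <= qhat."""
    return [i for i, p in enumerate(option_probs) if 1 - p <= qhat]

# Toy calibration scores: 1 - P(correct option) on held-out questions.
cal_scores = [0.1, 0.2, 0.15, 0.3, 0.25, 0.05, 0.4, 0.35, 0.2, 0.1]
qhat = conformal_quantile(cal_scores, alpha=0.1)

# A confident test question yields a singleton prediction set.
print(prediction_set([0.7, 0.2, 0.05, 0.05], qhat))  # [0]
```

The set size itself is the uncertainty signal: confident, accurate predictions give small sets, which is the correlation the paper reports.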
arXiv Detail & Related papers (2023-05-28T15:26:10Z) - Uncertainty-Aware Natural Language Inference with Stochastic Weight Averaging [8.752563431501502]
This paper introduces Bayesian uncertainty modeling using Stochastic Weight Averaging-Gaussian (SWAG) in Natural Language Understanding (NLU) tasks.
We demonstrate the effectiveness of the method in terms of prediction accuracy and correlation with human annotation disagreements.
arXiv Detail & Related papers (2023-04-10T17:37:23Z) - Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation [37.37606905433334]
We show that measuring uncertainty in natural language is challenging because of "semantic equivalence".
We introduce semantic entropy -- an entropy which incorporates linguistic invariances created by shared meanings.
Our method is unsupervised, uses only a single model, and requires no modifications to off-the-shelf language models.
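A simplified sketch of the semantic-entropy idea: cluster sampled answers that share a meaning, then take the entropy of the cluster distribution. Real implementations cluster via bidirectional entailment with an NLI model; the hypothetical `same_meaning` below uses normalized exact match as a crude stand-in.

```python
import math

def same_meaning(a: str, b: str) -> bool:
    """Crude equivalence check; a stand-in for bidirectional entailment."""
    return a.strip().lower().rstrip(".") == b.strip().lower().rstrip(".")

def semantic_entropy(samples: list[str]) -> float:
    """Entropy (nats) over clusters of semantically equivalent samples."""
    clusters: list[list[str]] = []
    for s in samples:
        for c in clusters:
            if same_meaning(s, c[0]):
                c.append(s)
                break
        else:
            clusters.append([s])
    n = len(samples)
    return -sum((len(c) / n) * math.log(len(c) / n) for c in clusters)

# Surface forms differ only trivially -> they collapse to one cluster.
print(semantic_entropy(["Paris.", "paris", "Paris"]) == 0.0)  # True
print(round(semantic_entropy(["Paris", "London"]), 3))        # 0.693 (= ln 2)
```

Token-level entropy would penalize the paraphrases in the first example; clustering by meaning removes that artifact, which is the linguistic invariance the paper exploits.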
arXiv Detail & Related papers (2023-02-19T20:10:07Z) - Dense Uncertainty Estimation via an Ensemble-based Conditional Latent
Variable Model [68.34559610536614]
We argue that the aleatoric uncertainty is an inherent attribute of the data and can only be correctly estimated with an unbiased oracle model.
We propose a new sampling and selection strategy at train time to approximate the oracle model for aleatoric uncertainty estimation.
Our results show that our solution achieves both accurate deterministic results and reliable uncertainty estimation.
arXiv Detail & Related papers (2021-11-22T08:54:10Z) - Dense Uncertainty Estimation [62.23555922631451]
In this paper, we investigate neural networks and uncertainty estimation techniques to achieve both accurate deterministic prediction and reliable uncertainty estimation.
We work on two types of uncertainty estimation solutions, namely ensemble-based methods and generative model-based methods, and explain their pros and cons when using them in fully-, semi-, and weakly-supervised frameworks.
arXiv Detail & Related papers (2021-10-13T01:23:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.