Related papers: On Subjective Uncertainty Quantification and Calibration in Natural Language Generation

On Subjective Uncertainty Quantification and Calibration in Natural Language Generation

URL: http://arxiv.org/abs/2406.05213v1
Date: Fri, 7 Jun 2024 18:54:40 GMT
Title: On Subjective Uncertainty Quantification and Calibration in Natural Language Generation
Authors: Ziyu Wang, Chris Holmes,
Abstract summary: Large language models often involve the generation of free-form responses, in which case uncertainty quantification becomes challenging. This work addresses these challenges from a perspective of Bayesian decision theory, starting from the assumption that our utility is characterized by a similarity measure. We demonstrate the proposed methods on question answering and machine translation tasks, where they extract broadly meaningful uncertainty estimates.
Score: 2.622066970118316
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Applications of large language models often involve the generation of free-form responses, in which case uncertainty quantification becomes challenging. This is due to the need to identify task-specific uncertainties (e.g., about the semantics) which appears difficult to define in general cases. This work addresses these challenges from a perspective of Bayesian decision theory, starting from the assumption that our utility is characterized by a similarity measure that compares a generated response with a hypothetical true response. We discuss how this assumption enables principled quantification of the model's subjective uncertainty and its calibration. We further derive a measure for epistemic uncertainty, based on a missing data perspective and its characterization as an excess risk. The proposed measures can be applied to black-box language models. We demonstrate the proposed methods on question answering and machine translation tasks, where they extract broadly meaningful uncertainty estimates from GPT and Gemini models and quantify their calibration.

Related papers

Token-Level Uncertainty Estimation for Large Language Model Reasoning [24.56760223952017]
Large Language Models (LLMs) have demonstrated impressive capabilities, but their output quality remains inconsistent across various application scenarios.<n>We propose a token-level uncertainty estimation framework to enable LLMs to self-assess and self-improve their generation quality in mathematical reasoning.
arXiv Detail & Related papers (2025-05-16T22:47:32Z)
Conformal Linguistic Calibration: Trading-off between Factuality and Specificity [41.45862052156885]
We propose a unified framework that connects abstention and linguistic calibration through the lens of linguistic pragmatics. We describe an implementation that allows for controlling the level of imprecision in model responses. Our approach enables fine-tuning models to perform uncertainty-aware adaptive claim rewriting, offering a controllable balance between factuality and specificity.
arXiv Detail & Related papers (2025-02-26T13:01:49Z)
A statistically consistent measure of semantic uncertainty using Language Models [3.4933610074113464]
We propose a novel measure of semantic uncertainty, semantic spectral entropy, that is statistically consistent under mild assumptions.<n>This measure is implemented through a straightforward algorithm that relies solely on standard, pretrained language models.
arXiv Detail & Related papers (2025-02-01T17:55:58Z)
DiverseAgentEntropy: Quantifying Black-Box LLM Uncertainty through Diverse Perspectives and Multi-Agent Interaction [53.803276766404494]
Existing methods, which gauge a model's uncertainty through evaluating self-consistency in responses to the original query, do not always capture true uncertainty. We propose a novel method, DiverseAgentEntropy, for evaluating a model's uncertainty using multi-agent interaction. Our method offers a more accurate prediction of the model's reliability and further detects hallucinations, outperforming other self-consistency-based methods.
arXiv Detail & Related papers (2024-12-12T18:52:40Z)
A Review of Bayesian Uncertainty Quantification in Deep Probabilistic Image Segmentation [0.0]
Advancements in image segmentation play an integral role within the greater scope of Deep Learning-based computer vision. Uncertainty quantification has been extensively studied within this context, enabling expression of model ignorance (epistemic uncertainty) or data ambiguity (aleatoric uncertainty) to prevent uninformed decision making. This work provides a comprehensive overview of probabilistic segmentation by discussing fundamental concepts in uncertainty that govern advancements in the field and the application to various tasks.
arXiv Detail & Related papers (2024-11-25T13:26:09Z)
On Uncertainty In Natural Language Processing [2.5076643086429993]
This thesis studies how uncertainty in natural language processing can be characterized from a linguistic, statistical and neural perspective. We propose a method for calibrated sampling in natural language generation based on non-exchangeable conformal prediction. Lastly, we develop an approach to quantify confidence in large black-box language models using auxiliary predictors.
arXiv Detail & Related papers (2024-10-04T14:08:02Z)
Unconditional Truthfulness: Learning Conditional Dependency for Uncertainty Quantification of Large Language Models [96.43562963756975]
We train a regression model, which target variable is the gap between the conditional and the unconditional generation confidence. We use this learned conditional dependency model to modulate the uncertainty of the current generation step based on the uncertainty of the previous step.
arXiv Detail & Related papers (2024-08-20T09:42:26Z)
To Believe or Not to Believe Your LLM [51.2579827761899]
We explore uncertainty quantification in large language models (LLMs) We derive an information-theoretic metric that allows to reliably detect when only epistemic uncertainty is large. We conduct a series of experiments which demonstrate the advantage of our formulation.
arXiv Detail & Related papers (2024-06-04T17:58:18Z)
Uncertainty Quantification for In-Context Learning of Large Language Models [52.891205009620364]
In-context learning has emerged as a groundbreaking ability of Large Language Models (LLMs) We propose a novel formulation and corresponding estimation method to quantify both types of uncertainties. The proposed method offers an unsupervised way to understand the prediction of in-context learning in a plug-and-play fashion.
arXiv Detail & Related papers (2024-02-15T18:46:24Z)
Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling [69.83976050879318]
In large language models (LLMs), identifying sources of uncertainty is an important step toward improving reliability, trustworthiness, and interpretability. In this paper, we introduce an uncertainty decomposition framework for LLMs, called input clarification ensembling. Our approach generates a set of clarifications for the input, feeds them into an LLM, and ensembles the corresponding predictions.
arXiv Detail & Related papers (2023-11-15T05:58:35Z)
Improving the Reliability of Large Language Models by Leveraging Uncertainty-Aware In-Context Learning [76.98542249776257]
Large-scale language models often face the challenge of "hallucination" We introduce an uncertainty-aware in-context learning framework to empower the model to enhance or reject its output in response to uncertainty.
arXiv Detail & Related papers (2023-10-07T12:06:53Z)
Uncertainty-Aware Natural Language Inference with Stochastic Weight Averaging [8.752563431501502]
This paper introduces Bayesian uncertainty modeling using Weight Averaging-Gaussian (SWAG) in Natural Language Understanding (NLU) tasks. We demonstrate the effectiveness of the method in terms of prediction accuracy and correlation with human annotation disagreements.
arXiv Detail & Related papers (2023-04-10T17:37:23Z)
Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation [37.37606905433334]
We show that measuring uncertainty in natural language is challenging because of "semantic equivalence" We introduce semantic entropy -- an entropy which incorporates linguistic invariances created by shared meanings. Our method is unsupervised, uses only a single model, and requires no modifications to off-the-shelf language models.
arXiv Detail & Related papers (2023-02-19T20:10:07Z)
Dense Uncertainty Estimation via an Ensemble-based Conditional Latent Variable Model [68.34559610536614]
We argue that the aleatoric uncertainty is an inherent attribute of the data and can only be correctly estimated with an unbiased oracle model. We propose a new sampling and selection strategy at train time to approximate the oracle model for aleatoric uncertainty estimation. Our results show that our solution achieves both accurate deterministic results and reliable uncertainty estimation.
arXiv Detail & Related papers (2021-11-22T08:54:10Z)

This list is automatically generated from the titles and abstracts of the papers in this site.