Shapley Uncertainty in Natural Language Generation
- URL: http://arxiv.org/abs/2507.21406v1
- Date: Tue, 29 Jul 2025 00:26:33 GMT
- Title: Shapley Uncertainty in Natural Language Generation
- Authors: Meilin Zhu, Gaojie Jin, Xiaowei Huang, Lijun Zhang
- Abstract summary: In question-answering tasks, determining when to trust the outputs is crucial to the alignment of large language models. We propose a Shapley-based uncertainty metric that captures the continuous nature of semantic relationships.
- Score: 18.180786418608246
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In question-answering tasks, determining when to trust the outputs is crucial to the alignment of large language models (LLMs). Kuhn et al. (2023) introduce semantic entropy as a measure of uncertainty that incorporates linguistic invariances among outputs with the same meaning. It relies primarily on setting a threshold to decide semantic equivalence. We propose a more nuanced framework that extends beyond such thresholding by developing a Shapley-based uncertainty metric that captures the continuous nature of semantic relationships. We establish three fundamental properties that characterize valid uncertainty metrics and prove that our Shapley uncertainty satisfies them. Through extensive experiments, we demonstrate that our Shapley uncertainty predicts LLM performance on question-answering and other datasets more accurately than comparable baseline measures.
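Semantic entropy clusters sampled answers by bidirectional entailment and computes entropy over the clusters, which requires a hard equivalence threshold. The abstract does not spell out the Shapley construction, but a minimal illustrative sketch, under the assumption that each sampled answer is a "player" and a coalition's value is the total pairwise semantic similarity within it, might look as follows (the `shapley_values` helper, the toy similarity matrix, and the normalization step are hypothetical illustrations, not the paper's actual method):

```python
from itertools import combinations
from math import comb, log

def shapley_values(n, value):
    """Exact Shapley values for n players; `value` maps a frozenset of
    player indices to the coalition's worth."""
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(len(others) + 1):
            for S in combinations(others, k):
                S = frozenset(S)
                weight = 1.0 / (n * comb(n - 1, k))
                phi[i] += weight * (value(S | {i}) - value(S))
    return phi

# Toy pairwise semantic-similarity matrix for 3 sampled answers
# (hypothetical values): answers 0 and 1 are near-paraphrases,
# answer 2 is semantically distant.
sim = [[1.0, 0.9, 0.1],
       [0.9, 1.0, 0.1],
       [0.1, 0.1, 1.0]]

def coalition_similarity(S):
    # Characteristic function: total pairwise similarity inside the coalition.
    return sum(sim[i][j] for i, j in combinations(sorted(S), 2))

phi = shapley_values(3, coalition_similarity)

# Normalize the (nonnegative) Shapley values into a distribution and
# take its entropy as a graded uncertainty score.
p = [max(v, 0.0) for v in phi]
Z = sum(p)
p = [v / Z for v in p]
entropy = -sum(q * log(q) for q in p if q > 0)  # ~0.93 nats here
```

Because answers 0 and 1 are near-paraphrases, they share most of the credit (phi[0] = phi[1] > phi[2]), so the resulting entropy reflects graded semantic similarity rather than a hard equivalence cut.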
Related papers
- Towards Reliable LLM-based Robot Planning via Combined Uncertainty Estimation [68.106428321492]
Large language models (LLMs) demonstrate advanced reasoning abilities, enabling robots to understand natural language instructions and generate high-level plans with appropriate grounding. LLM hallucinations present a significant challenge, often leading to overconfident yet potentially misaligned or unsafe plans. We present Combined Uncertainty estimation for Reliable Embodied planning (CURE), which decomposes uncertainty into epistemic and intrinsic uncertainty, each estimated separately.
arXiv Detail & Related papers (2025-10-09T10:26:58Z)
- Fine-Grained Uncertainty Decomposition in Large Language Models: A Spectral Approach [32.528332797693984]
We introduce Spectral Uncertainty, a novel approach to quantifying and decomposing uncertainties in Large Language Models. Unlike existing baseline methods, our approach incorporates a fine-grained representation of semantic similarity. Empirical evaluations demonstrate that Spectral Uncertainty outperforms state-of-the-art methods in estimating both aleatoric and total uncertainty.
arXiv Detail & Related papers (2025-09-26T12:39:10Z)
- Variational Uncertainty Decomposition for In-Context Learning [16.986925734554447]
We introduce a variational uncertainty decomposition framework for in-context learning. We optimise auxiliary queries as probes to obtain an upper bound on the aleatoric uncertainty of an in-context learning procedure.
arXiv Detail & Related papers (2025-09-02T13:53:09Z)
- Token-Level Uncertainty Estimation for Large Language Model Reasoning [24.56760223952017]
Large Language Models (LLMs) have demonstrated impressive capabilities, but their output quality remains inconsistent across various application scenarios. We propose a token-level uncertainty estimation framework to enable LLMs to self-assess and self-improve their generation quality in mathematical reasoning.
arXiv Detail & Related papers (2025-05-16T22:47:32Z)
- SConU: Selective Conformal Uncertainty in Large Language Models [59.25881667640868]
We propose a novel approach termed Selective Conformal Uncertainty (SConU). We develop two conformal p-values that are instrumental in determining whether a given sample deviates from the uncertainty distribution of the calibration set at a specific manageable risk level. Our approach not only facilitates rigorous management of miscoverage rates across both single-domain and interdisciplinary contexts, but also enhances the efficiency of predictions.
arXiv Detail & Related papers (2025-04-19T03:01:45Z)
- Uncertainty Distillation: Teaching Language Models to Express Semantic Confidence [16.311538811237536]
Large language models (LLMs) are increasingly used for factual question-answering. For these verbalized expressions of uncertainty to be meaningful, they should reflect the error rates at the expressed level of confidence. We propose a simple procedure, uncertainty distillation, to teach an LLM to express calibrated semantic confidences.
arXiv Detail & Related papers (2025-03-18T21:29:29Z)
- Probabilistic Modeling of Disparity Uncertainty for Robust and Efficient Stereo Matching [61.73532883992135]
We propose a new uncertainty-aware stereo matching framework. We adopt Bayes risk as the measurement of uncertainty and use it to separately estimate data and model uncertainty.
arXiv Detail & Related papers (2024-12-24T23:28:20Z)
- Improving Uncertainty Quantification in Large Language Models via Semantic Embeddings [11.33157177182775]
Accurately quantifying uncertainty in large language models (LLMs) is crucial for their reliable deployment.
Current state-of-the-art methods for measuring semantic uncertainty in LLMs rely on strict bidirectional entailment criteria.
We propose a novel approach that leverages semantic embeddings to achieve smoother and more robust estimation of semantic uncertainty.
arXiv Detail & Related papers (2024-10-30T04:41:46Z)
- Unconditional Truthfulness: Learning Conditional Dependency for Uncertainty Quantification of Large Language Models [96.43562963756975]
We train a regression model whose target variable is the gap between the conditional and the unconditional generation confidence.
We use this learned conditional dependency model to modulate the uncertainty of the current generation step based on the uncertainty of the previous step.
arXiv Detail & Related papers (2024-08-20T09:42:26Z)
- On Subjective Uncertainty Quantification and Calibration in Natural Language Generation [2.622066970118316]
Large language models often involve the generation of free-form responses, in which case uncertainty quantification becomes challenging.
This work addresses these challenges from the perspective of Bayesian decision theory.
We discuss how this assumption enables principled quantification of the model's subjective uncertainty and its calibration.
The proposed methods can be applied to black-box language models.
arXiv Detail & Related papers (2024-06-07T18:54:40Z)
- Kernel Language Entropy: Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities [79.9629927171974]
Uncertainty in Large Language Models (LLMs) is crucial for applications where safety and reliability are important.
We propose Kernel Language Entropy (KLE), a novel method for uncertainty estimation in white- and black-box LLMs.
arXiv Detail & Related papers (2024-05-30T12:42:05Z)
- Uncertainty in Language Models: Assessment through Rank-Calibration [65.10149293133846]
Language Models (LMs) have shown promising performance in natural language generation.
It is crucial to correctly quantify their uncertainty in responding to given inputs.
We develop a novel and practical framework, termed Rank-Calibration, to assess uncertainty and confidence measures for LMs.
arXiv Detail & Related papers (2024-04-04T02:31:05Z)
- Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling [69.83976050879318]
In large language models (LLMs), identifying sources of uncertainty is an important step toward improving reliability, trustworthiness, and interpretability.
In this paper, we introduce an uncertainty decomposition framework for LLMs, called input clarification ensembling.
Our approach generates a set of clarifications for the input, feeds them into an LLM, and ensembles the corresponding predictions.
arXiv Detail & Related papers (2023-11-15T05:58:35Z)
- Improving the Reliability of Large Language Models by Leveraging Uncertainty-Aware In-Context Learning [76.98542249776257]
Large-scale language models often face the challenge of "hallucination".
We introduce an uncertainty-aware in-context learning framework to empower the model to enhance or reject its output in response to uncertainty.
arXiv Detail & Related papers (2023-10-07T12:06:53Z)
- Ambiguity Meets Uncertainty: Investigating Uncertainty Estimation for Word Sense Disambiguation [5.55197751179213]
Existing supervised methods treat word sense disambiguation (WSD) as a classification task and have achieved remarkable performance.
This paper extensively studies uncertainty estimation (UE) on the benchmark designed for WSD.
We examine how well the model with the selected UE score captures data and model uncertainty on well-designed test scenarios, and find that it reflects data uncertainty satisfactorily but underestimates model uncertainty.
arXiv Detail & Related papers (2023-05-22T15:18:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences arising from its use.