On the Role of Unobserved Sequences on Sample-based Uncertainty Quantification for LLMs
- URL: http://arxiv.org/abs/2510.04439v1
- Date: Mon, 06 Oct 2025 02:14:48 GMT
- Title: On the Role of Unobserved Sequences on Sample-based Uncertainty Quantification for LLMs
- Authors: Lucie Kunitomo-Jacquin, Edison Marrese-Taylor, Ken Fukuda
- Abstract summary: Quantifying uncertainty in large language models (LLMs) is important for safety-critical applications. We advocate and experimentally show that the probability of unobserved sequences plays a crucial role.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Quantifying uncertainty in large language models (LLMs) is important for safety-critical applications because it helps spot incorrect answers, known as hallucinations. One major trend of uncertainty quantification methods is based on estimating the entropy of the distribution of the LLM's potential output sequences. This estimation is based on a set of output sequences and associated probabilities obtained by querying the LLM several times. In this paper, we advocate and experimentally show that the probability of unobserved sequences plays a crucial role, and we recommend that future research integrate it to enhance such LLM uncertainty quantification methods.
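The abstract's core idea can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's actual method: it estimates entropy from a hypothetical set of sampled output sequences and their model probabilities, and shows that the observed probability mass sums to less than one, leaving unobserved mass that a naive estimate ignores. The example sequences, probabilities, and the single-outcome correction are all illustrative assumptions.

```python
import math

def naive_entropy_estimate(samples):
    """Plug-in entropy estimate over observed sequences only.

    `samples` maps each distinct sampled output sequence to its probability
    under the model (e.g. exp of the summed token log-probs). Sequences
    that were never sampled contribute nothing to this estimate.
    """
    return -sum(p * math.log(p) for p in samples.values())

# Hypothetical sampling result: repeated queries yielded three distinct answers.
observed = {"Paris": 0.55, "Lyon": 0.15, "Marseille": 0.05}

observed_mass = sum(observed.values())   # 0.75: what the samples cover
unobserved_mass = 1.0 - observed_mass    # 0.25: mass of sequences never seen

h_naive = naive_entropy_estimate(observed)

# One simple illustrative correction: treat all unobserved mass as a single
# extra outcome. This lower-bounds the unobserved contribution, since that
# mass may in fact be spread over many sequences.
h_with_unobserved = h_naive - unobserved_mass * math.log(unobserved_mass)

assert h_with_unobserved > h_naive  # ignoring unobserved mass underestimates entropy
```

Lumping the unobserved mass into one outcome is only the crudest possible adjustment; the paper's point is precisely that how this mass is accounted for matters for sample-based uncertainty scores.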
Related papers
- Embedding Perturbation may Better Reflect the Uncertainty in LLM Reasoning [17.830165082895757]
Uncertainty Quantification (UQ) techniques are used to estimate a model's uncertainty about its outputs, indicating the likelihood that those outputs may be problematic. For LLM reasoning tasks, it is essential to estimate the uncertainty not only for the final answer, but also for the intermediate steps of the reasoning, as this can enable more fine-grained and targeted interventions. Our study reveals that an LLM's incorrect reasoning steps tend to contain tokens which are highly sensitive to perturbations on the preceding token embeddings.
arXiv Detail & Related papers (2026-02-02T18:27:26Z) - ESI: Epistemic Uncertainty Quantification via Semantic-preserving Intervention for Large Language Models [23.44710972442814]
Uncertainty Quantification (UQ) is a promising approach to improve model reliability, yet the uncertainty of Large Language Models (LLMs) is non-trivial. We propose a novel grey-box uncertainty quantification method that measures the variation in model outputs before and after semantic-preserving intervention.
arXiv Detail & Related papers (2025-10-15T02:46:43Z) - CLUE: Concept-Level Uncertainty Estimation for Large Language Models [49.92690111618016]
We propose CLUE, a novel framework for Concept-Level Uncertainty Estimation for Large Language Models (LLMs).
We leverage LLMs to convert output sequences into concept-level representations, breaking down sequences into individual concepts and measuring the uncertainty of each concept separately.
We conduct experiments to demonstrate that CLUE can provide more interpretable uncertainty estimation results compared with sentence-level uncertainty.
arXiv Detail & Related papers (2024-09-04T18:27:12Z) - Unconditional Truthfulness: Learning Unconditional Uncertainty of Large Language Models [104.55763564037831]
We train a regression model that leverages attention maps, probabilities on the current generation step, and recurrently computed uncertainty scores from previously generated tokens. Our evaluation shows that the proposed method is highly effective for selective generation, achieving substantial improvements over rivaling unsupervised and supervised approaches.
arXiv Detail & Related papers (2024-08-20T09:42:26Z) - Question Rephrasing for Quantifying Uncertainty in Large Language Models: Applications in Molecular Chemistry Tasks [4.167519875804914]
We present a novel Question Rephrasing technique to evaluate the input uncertainty of large language models (LLMs).
This technique is integrated with sampling methods that measure the output uncertainty of LLMs, thereby offering a more comprehensive uncertainty assessment.
arXiv Detail & Related papers (2024-08-07T12:38:23Z) - Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode.
We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
arXiv Detail & Related papers (2024-06-05T16:35:30Z) - To Believe or Not to Believe Your LLM [51.2579827761899]
We explore uncertainty quantification in large language models (LLMs).
We derive an information-theoretic metric that enables reliable detection of when only epistemic uncertainty is large.
We conduct a series of experiments which demonstrate the advantage of our formulation.
arXiv Detail & Related papers (2024-06-04T17:58:18Z) - Kernel Language Entropy: Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities [79.9629927171974]
Uncertainty in Large Language Models (LLMs) is crucial for applications where safety and reliability are important.
We propose Kernel Language Entropy (KLE), a novel method for uncertainty estimation in white- and black-box LLMs.
arXiv Detail & Related papers (2024-05-30T12:42:05Z) - Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling [69.83976050879318]
In large language models (LLMs), identifying sources of uncertainty is an important step toward improving reliability, trustworthiness, and interpretability.
In this paper, we introduce an uncertainty decomposition framework for LLMs, called input clarification ensembling.
Our approach generates a set of clarifications for the input, feeds them into an LLM, and ensembles the corresponding predictions.
arXiv Detail & Related papers (2023-11-15T05:58:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.