Mapping from Meaning: Addressing the Miscalibration of Prompt-Sensitive Language Models
- URL: http://arxiv.org/abs/2510.17028v1
- Date: Sun, 19 Oct 2025 22:28:57 GMT
- Title: Mapping from Meaning: Addressing the Miscalibration of Prompt-Sensitive Language Models
- Authors: Kyle Cox, Jiawei Xu, Yikun Han, Rong Xu, Tianhao Li, Chi-Yang Hsu, Tianlong Chen, Walter Gerych, Ying Ding
- Abstract summary: We study prompt sensitivity in large language models (LLMs). We show that sampling across the semantic "concept space" with paraphrasing perturbations improves uncertainty calibration without compromising accuracy.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An interesting behavior in large language models (LLMs) is prompt sensitivity. When provided with different but semantically equivalent versions of the same prompt, models may produce very different distributions of answers. This suggests that the uncertainty reflected in a model's output distribution for one prompt may not capture the model's uncertainty about the meaning of the prompt. We model prompt sensitivity as a type of generalization error, and show that sampling across the semantic "concept space" with paraphrasing perturbations improves uncertainty calibration without compromising accuracy. Additionally, we introduce a new metric for uncertainty decomposition in black-box LLMs that improves upon entropy-based decomposition by modeling semantic continuities in natural language generation. We show that this decomposition metric can be used to quantify how much LLM uncertainty is attributable to prompt sensitivity. Our work introduces a new way to improve uncertainty calibration in prompt-sensitive language models, and provides evidence that some LLMs fail to exhibit consistent general reasoning about the meanings of their inputs.
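The calibration recipe in the abstract can be pictured with a short sketch. Everything below is illustrative: the toy model, the prompt set, and the function names (`sample_answer`, `pooled_answer_distribution`) are hypothetical stand-ins for a black-box LLM API, not the authors' code. The point is only the mechanics of pooling sampled answers across semantically equivalent prompts, so the resulting distribution reflects uncertainty about the meaning rather than about one surface form.

```python
import random
from collections import Counter

# Toy stand-in for a prompt-sensitive black-box LLM (hypothetical, for
# illustration only): each surface form of the same question induces a
# different answer distribution.
TOY_MODEL = {
    "Is water wet?": {"yes": 0.9, "no": 0.1},
    "Would you say water is wet?": {"yes": 0.5, "no": 0.5},
    "Water is wet, true or false?": {"yes": 0.7, "no": 0.3},
}

def sample_answer(prompt: str) -> str:
    """Draw one answer from the toy model's output distribution for `prompt`."""
    dist = TOY_MODEL[prompt]
    return random.choices(list(dist), weights=list(dist.values()))[0]

def pooled_answer_distribution(paraphrases: list[str], n_per_prompt: int = 200) -> Counter:
    """Pool sampled answers across semantically equivalent prompts.

    Sampling across paraphrases (the "concept space") rather than a single
    surface form is the calibration idea described in the abstract.
    """
    counts: Counter = Counter()
    for p in paraphrases:
        for _ in range(n_per_prompt):
            counts[sample_answer(p)] += 1
    return counts

if __name__ == "__main__":
    pooled = pooled_answer_distribution(list(TOY_MODEL))
    total = sum(pooled.values())
    print({a: round(c / total, 2) for a, c in pooled.items()})
```

In practice, `sample_answer` would call the model under study and the paraphrases would come from an LLM rewriter; the pooled answer frequencies can then be scored with any standard calibration measure.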
Related papers
- Efficient semantic uncertainty quantification in language models via diversity-steered sampling
We introduce a diversity-steered sampler that discourages semantically redundant outputs during decoding. The key idea is to inject a continuous semantic-similarity penalty into the model's proposal distribution. Being modular and requiring no gradient access to the base LLM, the framework promises to serve as a drop-in enhancement for uncertainty estimation.
arXiv Detail & Related papers (2025-10-24T10:06:21Z)
- Simple Yet Effective: An Information-Theoretic Approach to Multi-LLM Uncertainty Quantification
MUSE is a simple information-theoretic method to identify and aggregate well-calibrated subsets of large language models. Experiments on binary prediction tasks demonstrate improved calibration and predictive performance compared to single-model and naïve ensemble baselines.
arXiv Detail & Related papers (2025-07-09T19:13:25Z)
- Improving Uncertainty Quantification in Large Language Models via Semantic Embeddings
Accurately quantifying uncertainty in large language models (LLMs) is crucial for their reliable deployment.
Current state-of-the-art methods for measuring semantic uncertainty in LLMs rely on strict bidirectional entailment criteria.
We propose a novel approach that leverages semantic embeddings to achieve smoother and more robust estimation of semantic uncertainty.
arXiv Detail & Related papers (2024-10-30T04:41:46Z)
- Cycles of Thought: Measuring LLM Confidence through Stable Explanations
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode.
We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
arXiv Detail & Related papers (2024-06-05T16:35:30Z)
- Kernel Language Entropy: Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities
Uncertainty quantification in Large Language Models (LLMs) is crucial for applications where safety and reliability are important.
We propose Kernel Language Entropy (KLE), a novel method for uncertainty estimation in white- and black-box LLMs.
arXiv Detail & Related papers (2024-05-30T12:42:05Z)
- Uncertainty Quantification for In-Context Learning of Large Language Models
In-context learning has emerged as a groundbreaking ability of Large Language Models (LLMs).
We propose a novel formulation and corresponding estimation method to quantify both types of uncertainties.
The proposed method offers an unsupervised way to understand the prediction of in-context learning in a plug-and-play fashion.
arXiv Detail & Related papers (2024-02-15T18:46:24Z)
- Distinguishing the Knowable from the Unknowable with Language Models
In the absence of ground-truth probabilities, we explore a setting where, in order to disentangle a given model's uncertainty, a significantly larger model stands in as a proxy for the ground truth.
We show that small linear probes trained on the embeddings of frozen, pretrained models accurately predict when larger models will be more confident at the token level.
We propose a fully unsupervised method that achieves non-trivial accuracy on the same task.
arXiv Detail & Related papers (2024-02-05T22:22:49Z)
- Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling
In large language models (LLMs), identifying sources of uncertainty is an important step toward improving reliability, trustworthiness, and interpretability.
In this paper, we introduce an uncertainty decomposition framework for LLMs, called input clarification ensembling.
Our approach generates a set of clarifications for the input, feeds them into an LLM, and ensembles the corresponding predictions; a generic sketch of this style of entropy-based decomposition appears after this list.
arXiv Detail & Related papers (2023-11-15T05:58:35Z)
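Several of the entries above, and the decomposition metric in the abstract, build on the same textbook entropy decomposition: pool answer distributions across paraphrases or clarifications of one input, then split total entropy into an expected within-prompt term and a between-prompt mutual-information term. The sketch below is that generic decomposition only, not any single paper's exact metric; the function names and example distributions are made up for illustration.

```python
import math

def entropy(p: dict[str, float]) -> float:
    """Shannon entropy (in nats) of a discrete answer distribution."""
    return -sum(q * math.log(q) for q in p.values() if q > 0)

def decompose(per_prompt: list[dict[str, float]]) -> tuple[float, float, float]:
    """Split total uncertainty over an ensemble of equivalent prompts.

    Given answer distributions p(y | prompt_i) for paraphrases (or
    clarifications) of one input:
      total   = H(mean_i p(y | prompt_i))   # entropy of the pooled mixture
      within  = mean_i H(p(y | prompt_i))   # expected per-prompt entropy
      between = total - within              # mutual information, >= 0
    The between-prompt term is the part these papers attribute to prompt
    sensitivity / epistemic uncertainty.
    """
    n = len(per_prompt)
    answers = {a for p in per_prompt for a in p}
    mean = {a: sum(p.get(a, 0.0) for p in per_prompt) / n for a in answers}
    total = entropy(mean)
    within = sum(entropy(p) for p in per_prompt) / n
    return total, within, total - within

# Made-up example: two paraphrases whose answer distributions disagree,
# so the between-prompt (mutual information) term is clearly nonzero.
print(decompose([{"yes": 0.9, "no": 0.1}, {"yes": 0.4, "no": 0.6}]))
```

Per its abstract, the headline paper argues that a hard entropy over discrete answers misses semantic continuity between generations; its proposed metric smooths this decomposition semantically, which the sketch above does not attempt.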