Distance Is All You Need: Radial Dispersion for Uncertainty Estimation in Large Language Models
- URL: http://arxiv.org/abs/2512.04351v1
- Date: Thu, 04 Dec 2025 00:53:49 GMT
- Title: Distance Is All You Need: Radial Dispersion for Uncertainty Estimation in Large Language Models
- Authors: Manh Nguyen, Sunil Gupta, Hung Le
- Abstract summary: We introduce the Radial Dispersion Score (RDS), a simple, parameter-free, fully model-agnostic uncertainty metric. RDS naturally extends to per-sample scoring, enabling applications such as best-of-$N$ selection and confidence-based filtering.
- Score: 13.41454380481593
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detecting when large language models (LLMs) are uncertain is critical for building reliable systems, yet existing methods are overly complicated, relying on brittle semantic clustering or internal states. We introduce \textbf{Radial Dispersion Score (RDS)}, a simple, parameter-free, fully model-agnostic uncertainty metric that measures the radial dispersion of sampled generations in embedding space. A lightweight probability-weighted variant further incorporates the model's own token probabilities when available, outperforming nine strong baselines. Moreover, RDS naturally extends to per-sample scoring, enabling applications such as best-of-$N$ selection and confidence-based filtering. Across four challenging free-form QA datasets and multiple LLMs, our metrics achieve state-of-the-art hallucination detection and answer selection performance, while remaining robust and scalable with respect to sample size and embedding choice.
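The abstract does not give the exact formula, but a minimal sketch of one plausible reading of RDS — the (optionally probability-weighted) mean distance of sampled-generation embeddings from their centroid, with per-sample radii reused for best-of-$N$ selection — might look like this (the function names, the Euclidean distance choice, and the weighting scheme are assumptions, not the paper's definition):

```python
import numpy as np

def radial_dispersion_score(embeddings, weights=None):
    """Hypothetical RDS sketch: (weighted) mean Euclidean distance of
    sampled-generation embeddings from their centroid.
    Higher score = more dispersed samples = more uncertain."""
    E = np.asarray(embeddings, dtype=float)
    if weights is None:
        w = np.full(len(E), 1.0 / len(E))      # uniform weights
    else:
        w = np.asarray(weights, dtype=float)
        w = w / w.sum()                        # normalize sequence probabilities
    centroid = (w[:, None] * E).sum(axis=0)    # (weighted) centroid
    radii = np.linalg.norm(E - centroid, axis=1)  # per-sample radial distance
    return float((w * radii).sum()), radii     # dispersion score, per-sample radii

# Toy usage: a tight cluster (confident) vs. scattered samples (uncertain)
tight = [[1.0, 0.0], [1.01, 0.0], [0.99, 0.0]]
loose = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]]
s_tight, _ = radial_dispersion_score(tight)
s_loose, radii = radial_dispersion_score(loose)
best = int(np.argmin(radii))  # best-of-N: keep the sample closest to the centroid
```

Under this reading, a lower score indicates tighter agreement among samples, and the probability-weighted variant simply passes per-sample sequence probabilities as `weights`.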
Related papers
- Sharp Convergence Rates for Masked Diffusion Models [53.117058231393834]
We develop a total-variation based analysis for the Euler method that overcomes limitations. Our results relax assumptions on score estimation, improve parameter dependencies, and establish convergence guarantees. Overall, our analysis introduces a direct TV-based error decomposition along the CTMC trajectory and a decoupling-based path-wise analysis for FHS.
arXiv Detail & Related papers (2026-02-26T00:47:51Z) - Large Language Models Are Bad Dice Players: LLMs Struggle to Generate Random Numbers from Statistical Distributions [50.1404916337174]
We present the first large-scale, statistically powered audit of native probabilistic sampling in large language models (LLMs). We show that batch generation achieves only modest statistical validity, with a 13% median pass rate, while independent requests collapse almost entirely. We conclude that current LLMs lack a functional internal sampler, necessitating the use of external tools for applications requiring statistical guarantees.
arXiv Detail & Related papers (2026-01-08T22:33:12Z) - Efficient semantic uncertainty quantification in language models via diversity-steered sampling [46.23327887393273]
We introduce a diversity-steered sampler that discourages semantically redundant outputs during decoding. The key idea is to inject a continuous semantic-similarity penalty into the model's proposal distribution. Being modular and requiring no gradient access to the base LLM, the framework promises to serve as a drop-in enhancement for uncertainty estimation.
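The summary above does not specify how the penalty enters the proposal distribution; a hypothetical sketch of one way to do it — lowering candidate logits by a cosine-similarity penalty against already-sampled outputs (the function name, the `lam` weight, and the max-over-previous aggregation are all assumptions) — could be:

```python
import numpy as np

def diversity_penalized_logits(logits, cand_embs, prev_embs, lam=2.0):
    """Hypothetical sketch: lower each candidate continuation's logit by
    its maximum cosine similarity to the embeddings of already-sampled
    outputs, scaled by lam, before sampling from the adjusted distribution."""
    L = np.asarray(logits, dtype=float)
    if len(prev_embs) == 0:
        return L  # nothing sampled yet: no penalty to apply
    C = np.asarray(cand_embs, dtype=float)
    P = np.asarray(prev_embs, dtype=float)
    Cn = C / np.linalg.norm(C, axis=1, keepdims=True)  # unit-normalize
    Pn = P / np.linalg.norm(P, axis=1, keepdims=True)
    sim = Cn @ Pn.T              # cosine similarities, shape (n_cand, n_prev)
    penalty = sim.max(axis=1)    # redundancy w.r.t. the closest prior sample
    return L - lam * penalty

# Toy usage: two candidates, the first identical in embedding to a prior sample
adjusted = diversity_penalized_logits(
    logits=[1.0, 1.0],
    cand_embs=[[1.0, 0.0], [0.0, 1.0]],
    prev_embs=[[1.0, 0.0]],
)
```

The semantically novel candidate keeps its logit while the redundant one is pushed down, which is the qualitative behavior the summary describes.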
arXiv Detail & Related papers (2025-10-24T10:06:21Z) - Model selection for stochastic dynamics: a parsimonious and principled approach [0.0]
This thesis focuses on the discovery of stochastic differential equations (SDEs) and stochastic partial differential equations (SPDEs) from noisy and discrete time series. A major challenge is selecting the simplest possible correct model from vast libraries of candidate models. We introduce PASTIS (Parsimonious Inference), a new information criterion derived from extreme value theory.
arXiv Detail & Related papers (2025-07-05T18:15:26Z) - UncertainSAM: Fast and Efficient Uncertainty Quantification of the Segment Anything Model [19.8785302359805]
This paper presents a theoretically motivated uncertainty quantification model based on a Bayesian entropy formulation. We use this formulation to train USAM, a lightweight post-hoc UQ method. Our proposed deterministic USAM demonstrates superior predictive capabilities on the SA-V, MOSE, ADE20k, DAVIS, and COCO datasets.
arXiv Detail & Related papers (2025-05-08T08:36:23Z) - Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode.
We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
arXiv Detail & Related papers (2024-06-05T16:35:30Z) - Uncertainty-aware Language Modeling for Selective Question Answering [107.47864420630923]
We present an automatic large language model (LLM) conversion approach that produces uncertainty-aware LLMs.
Our approach is model- and data-agnostic, is computationally-efficient, and does not rely on external models or systems.
arXiv Detail & Related papers (2023-11-26T22:47:54Z) - The Decaying Missing-at-Random Framework: Model Doubly Robust Causal Inference with Partially Labeled Data [8.916614661563893]
We introduce a decaying missing-at-random (decaying MAR) framework and associated approaches for doubly robust causal inference. This simultaneously addresses selection bias in the labeling mechanism and the extreme imbalance between labeled and unlabeled groups. To ensure robust causal conclusions, we propose a bias-reduced SS estimator for the average treatment effect.
arXiv Detail & Related papers (2023-05-22T07:37:12Z) - Tailoring Language Generation Models under Total Variation Distance [55.89964205594829]
The standard paradigm of neural language generation adopts maximum likelihood estimation (MLE) as the optimizing method.
We develop practical bounds to apply it to language generation.
We introduce the TaiLr objective that balances the tradeoff of estimating TVD.
arXiv Detail & Related papers (2023-02-26T16:32:52Z) - Learning Multivariate CDFs and Copulas using Tensor Factorization [39.24470798045442]
Learning the multivariate distribution of data is a core challenge in statistics and machine learning.
In this work, we aim to learn multivariate cumulative distribution functions (CDFs), as they can handle mixed random variables.
We show that any grid sampled version of a joint CDF of mixed random variables admits a universal representation as a naive Bayes model.
We demonstrate the superior performance of the proposed model in several synthetic and real datasets and applications including regression, sampling and data imputation.
arXiv Detail & Related papers (2022-10-13T16:18:46Z) - SUOD: Accelerating Large-Scale Unsupervised Heterogeneous Outlier Detection [63.253850875265115]
Outlier detection (OD) is a key machine learning (ML) task for identifying abnormal objects from general samples.
We propose a modular acceleration system, called SUOD, to address it.
arXiv Detail & Related papers (2020-03-11T00:22:50Z) - Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out-of-distribution data points at test time with a single forward pass.
We scale training of these models with a novel loss function and centroid updating scheme, matching the accuracy of softmax models.
arXiv Detail & Related papers (2020-03-04T12:27:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all listed papers) and is not responsible for any consequences of its use.