Lowest Span Confidence: A Zero-Shot Metric for Efficient and Black-Box Hallucination Detection in LLMs
- URL: http://arxiv.org/abs/2601.19918v1
- Date: Wed, 07 Jan 2026 12:48:33 GMT
- Title: Lowest Span Confidence: A Zero-Shot Metric for Efficient and Black-Box Hallucination Detection in LLMs
- Authors: Yitong Qiao, Licheng Pan, Yu Mi, Lei Liu, Yue Shen, Fei Sun, Zhixuan Chu,
- Abstract summary: Hallucination in Large Language Models (LLMs) is the generation of plausible but non-factual content. We propose a novel, efficient zero-shot metric called Lowest Span Confidence (LSC) for hallucination detection under minimal resource assumptions. LSC consistently outperforms existing zero-shot baselines, delivering strong detection performance even under resource-constrained conditions.
- Score: 24.471653720056803
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hallucinations in Large Language Models (LLMs), i.e., the tendency to generate plausible but non-factual content, pose a significant challenge for their reliable deployment in high-stakes environments. However, existing hallucination detection methods generally operate under unrealistic assumptions: they either require expensive sampling strategies for consistency checks or access to white-box LLM states, which are unavailable or inefficient in common API-based scenarios. To this end, we propose a novel, efficient zero-shot metric called Lowest Span Confidence (LSC) for hallucination detection under minimal resource assumptions, requiring only a single forward pass with output probabilities. Concretely, LSC evaluates the joint likelihood of semantically coherent spans via a sliding-window mechanism. By identifying regions of lowest marginal confidence across variable-length n-grams, LSC captures local uncertainty patterns that are strongly correlated with factual inconsistency. Importantly, LSC mitigates the dilution effect of perplexity and the noise sensitivity of minimum token probability, offering a more robust estimate of factual uncertainty. Extensive experiments across multiple state-of-the-art (SOTA) LLMs and diverse benchmarks show that LSC consistently outperforms existing zero-shot baselines, delivering strong detection performance even under resource-constrained conditions.
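The abstract describes the mechanism only at a high level, so the snippet below is a minimal sketch of how such a score could be computed from the per-token log-probabilities returned by a single forward pass (e.g., an API's logprobs field). The window-length range (2-5 tokens) and the mean-log-probability aggregation are assumptions, not the authors' exact formulation; perplexity and minimum token probability are included only to illustrate the dilution and noise effects the abstract contrasts against.

```python
import math
from typing import List


def lowest_span_confidence(token_logprobs: List[float],
                           min_n: int = 2, max_n: int = 5) -> float:
    """Lowest mean log-probability over any sliding window of
    min_n..max_n consecutive tokens.  Lower values flag a
    low-confidence span and, per the paper's hypothesis, a higher
    hallucination risk.  Window range and mean aggregation are
    assumptions, not the authors' exact formulation."""
    T = len(token_logprobs)
    if T == 0:
        raise ValueError("empty sequence")
    if T < min_n:
        return sum(token_logprobs) / T
    lowest = float("inf")
    for n in range(min_n, min(max_n, T) + 1):
        for start in range(T - n + 1):
            window = token_logprobs[start:start + n]
            lowest = min(lowest, sum(window) / n)
    return lowest


def perplexity(token_logprobs: List[float]) -> float:
    """Sequence perplexity; a few low-confidence tokens are diluted
    by long confident stretches."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def min_token_probability(token_logprobs: List[float]) -> float:
    """Minimum single-token probability; sensitive to one noisy token."""
    return math.exp(min(token_logprobs))


if __name__ == "__main__":
    # Hypothetical per-token log-probs from one forward pass
    # (e.g. the `logprobs` field of an API completion).
    logprobs = [-0.1, -0.2, -0.05, -2.3, -2.1, -0.3, -0.1]
    print("LSC score:", lowest_span_confidence(logprobs))
    print("Perplexity:", perplexity(logprobs))
    print("Min token prob:", min_token_probability(logprobs))
```

In this sketch the two adjacent tokens around -2 pull the LSC score down sharply, while the sequence-level perplexity stays moderate, which is the dilution effect the abstract refers to.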
Related papers
- DRIFT: Detecting Representational Inconsistencies for Factual Truthfulness [5.785021425715989]
LLMs often produce fluent but incorrect answers, yet detecting such hallucinations typically requires multiple sampling passes or post-hoc verification. We propose a lightweight probe to read these signals directly from hidden states. We develop an LLM router that answers confident queries immediately while delegating uncertain ones to stronger models.
arXiv Detail & Related papers (2026-01-20T18:16:10Z)
- Hallucination Detection and Evaluation of Large Language Model [0.26856688022781555]
Hallucinations in Large Language Models (LLMs) pose a significant challenge, generating misleading or unverifiable content. Existing evaluation methods, such as KnowHalu, employ multi-stage verification but suffer from high computational costs. To address this, we integrate the Hughes Hallucination Evaluation Model (HHEM), a lightweight classification-based framework.
arXiv Detail & Related papers (2025-12-27T00:17:03Z)
- Cross-Layer Attention Probing for Fine-Grained Hallucination Detection [6.83291363146574]
We propose Cross-Layer Attention Probing (CLAP), a novel activation probing technique for hallucination detection. Our empirical evaluations show that CLAP improves hallucination detection compared to baselines on both decoded responses and responses sampled at higher temperatures. CLAP maintains high reliability even when applied out-of-distribution.
arXiv Detail & Related papers (2025-09-04T14:37:34Z)
- Semantic Energy: Detecting LLM Hallucination Beyond Entropy [106.92072182161712]
Large Language Models (LLMs) are being increasingly deployed in real-world applications, but they remain susceptible to hallucinations. Uncertainty estimation is a feasible approach to detect such hallucinations. We introduce Semantic Energy, a novel uncertainty estimation framework.
arXiv Detail & Related papers (2025-08-20T07:33:50Z)
- ICR Probe: Tracking Hidden State Dynamics for Reliable Hallucination Detection in LLMs [50.18087419133284]
Hallucination detection methods leveraging hidden states predominantly focus on static and isolated representations. We introduce a novel metric, the ICR Score, which quantifies the contribution of modules to the hidden states' update. We propose a hallucination detection method, the ICR Probe, which captures the cross-layer evolution of hidden states.
arXiv Detail & Related papers (2025-07-22T11:44:26Z)
- Cleanse: Uncertainty Estimation Approach Using Clustering-based Semantic Consistency in LLMs [5.161416961439468]
This study proposes an effective uncertainty estimation approach, Clustering-based semantic consistency (Cleanse). The effectiveness of Cleanse for detecting hallucination is validated using four off-the-shelf models: LLaMA-7B, LLaMA-13B, LLaMA2-7B, and Mistral-7B.
arXiv Detail & Related papers (2025-07-19T14:48:24Z)
- Mitigating Spurious Correlations in LLMs via Causality-Aware Post-Training [57.03005244917803]
Large language models (LLMs) often fail on out-of-distribution (OOD) samples due to spurious correlations acquired during pre-training. Here, we aim to mitigate such spurious correlations through causality-aware post-training (CAPT). Experiments on the formal causal inference benchmark CLadder and the logical reasoning dataset PrOntoQA show that 3B-scale language models fine-tuned with CAPT can outperform both traditional SFT and larger LLMs on in-distribution (ID) and OOD tasks.
arXiv Detail & Related papers (2025-06-11T06:30:28Z)
- SINdex: Semantic INconsistency Index for Hallucination Detection in LLMs [2.805517909463769]
Large language models (LLMs) are increasingly deployed across diverse domains, yet they are prone to generating factually incorrect outputs. We introduce a novel and scalable uncertainty-based semantic clustering framework for automated hallucination detection.
arXiv Detail & Related papers (2025-03-07T23:25:19Z)
- Kernel Language Entropy: Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities [79.9629927171974]
Quantifying uncertainty in Large Language Models (LLMs) is crucial for applications where safety and reliability are important.
We propose Kernel Language Entropy (KLE), a novel method for uncertainty estimation in white- and black-box LLMs.
arXiv Detail & Related papers (2024-05-30T12:42:05Z)
- INSIDE: LLMs' Internal States Retain the Power of Hallucination Detection [39.52923659121416]
We propose to explore the dense semantic information retained within INternal States for hallucInation DEtection.
A simple yet effective EigenScore metric is proposed to better evaluate responses' self-consistency.
A test-time feature clipping approach is explored to truncate extreme activations in the internal states.
arXiv Detail & Related papers (2024-02-06T06:23:12Z)
- Enhancing Uncertainty-Based Hallucination Detection with Stronger Focus [99.33091772494751]
Large Language Models (LLMs) have gained significant popularity for their impressive performance across diverse fields.
LLMs are prone to hallucinate untruthful or nonsensical outputs that fail to meet user expectations.
We propose a novel reference-free, uncertainty-based method for detecting hallucinations in LLMs.
arXiv Detail & Related papers (2023-11-22T08:39:17Z)
- Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling [69.83976050879318]
In large language models (LLMs), identifying sources of uncertainty is an important step toward improving reliability, trustworthiness, and interpretability.
In this paper, we introduce an uncertainty decomposition framework for LLMs, called input clarification ensembling.
Our approach generates a set of clarifications for the input, feeds them into an LLM, and ensembles the corresponding predictions (a minimal sketch follows this list).
arXiv Detail & Related papers (2023-11-15T05:58:35Z)
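The last entry above describes a concrete procedure: generate clarifications of an ambiguous input, answer each one, and ensemble the results. Below is a minimal sketch of that idea in which the helper callables `clarify` and `answer` are hypothetical stand-ins for LLM calls; the majority-vote ensembling and the use of the agreement rate as a confidence signal are likewise assumptions, not the paper's exact uncertainty decomposition.

```python
from collections import Counter
from typing import Callable, List, Tuple


def clarification_ensemble(
    question: str,
    clarify: Callable[[str, int], List[str]],  # hypothetical: k clarified rewrites of the input
    answer: Callable[[str], str],              # hypothetical: one LLM call per clarified input
    k: int = 5,
) -> Tuple[str, float]:
    """Generate k clarifications of an ambiguous input, answer each,
    and ensemble by majority vote.  The agreement rate is used here as
    a rough confidence signal: low agreement suggests the uncertainty
    stems from ambiguity in the input rather than from the model."""
    clarified_inputs = clarify(question, k)
    answers = [answer(c) for c in clarified_inputs]
    top_answer, votes = Counter(answers).most_common(1)[0]
    return top_answer, votes / len(answers)
```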