LLM Hallucination Detection: HSAD
- URL: http://arxiv.org/abs/2509.23580v2
- Date: Wed, 08 Oct 2025 03:34:42 GMT
- Title: LLM Hallucination Detection: HSAD
- Authors: JinXin Li, Gang Tu, JunJie Hu
- Abstract summary: Current hallucination detection methods rely on factual consistency verification or static hidden-layer features. This paper proposes a hallucination detection method based on frequency-domain analysis of hidden-layer temporal signals. The method overcomes the limitations of existing approaches in knowledge coverage and in detecting reasoning biases.
- Score: 6.306213519424463
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although Large Language Models (LLMs) have demonstrated powerful capabilities across a wide range of tasks such as language understanding and code generation, the frequent occurrence of hallucinations during generation has become a significant impediment to their deployment in critical application scenarios. Current mainstream hallucination detection methods rely on factual consistency verification or static hidden-layer features. The former is constrained by the scope of its knowledge coverage, while the latter struggles to capture reasoning biases that arise during inference. To address these issues, and inspired by signal-analysis methods in cognitive neuroscience, this paper proposes a hallucination detection method based on frequency-domain analysis of hidden-layer temporal signals, named HSAD (Hidden Signal Analysis-based Detection). First, by treating the LLM's reasoning process as a cognitive journey that unfolds over time, we model and simulate the human process of signal perception and discrimination in a deception-detection scenario through hidden-layer temporal signals. Next, the Fast Fourier Transform (FFT) is applied to map these temporal signals into the frequency domain, constructing spectral features that capture anomalies arising during the reasoning process; analysis experiments on these spectral features demonstrate the effectiveness of this approach. Finally, a hallucination detection algorithm built on these spectral features identifies hallucinations in the generated content. By combining reasoning-process modeling with frequency-domain feature extraction, HSAD overcomes the limitations of existing approaches in knowledge coverage and in detecting reasoning biases, demonstrating higher detection accuracy and robustness.
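The pipeline the abstract describes (hidden-layer temporal signals mapped to spectral features via the FFT) can be sketched in a few lines. This is a minimal illustration only: the function name, the (T, D) signal interface, and the band-pooling scheme are assumptions for exposition, not the paper's exact construction.

```python
import numpy as np

def spectral_features(hidden_signal, n_bands=4):
    """Map a hidden-layer temporal signal to frequency-domain features.

    hidden_signal: (T, D) array with one hidden-state vector per
    generation step (T steps, D hidden dimensions). The interface and
    the band pooling below are illustrative assumptions.
    """
    # Center each dimension so the DC component does not dominate.
    centered = hidden_signal - hidden_signal.mean(axis=0, keepdims=True)
    # Real FFT along the time axis: (T, D) -> (T // 2 + 1, D) magnitudes.
    spectrum = np.abs(np.fft.rfft(centered, axis=0))
    # Pool spectral energy into n_bands frequency bands, averaged over D.
    bands = np.array_split(spectrum, n_bands, axis=0)
    return np.array([band.mean() for band in bands])

# Toy usage: 32 generation steps, 16 hidden dimensions.
rng = np.random.default_rng(0)
feats = spectral_features(rng.standard_normal((32, 16)))
print(feats.shape)  # (4,)
```

A downstream detector would then classify these band-energy features; the paper's actual detection algorithm operates on richer spectral features than this sketch shows.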
Related papers
- Detecting Contextual Hallucinations in LLMs with Frequency-Aware Attention [27.49425252327799]
We introduce a frequency-aware perspective on attention by analyzing its variation during generation. We develop a lightweight hallucination detector using high-frequency attention features.
arXiv Detail & Related papers (2026-02-20T11:18:45Z)
- Hallucination Begins Where Saliency Drops [18.189047289404325]
Hallucinations frequently arise when preceding output tokens exhibit low saliency toward the prediction of the next token. We introduce LVLMs-Saliency, a gradient-aware diagnostic framework that quantifies the visual grounding strength of each output token. Our method significantly reduces hallucination rates while preserving fluency and task performance, offering a robust and interpretable solution.
arXiv Detail & Related papers (2026-01-28T05:50:52Z)
- FaithSCAN: Model-Driven Single-Pass Hallucination Detection for Faithful Visual Question Answering [14.550872089352943]
FaithSCAN is a lightweight network that detects hallucinations by exploiting rich internal signals of vision-language models. We extend the LLM-as-a-Judge paradigm to VQA hallucination and propose a low-cost strategy to automatically generate model-dependent supervision signals. In-depth analysis shows hallucinations arise from systematic internal state variations in visual perception, cross-modal reasoning, and language decoding.
arXiv Detail & Related papers (2026-01-01T09:19:39Z)
- TraceDet: Hallucination Detection from the Decoding Trace of Diffusion Large Language Models [49.83690850047884]
The hallucination problem in D-LLMs remains underexplored, limiting their reliability in real-world applications. Existing hallucination detection methods are designed for AR-LLMs and rely on signals from single-step generation. We propose TraceDet, a novel framework that explicitly leverages the intermediate denoising steps of D-LLMs for hallucination detection.
arXiv Detail & Related papers (2025-09-30T02:01:10Z)
- LUMINA: Detecting Hallucinations in RAG System with Context-Knowledge Signals [19.38878193608028]
Retrieval-Augmented Generation (RAG) aims to mitigate hallucinations in large language models (LLMs) by grounding responses in retrieved documents. Yet RAG-based LLMs still hallucinate even when provided with correct and sufficient context. We propose LUMINA, a novel framework that detects hallucinations in RAG systems through context-knowledge signals.
arXiv Detail & Related papers (2025-09-26T04:57:46Z)
- A Novel Differential Feature Learning for Effective Hallucination Detection and Classification [3.9060143123877844]
We propose a dual-model architecture integrating a Projected Fusion block for adaptive inter-layer feature weighting and a Differential Feature Learning mechanism. We demonstrate that hallucination signals concentrate in highly sparse feature subsets, achieving significant accuracy improvements on question answering and dialogue tasks.
arXiv Detail & Related papers (2025-09-20T06:48:22Z)
- LLM Hallucination Detection: A Fast Fourier Transform Method Based on Hidden Layer Temporal Signals [10.85580316542761]
Hallucination remains a critical barrier to deploying large language models (LLMs) in reliability-sensitive applications. We propose HSAD (Hidden Signal Analysis-based Detection), a novel hallucination detection framework that models the temporal dynamics of hidden representations. Across multiple benchmarks, including TruthfulQA, HSAD achieves an improvement of more than 10 percentage points over prior state-of-the-art methods.
arXiv Detail & Related papers (2025-09-16T15:08:19Z)
- Beyond ROUGE: N-Gram Subspace Features for LLM Hallucination Detection [5.0106565473767075]
Large Language Models (LLMs) have demonstrated effectiveness across a wide variety of tasks involving natural language. Yet a fundamental problem of hallucinations still plagues these models, limiting their trustworthiness in generating consistent, truthful information. We propose a novel approach inspired by ROUGE that constructs an N-gram frequency tensor from LLM-generated text. This tensor captures richer semantic structure by encoding co-occurrence patterns, enabling better differentiation between factual and hallucinated content.
arXiv Detail & Related papers (2025-09-03T18:52:24Z)
- ICR Probe: Tracking Hidden State Dynamics for Reliable Hallucination Detection in LLMs [50.18087419133284]
Existing hallucination detection methods leveraging hidden states predominantly focus on static and isolated representations. We introduce a novel metric, the ICR Score, which quantifies the contribution of modules to the hidden states' update. We propose a hallucination detection method, the ICR Probe, which captures the cross-layer evolution of hidden states.
arXiv Detail & Related papers (2025-07-22T11:44:26Z)
- Learning Auxiliary Tasks Improves Reference-Free Hallucination Detection in Open-Domain Long-Form Generation [78.78421340836915]
We systematically investigate reference-free hallucination detection in open-domain long-form responses. Our findings reveal that internal states are insufficient for reliably distinguishing between factual and hallucinated content. We introduce a new paradigm, named RATE-FT, that augments fine-tuning with an auxiliary task for the model to jointly learn with the main task of hallucination detection.
arXiv Detail & Related papers (2025-05-18T07:10:03Z)
- Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations [82.42811602081692]
This paper introduces a subsequence association framework to systematically trace and understand hallucinations. The key insight is that hallucinations arise when dominant hallucinatory associations outweigh faithful ones. We propose a tracing algorithm that identifies causal subsequences by analyzing hallucination probabilities across randomized input contexts.
arXiv Detail & Related papers (2025-04-17T06:34:45Z)
- From Vision to Sound: Advancing Audio Anomaly Detection with Vision-Based Algorithms [6.643376250301589]
We investigate the adaptation of vision-based anomaly detection (VAD) algorithms to the audio domain to address the problem of Audio Anomaly Detection (AAD). Unlike most existing AAD methods, which primarily classify anomalous samples, our approach introduces fine-grained temporal-frequency localization of anomalies within the spectrogram. We evaluate our approach on industrial and environmental benchmarks, demonstrating the effectiveness of VAD techniques in detecting anomalies in audio signals.
arXiv Detail & Related papers (2025-02-25T16:22:42Z)
- A New Benchmark and Reverse Validation Method for Passage-level Hallucination Detection [63.56136319976554]
Large Language Models (LLMs) generate hallucinations, which can cause significant damage when deployed for mission-critical tasks.
We propose a self-check approach based on reverse validation to detect factual errors automatically in a zero-resource fashion.
We empirically evaluate our method and existing zero-resource detection methods on two datasets.
arXiv Detail & Related papers (2023-10-10T10:14:59Z)
- DriPP: Driven Point Processes to Model Stimuli Induced Patterns in M/EEG Signals [62.997667081978825]
We develop a novel statistical point process model, called driven temporal point processes (DriPP).
We derive a fast and principled expectation-maximization (EM) algorithm to estimate the parameters of this model.
Results on standard MEG datasets demonstrate that our methodology reveals event-related neural responses.
arXiv Detail & Related papers (2021-12-08T13:07:21Z)
- Provable RL with Exogenous Distractors via Multistep Inverse Dynamics [85.52408288789164]
Real-world applications of reinforcement learning (RL) require the agent to deal with high-dimensional observations such as those generated from a megapixel camera.
Prior work has addressed such problems with representation learning, through which the agent can provably extract endogenous, latent state information from raw observations.
However, such approaches can fail in the presence of temporally correlated noise in the observations.
arXiv Detail & Related papers (2021-10-17T15:21:27Z)
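Several entries above rely on frequency-style features of generated text; the "Beyond ROUGE" entry, for instance, builds an N-gram frequency tensor encoding co-occurrence patterns. As a toy illustration of that idea, the following sketch builds a bigram co-occurrence matrix and summarizes it with its leading singular values. The construction, function name, and SVD summary are assumptions for illustration; the cited paper's exact tensor construction may differ.

```python
import numpy as np

def ngram_cooccurrence_features(text, top_k=3):
    """Summarize a text's bigram co-occurrence matrix by its leading
    singular values. A toy stand-in for an N-gram frequency tensor;
    the construction is an assumption made for illustration.
    """
    tokens = text.lower().split()
    vocab = sorted(set(tokens))
    index = {w: i for i, w in enumerate(vocab)}
    # Count adjacent-token (bigram) co-occurrences.
    mat = np.zeros((len(vocab), len(vocab)))
    for a, b in zip(tokens, tokens[1:]):
        mat[index[a], index[b]] += 1.0
    # The singular-value spectrum summarizes co-occurrence structure;
    # numpy returns singular values in descending order.
    sv = np.linalg.svd(mat, compute_uv=False)
    return sv[:top_k]

feats = ngram_cooccurrence_features("the cat sat on the mat the cat ran")
print(feats.shape)  # (3,)
```

A detector would compare such features between factual and hallucinated samples; in practice the features would feed a trained classifier rather than be inspected directly.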
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.