Hallucinated Span Detection with Multi-View Attention Features
- URL: http://arxiv.org/abs/2504.04335v2
- Date: Mon, 15 Sep 2025 04:21:37 GMT
- Title: Hallucinated Span Detection with Multi-View Attention Features
- Authors: Yuya Ogasa, Yuki Arase
- Abstract summary: This study addresses the problem of hallucinated span detection in the outputs of large language models. It has received less attention than output-level hallucination detection despite its practical importance.
- Score: 8.747292152322578
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This study addresses the problem of hallucinated span detection in the outputs of large language models. It has received less attention than output-level hallucination detection despite its practical importance. Prior work has shown that attention often exhibits irregular patterns when hallucinations occur. Motivated by these findings, we extract features from the attention matrix that provide complementary views capturing (a) whether certain tokens are influential or ignored, (b) whether attention is biased toward specific subsets, and (c) whether a token is generated referring to a narrow or broad context during generation. These features are input to a Transformer-based classifier that performs sequence labelling to identify hallucinated spans. Experimental results indicate that the proposed method outperforms strong baselines on hallucinated span detection with longer input contexts, such as data-to-text and summarisation tasks.
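To make the three attention views concrete, here is a minimal sketch of plausible per-token proxies computed from a single row-stochastic attention matrix: column mass for influence, top-k row mass for bias toward a subset, and row entropy for context breadth. These are illustrative stand-ins, not the paper's exact feature definitions.

```python
import numpy as np

def attention_view_features(A, k=3, eps=1e-12):
    """Toy per-token features from a row-stochastic attention matrix A
    (rows = queries, columns = keys). Plausible proxies for the three
    views in the abstract; not the paper's exact features."""
    # (a) influence: total attention each token receives as a key
    influence = A.sum(axis=0)
    # (b) bias: share of each query's attention mass on its top-k keys
    concentration = np.sort(A, axis=1)[:, -k:].sum(axis=1)
    # (c) breadth: entropy of each query's attention distribution
    breadth = -(A * np.log(A + eps)).sum(axis=1)
    return np.stack([influence, concentration, breadth], axis=1)

rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 8))
A = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
feats = attention_view_features(A)
print(feats.shape)  # (8, 3)
```

In the paper such per-token features are then fed to a Transformer-based classifier for sequence labelling; the sketch stops at the feature-extraction step.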
Related papers
- Detecting Contextual Hallucinations in LLMs with Frequency-Aware Attention [27.49425252327799]
We introduce a frequency-aware perspective on attention by analyzing its variation during generation. We develop a lightweight hallucination detector using high-frequency attention features.
arXiv Detail & Related papers (2026-02-20T11:18:45Z)
- The Map of Misbelief: Tracing Intrinsic and Extrinsic Hallucinations Through Attention Patterns [1.0896567381206717]
Large Language Models (LLMs) are increasingly deployed in safety-critical domains, yet remain susceptible to hallucinations. We introduce a principled evaluation framework that differentiates between extrinsic and intrinsic hallucination categories. We propose novel attention aggregation strategies that improve both interpretability and hallucination detection performance.
arXiv Detail & Related papers (2025-11-13T22:42:18Z)
- Neural Message-Passing on Attention Graphs for Hallucination Detection [32.29963721910821]
CHARM casts hallucination detection as a graph learning task and tackles it by applying GNNs over these attributed graphs. We show that CHARM provably subsumes prior attention-based traces and, experimentally, it consistently outperforms other approaches across diverse benchmarks.
arXiv Detail & Related papers (2025-09-29T13:37:12Z)
- Mitigating Multimodal Hallucinations via Gradient-based Self-Reflection [49.26064449816502]
We propose a Gradient-based Influence-Aware Constrained Decoding (GACD) method to address text-visual bias and co-occurrence bias. GACD effectively reduces hallucinations and improves the visual grounding of MLLM outputs.
arXiv Detail & Related papers (2025-09-03T08:13:52Z)
- A Survey of Multimodal Hallucination Evaluation and Detection [52.03164192840023]
Multi-modal Large Language Models (MLLMs) have emerged as a powerful paradigm for integrating visual and textual information. These models often suffer from hallucination, producing content that appears plausible but contradicts the input content or established world knowledge. This survey offers an in-depth review of hallucination evaluation benchmarks and detection methods across Image-to-Text (I2T) and Text-to-Image (T2I) generation tasks.
arXiv Detail & Related papers (2025-07-25T07:22:42Z)
- CAI: Caption-Sensitive Attention Intervention for Mitigating Object Hallucination in Large Vision-Language Models [60.0300765815417]
Large Vision-Language Models (LVLMs) frequently produce content that deviates from visual information, leading to object hallucination. We propose Caption-sensitive Attention Intervention (CAI), a training-free, plug-and-play hallucination mitigation method.
arXiv Detail & Related papers (2025-06-30T07:52:36Z)
- Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations [82.42811602081692]
This paper introduces a subsequence association framework to systematically trace and understand hallucinations. The key insight is that hallucinations arise when dominant hallucinatory associations outweigh faithful ones. We propose a tracing algorithm that identifies causal subsequences by analyzing hallucination probabilities across randomized input contexts.
arXiv Detail & Related papers (2025-04-17T06:34:45Z)
- Hallucination Detection in LLMs with Topological Divergence on Attention Graphs [64.74977204942199]
Hallucination, i.e., generating factually incorrect content, remains a critical challenge for large language models. We introduce TOHA, a TOpology-based HAllucination detector in the RAG setting.
arXiv Detail & Related papers (2025-04-14T10:06:27Z)
- Robust Hallucination Detection in LLMs via Adaptive Token Selection [25.21763722332831]
Hallucinations in large language models (LLMs) pose significant safety concerns that impede their broader deployment.
We propose HaMI, a novel approach that enables robust detection of hallucinations through adaptive selection and learning of critical tokens.
We achieve this robustness through a novel formulation of hallucination detection as multiple-instance learning (HaMI) over token-level representations within a sequence.
arXiv Detail & Related papers (2025-04-10T15:39:10Z)
- "Principal Components" Enable A New Language of Images [79.45806370905775]
We introduce a novel visual tokenization framework that embeds a provable PCA-like structure into the latent token space.
Our approach achieves state-of-the-art reconstruction performance and enables better interpretability to align with the human vision system.
arXiv Detail & Related papers (2025-03-11T17:59:41Z)
- Hallucination Detection in LLMs Using Spectral Features of Attention Maps [8.820670807424174]
Large Language Models (LLMs) have demonstrated remarkable performance across various tasks but remain prone to hallucinations. Recent methods leverage attention map properties to this end, though their effectiveness remains limited. We propose the LapEigvals method, which uses the top-$k$ eigenvalues of the Laplacian matrix derived from the attention maps as an input to hallucination detection probes.
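The spectral feature itself is easy to sketch: treat one attention map as a weighted graph, symmetrise it, and take eigenvalues of the unnormalised Laplacian L = D - W. This is an assumption-laden sketch of the LapEigvals idea; the paper's exact graph construction and normalisation may differ.

```python
import numpy as np

def lap_eigvals(A, k=3):
    """Top-k Laplacian eigenvalues of one attention map (a sketch of
    the LapEigvals feature; details may differ from the paper)."""
    W = 0.5 * (A + A.T)                 # symmetrise into an undirected graph
    L = np.diag(W.sum(axis=1)) - W      # unnormalised graph Laplacian L = D - W
    vals = np.linalg.eigvalsh(L)        # real eigenvalues, ascending order
    return vals[-k:]                    # top-k, used as probe inputs

A = np.full((6, 6), 1.0 / 6.0)          # uniform attention over 6 tokens
print(lap_eigvals(A, k=2))              # both top eigenvalues equal 1 here
```

For this uniform graph the Laplacian spectrum is {0, 1, 1, 1, 1, 1}; a graph Laplacian always has a zero eigenvalue, so only the upper end of the spectrum carries signal.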
arXiv Detail & Related papers (2025-02-24T19:30:24Z)
- KNN Transformer with Pyramid Prompts for Few-Shot Learning [52.735070934075736]
Few-Shot Learning aims to recognize new classes with limited labeled data.
Recent studies have attempted to address the challenge of rare samples with textual prompts to modulate visual features.
arXiv Detail & Related papers (2024-10-14T07:39:30Z)
- Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps [48.58310785625051]
Large language models (LLMs) can hallucinate details and respond with unsubstantiated answers.
This paper describes a simple approach for detecting such contextual hallucinations.
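The core signal can be sketched as a per-token "lookback ratio": how much attention mass a generated token places on the input context versus the already-generated prefix. This single-map version is a simplification; the paper aggregates the ratio over heads and layers.

```python
import numpy as np

def lookback_ratio(A, n_ctx):
    """For each generated token (row of the row-stochastic attention
    matrix A), the share of attention mass on the first n_ctx context
    columns versus the generated prefix. Single map only; the paper
    pools this feature over heads and layers."""
    ctx = A[:, :n_ctx].sum(axis=1)      # mass on the provided context
    gen = A[:, n_ctx:].sum(axis=1)      # mass on generated tokens
    return ctx / (ctx + gen + 1e-12)

# rows sum to 1; the first 4 columns are context tokens
A = np.array([[0.2, 0.2, 0.2, 0.2, 0.2, 0.0],
              [0.1, 0.1, 0.1, 0.1, 0.3, 0.3]])
print(lookback_ratio(A, n_ctx=4))  # ~[0.8, 0.4]
```

Low ratios flag tokens generated while looking away from the source context, which is the kind of token-level cue a contextual-hallucination detector can threshold or learn over.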
arXiv Detail & Related papers (2024-07-09T17:44:34Z)
- Elliptical Attention [1.7597562616011944]
Pairwise dot-product self-attention is key to the success of transformers that achieve state-of-the-art performance across a variety of applications in language and vision.
We propose using a Mahalanobis distance metric for computing the attention weights to stretch the underlying feature space in directions of high contextual relevance.
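A rough sketch of distance-based attention with a diagonal Mahalanobis metric: per-dimension weights m stretch the feature space so that directions with large weight count as more contextually relevant. The actual Elliptical Attention estimator for the metric is more involved; everything below is illustrative.

```python
import numpy as np

def mahalanobis_attention(Q, K, V, m):
    """Attention weights from a diagonal Mahalanobis distance,
    ||q - k||_M^2 = sum_d m[d] * (q[d] - k[d])^2, so keys close to a
    query under the stretched metric receive more weight. Sketch only."""
    diff = Q[:, None, :] - K[None, :, :]      # (n_q, n_k, d) pairwise differences
    d2 = (diff ** 2 * m).sum(axis=-1)         # squared Mahalanobis distances
    w = np.exp(-d2)                           # closer keys score higher
    w /= w.sum(axis=1, keepdims=True)         # normalise per query
    return w @ V

rng = np.random.default_rng(1)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
out = mahalanobis_attention(Q, K, V, m=np.ones(8))
print(out.shape)  # (4, 8)
```

With m set to all zeros every key is equidistant, so the weights collapse to uniform averaging, which makes the role of the metric easy to see.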
arXiv Detail & Related papers (2024-06-19T18:38:11Z)
- Enhancing Uncertainty-Based Hallucination Detection with Stronger Focus [99.33091772494751]
Large Language Models (LLMs) have gained significant popularity for their impressive performance across diverse fields.
LLMs are prone to hallucinate untruthful or nonsensical outputs that fail to meet user expectations.
We propose a novel reference-free, uncertainty-based method for detecting hallucinations in LLMs.
arXiv Detail & Related papers (2023-11-22T08:39:17Z)
- Token-Level Adversarial Prompt Detection Based on Perplexity Measures and Contextual Information [67.78183175605761]
Large Language Models are susceptible to adversarial prompt attacks.
This vulnerability underscores a significant concern regarding the robustness and reliability of LLMs.
We introduce a novel approach to detecting adversarial prompts at a token level.
arXiv Detail & Related papers (2023-11-20T03:17:21Z)
- Bridging the Gap: Gaze Events as Interpretable Concepts to Explain Deep Neural Sequence Models [0.7829352305480283]
In this work, we employ established gaze event detection algorithms for fixations and saccades.
We quantitatively evaluate the impact of these events by determining their concept influence.
arXiv Detail & Related papers (2023-04-12T10:15:31Z)
- Desiderata for Representation Learning: A Causal Perspective [104.3711759578494]
We take a causal perspective on representation learning, formalizing non-spuriousness and efficiency (in supervised representation learning) and disentanglement (in unsupervised representation learning).
This yields computable metrics that can be used to assess the degree to which representations satisfy the desiderata of interest and learn non-spurious and disentangled representations from single observational datasets.
arXiv Detail & Related papers (2021-09-08T17:33:54Z)
- Effective Attention Sheds Light On Interpretability [3.317258557707008]
We ask whether visualizing effective attention gives different conclusions than interpretation of standard attention.
We show that effective attention is less associated with the features related to the language modeling pretraining.
We recommend using effective attention for studying a transformer's behavior since it is more pertinent to the model output by design.
arXiv Detail & Related papers (2021-05-18T23:41:26Z)
- Detecting Hallucinated Content in Conditional Neural Sequence Generation [165.68948078624499]
We propose a task to predict whether each token in the output sequence is hallucinated (not contained in the input). We also introduce a method for learning to detect hallucinations using pretrained language models fine-tuned on synthetic data.
arXiv Detail & Related papers (2020-11-05T00:18:53Z)
- Salience Estimation with Multi-Attention Learning for Abstractive Text Summarization [86.45110800123216]
In the task of text summarization, salience estimation for words, phrases or sentences is a critical component.
We propose a Multi-Attention Learning framework which contains two new attention learning components for salience estimation.
arXiv Detail & Related papers (2020-04-07T02:38:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.