Bolster Hallucination Detection via Prompt-Guided Data Augmentation
- URL: http://arxiv.org/abs/2510.15977v1
- Date: Mon, 13 Oct 2025 02:06:15 GMT
- Title: Bolster Hallucination Detection via Prompt-Guided Data Augmentation
- Authors: Wenyun Li, Zheng Zhang, Dongmei Jiang, Xiangyuan Lan
- Abstract summary: We introduce Prompt-guided data Augmented haLlucination dEtection (PALE), a framework that leverages prompt-guided LLM responses as data augmentation for hallucination detection. It can generate both truthful and hallucinated data under prompt guidance at a relatively low cost. In experiments, PALE achieves superior hallucination detection performance, outperforming the competitive baseline by a significant margin of 6.55%.
- Score: 33.98592618879001
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Large language models (LLMs) have garnered significant interest in the AI community. Despite their impressive generation capabilities, they have been found to produce misleading or fabricated information, a phenomenon known as hallucinations. Consequently, hallucination detection has become critical to ensure the reliability of LLM-generated content. One primary challenge in hallucination detection is the scarcity of well-labeled datasets containing both truthful and hallucinated outputs. To address this issue, we introduce Prompt-guided data Augmented haLlucination dEtection (PALE), a novel framework that leverages prompt-guided responses from LLMs as data augmentation for hallucination detection. This strategy can generate both truthful and hallucinated data under prompt guidance at a relatively low cost. To more effectively evaluate the truthfulness of the sparse intermediate embeddings produced by LLMs, we introduce an estimation metric called the Contrastive Mahalanobis Score (CM Score). This score is based on modeling the distributions of truthful and hallucinated data in the activation space. CM Score employs a matrix decomposition approach to more accurately capture the underlying structure of these distributions. Importantly, our framework does not require additional human annotations, offering strong generalizability and practicality for real-world applications. Extensive experiments demonstrate that PALE achieves superior hallucination detection performance, outperforming the competitive baseline by a significant margin of 6.55%.
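The abstract describes the CM Score only at a high level. Below is a minimal sketch of one plausible reading, assuming class-conditional Gaussians fitted to truthful and hallucinated activations and an eigendecomposition as the matrix-decomposition step; the paper's exact construction may differ, and every name here is illustrative rather than taken from the paper.

```python
import numpy as np

def fit_gaussian(acts, eps=1e-3):
    """Fit a mean and regularized precision matrix to LLM activations.

    acts: (n_samples, hidden_dim) array of intermediate embeddings.
    The eigendecomposition-based regularization is an assumption; the
    paper's matrix-decomposition step may be different.
    """
    mu = acts.mean(axis=0)
    centered = acts - mu
    cov = centered.T @ centered / max(len(acts) - 1, 1)
    # Decompose the covariance and floor its spectrum so the inverse is
    # stable even for sparse, effectively low-rank embeddings.
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvals = np.maximum(eigvals, eps)
    precision = eigvecs @ np.diag(1.0 / eigvals) @ eigvecs.T
    return mu, precision

def mahalanobis_sq(x, mu, precision):
    """Squared Mahalanobis distance of x to a fitted Gaussian."""
    d = x - mu
    return float(d @ precision @ d)

def cm_score(x, truthful_stats, hallucinated_stats):
    """Contrastive score: distance to the hallucinated cluster minus
    distance to the truthful cluster (sign convention assumed), so
    larger values lean truthful under this sketch."""
    mu_t, prec_t = truthful_stats
    mu_h, prec_h = hallucinated_stats
    return mahalanobis_sq(x, mu_h, prec_h) - mahalanobis_sq(x, mu_t, prec_t)

# Illustrative usage with prompt-guided augmented data:
#   stats_t = fit_gaussian(truthful_acts)      # activations from truthful responses
#   stats_h = fit_gaussian(hallucinated_acts)  # activations from hallucinated responses
#   score = cm_score(new_activation, stats_t, stats_h)
```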
Related papers
- HalluMat: Detecting Hallucinations in LLM-Generated Materials Science Content Through Multi-Stage Verification [0.9490124006642771]
HalluMatData is a benchmark dataset for evaluating hallucination detection methods. HalluMatDetector is a multi-stage hallucination detection framework that reduces hallucination verification rates by 30%.
arXiv Detail & Related papers (2025-12-26T22:16:12Z)
- Reducing Hallucinations in Summarization via Reinforcement Learning with Entity Hallucination Index [2.2427832125073737]
We introduce a reward-driven fine-tuning framework to optimize for the Entity Hallucination Index (EHI), a metric designed to quantify the presence, correctness, and grounding of named entities in generated summaries (a sketch of this idea appears after this list). Our approach does not rely on human-written factuality annotations, enabling scalable fine-tuning.
arXiv Detail & Related papers (2025-07-30T15:00:00Z)
- HalluLens: LLM Hallucination Benchmark [49.170128733508335]
Large language models (LLMs) often generate responses that deviate from user input or training data, a phenomenon known as "hallucination". This paper introduces a comprehensive hallucination benchmark, incorporating both new extrinsic and existing intrinsic evaluation tasks.
arXiv Detail & Related papers (2025-04-24T13:40:27Z)
- REFIND at SemEval-2025 Task 3: Retrieval-Augmented Factuality Hallucination Detection in Large Language Models [15.380441563675243]
REFIND (Retrieval-augmented Factuality hallucINation Detection) is a novel framework that detects hallucinated spans within large language model (LLM) outputs. We propose the Context Sensitivity Ratio (CSR), a novel metric that quantifies the sensitivity of LLM outputs to retrieved evidence. REFIND demonstrated robustness across nine languages, including low-resource settings, and significantly outperformed baseline models.
arXiv Detail & Related papers (2025-02-19T10:59:05Z)
- HaloScope: Harnessing Unlabeled LLM Generations for Hallucination Detection [55.596406899347926]
HaloScope is a novel learning framework that leverages unlabeled LLM generations in the wild for hallucination detection.
We present an automated membership estimation score for distinguishing between truthful and untruthful generations within unlabeled mixture data.
Experiments show that HaloScope achieves superior hallucination detection performance, outperforming competitive rivals by a significant margin.
arXiv Detail & Related papers (2024-09-26T03:22:09Z)
- Mitigating Entity-Level Hallucination in Large Language Models [11.872916697604278]
This paper proposes Dynamic Retrieval Augmentation based on hallucination Detection (DRAD), a novel method to detect and mitigate hallucinations in Large Language Models (LLMs).
Experimental results show that DRAD demonstrates superior performance in both detecting and mitigating hallucinations in LLMs.
arXiv Detail & Related papers (2024-07-12T16:47:34Z)
- HypoTermQA: Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs [0.0]
Hallucinations pose a significant challenge to the reliability and alignment of Large Language Models (LLMs).
This paper introduces an automated scalable framework that combines benchmarking LLMs' hallucination tendencies with efficient hallucination detection.
The framework is domain-agnostic, allowing the use of any language model for benchmark creation or evaluation in any domain.
arXiv Detail & Related papers (2024-02-25T22:23:37Z)
- Rowen: Adaptive Retrieval-Augmented Generation for Hallucination Mitigation in LLMs [88.75700174889538]
Hallucinations present a significant challenge for large language models (LLMs). Generating factual content from parametric knowledge alone is constrained by the limited knowledge of LLMs. We present Rowen, a novel framework that enhances LLMs with an adaptive retrieval augmentation process tailored to address hallucinated outputs.
arXiv Detail & Related papers (2024-02-16T11:55:40Z)
- Enhancing Uncertainty-Based Hallucination Detection with Stronger Focus [99.33091772494751]
Large Language Models (LLMs) have gained significant popularity for their impressive performance across diverse fields.
However, LLMs are prone to hallucinating untruthful or nonsensical outputs that fail to meet user expectations.
We propose a novel reference-free, uncertainty-based method for detecting hallucinations in LLMs.
arXiv Detail & Related papers (2023-11-22T08:39:17Z)
- FactCHD: Benchmarking Fact-Conflicting Hallucination Detection [64.4610684475899]
FactCHD is a benchmark designed for the detection of fact-conflicting hallucinations from LLMs.
FactCHD features a diverse dataset that spans various factuality patterns, including vanilla, multi-hop, comparison, and set operation.
We introduce Truth-Triangulator, which synthesizes reflective considerations from tool-enhanced ChatGPT and LoRA-tuned Llama2.
arXiv Detail & Related papers (2023-10-18T16:27:49Z)
- AutoHall: Automated Hallucination Dataset Generation for Large Language Models [56.92068213969036]
This paper introduces AutoHall, a method for automatically constructing model-specific hallucination datasets from existing fact-checking datasets.
We also propose a zero-resource, black-box hallucination detection method based on self-contradiction (a sketch of this idea appears after this list).
arXiv Detail & Related papers (2023-09-30T05:20:02Z)
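As noted in the Entity Hallucination Index entry above, here is a minimal sketch of an entity-grounding metric in the spirit of EHI. The toy regex extractor stands in for a real NER model, and the published EHI also scores entity correctness, which this sketch omits; treat everything here as an assumption rather than the paper's definition.

```python
import re

def extract_entities(text):
    """Toy entity extractor: capitalized (possibly multiword) spans.

    A real pipeline would use an NER model; this is only a stand-in.
    """
    return set(re.findall(r"[A-Z][a-z]+(?:\s[A-Z][a-z]+)*", text))

def entity_hallucination_index(summary, source):
    """Share of summary entities with no support in the source document."""
    summary_ents = extract_entities(summary)
    if not summary_ents:
        return 0.0
    source_ents = extract_entities(source)
    ungrounded = {e for e in summary_ents if e not in source_ents}
    return len(ungrounded) / len(summary_ents)

# Example: "Acme Corp" is ungrounded, so the index is 0.5 for two entities.
# entity_hallucination_index("Alice joined Acme Corp.", "Alice joined a firm.")
```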
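The AutoHall entry above mentions a zero-resource, black-box detection method based on self-contradiction. The following sketch shows the general idea of flagging a claim when independently sampled responses contradict one another; the `generate` and `contradicts` callables are hypothetical stand-ins for the LLM sampler and an NLI-style judge, and the paper's concrete procedure may differ.

```python
from itertools import combinations

def detect_by_self_contradiction(generate, contradicts, prompt, k=5, threshold=0.5):
    """Black-box hallucination check via self-contradiction (a sketch).

    generate(prompt) -> str    samples one response from the LLM
    contradicts(a, b) -> bool  judges whether two responses conflict
    """
    samples = [generate(prompt) for _ in range(k)]
    pairs = list(combinations(samples, 2))
    n_conflicts = sum(contradicts(a, b) for a, b in pairs)
    # Frequent disagreement among the model's own samples suggests the
    # model lacks stable grounded knowledge for this prompt.
    rate = n_conflicts / len(pairs) if pairs else 0.0
    return rate >= threshold, rate
```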