HuDEx: Integrating Hallucination Detection and Explainability for Enhancing the Reliability of LLM responses
- URL: http://arxiv.org/abs/2502.08109v1
- Date: Wed, 12 Feb 2025 04:17:02 GMT
- Title: HuDEx: Integrating Hallucination Detection and Explainability for Enhancing the Reliability of LLM responses
- Authors: Sujeong Lee, Hayoung Lee, Seongsoo Heo, Wonik Choi,
- Abstract summary: This paper proposes an explanation enhanced hallucination-detection model, coined as HuDEx.
The proposed model provides a novel approach to integrate detection with explanations, and enable both users and the LLM itself to understand and reduce errors.
- Score: 0.12499537119440242
- License:
- Abstract: Recent advances in large language models (LLMs) have shown promising improvements, often surpassing existing methods across a wide range of downstream tasks in natural language processing. However, these models still face challenges, which may hinder their practical applicability. For example, the phenomenon of hallucination is known to compromise the reliability of LLMs, especially in fields that demand high factual precision. Current benchmarks primarily focus on hallucination detection and factuality evaluation but do not extend beyond identification. This paper proposes an explanation enhanced hallucination-detection model, coined as HuDEx, aimed at enhancing the reliability of LLM-generated responses by both detecting hallucinations and providing detailed explanations. The proposed model provides a novel approach to integrate detection with explanations, and enable both users and the LLM itself to understand and reduce errors. Our measurement results demonstrate that the proposed model surpasses larger LLMs, such as Llama3 70B and GPT-4, in hallucination detection accuracy, while maintaining reliable explanations. Furthermore, the proposed model performs well in both zero-shot and other test environments, showcasing its adaptability across diverse benchmark datasets. The proposed approach further enhances the hallucination detection research by introducing a novel approach to integrating interpretability with hallucination detection, which further enhances the performance and reliability of evaluating hallucinations in language models.
Related papers
- REFIND: Retrieval-Augmented Factuality Hallucination Detection in Large Language Models [15.380441563675243]
Hallucinations in large language model (LLM) outputs severely limit their reliability in knowledge-intensive tasks such as question answering.
We introduce REFIND, a novel framework that detects hallucinated spans within LLM outputs by directly leveraging retrieved documents.
REFIND demonstrated robustness across nine languages, including low-resource settings, and significantly outperformed baseline models.
arXiv Detail & Related papers (2025-02-19T10:59:05Z) - A Debate-Driven Experiment on LLM Hallucinations and Accuracy [7.821303946741665]
This study investigates the phenomenon of hallucination in large language models (LLMs)
Multiple instances of GPT-4o-Mini models engage in a debate-like interaction prompted with questions from the TruthfulQA dataset.
One model is deliberately instructed to generate plausible but false answers while the other models are asked to respond truthfully.
arXiv Detail & Related papers (2024-10-25T11:41:27Z) - Iter-AHMCL: Alleviate Hallucination for Large Language Model via Iterative Model-level Contrastive Learning [16.883679810267342]
Iterative Model-level Contrastive Learning (Iter-AHMCL) to address hallucination.
This paper introduces a novel approach called Iterative Model-level Contrastive Learning (Iter-AHMCL) to address hallucination.
arXiv Detail & Related papers (2024-10-16T00:15:40Z) - SLM Meets LLM: Balancing Latency, Interpretability and Consistency in Hallucination Detection [10.54378596443678]
Large language models (LLMs) are highly capable but face latency challenges in real-time applications.
This study optimize the real-time interpretable hallucination detection by introducing effective prompting techniques.
arXiv Detail & Related papers (2024-08-22T22:13:13Z) - PoLLMgraph: Unraveling Hallucinations in Large Language Models via State Transition Dynamics [51.17512229589]
PoLLMgraph is a model-based white-box detection and forecasting approach for large language models.
We show that hallucination can be effectively detected by analyzing the LLM's internal state transition dynamics.
Our work paves a new way for model-based white-box analysis of LLMs, motivating the research community to further explore, understand, and refine the intricate dynamics of LLM behaviors.
arXiv Detail & Related papers (2024-04-06T20:02:20Z) - IBD: Alleviating Hallucinations in Large Vision-Language Models via
Image-Biased Decoding [37.16880672402059]
Over-reliance on linguistic priors has been identified as a key factor leading to hallucinations.
We propose to alleviate this problem by introducing a novel image-biased decoding technique.
Our method derives the next-token probability distribution by contrasting predictions from a conventional LVLM with those of an image-biased LVLM.
arXiv Detail & Related papers (2024-02-28T16:57:22Z) - Retrieve Only When It Needs: Adaptive Retrieval Augmentation for Hallucination Mitigation in Large Language Models [68.91592125175787]
Hallucinations pose a significant challenge for the practical implementation of large language models (LLMs)
We present Rowen, a novel approach that enhances LLMs with a selective retrieval augmentation process tailored to address hallucinations.
arXiv Detail & Related papers (2024-02-16T11:55:40Z) - Improving Factual Consistency of News Summarization by Contrastive Preference Optimization [65.11227166319546]
Large language models (LLMs) generate summaries that are factually inconsistent with original articles.
These hallucinations are challenging to detect through traditional methods.
We propose Contrastive Preference Optimization (CPO) to disentangle the LLMs' propensities to generate faithful and fake content.
arXiv Detail & Related papers (2023-10-30T08:40:16Z) - AutoHall: Automated Hallucination Dataset Generation for Large Language Models [56.92068213969036]
This paper introduces a method for automatically constructing model-specific hallucination datasets based on existing fact-checking datasets called AutoHall.
We also propose a zero-resource and black-box hallucination detection method based on self-contradiction.
arXiv Detail & Related papers (2023-09-30T05:20:02Z) - Siren's Song in the AI Ocean: A Survey on Hallucination in Large
Language Models [116.01843550398183]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of downstream tasks.
LLMs occasionally generate content that diverges from the user input, contradicts previously generated context, or misaligns with established world knowledge.
arXiv Detail & Related papers (2023-09-03T16:56:48Z) - A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of
LLMs by Validating Low-Confidence Generation [76.34411067299331]
Large language models often tend to 'hallucinate' which critically hampers their reliability.
We propose an approach that actively detects and mitigates hallucinations during the generation process.
We show that the proposed active detection and mitigation approach successfully reduces the hallucinations of the GPT-3.5 model from 47.5% to 14.5% on average.
arXiv Detail & Related papers (2023-07-08T14:25:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.