Hallucination Detection and Hallucination Mitigation: An Investigation
- URL: http://arxiv.org/abs/2401.08358v1
- Date: Tue, 16 Jan 2024 13:36:07 GMT
- Title: Hallucination Detection and Hallucination Mitigation: An Investigation
- Authors: Junliang Luo, Tianyu Li, Di Wu, Michael Jenkin, Steve Liu, Gregory
Dudek
- Abstract summary: Large language models (LLMs) have achieved remarkable successes over the last two years in a range of different applications.
This report aims to present a comprehensive review of the current literature on both hallucination detection and hallucination mitigation.
- Score: 13.941799495842776
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs), including ChatGPT, Bard, and Llama, have
achieved remarkable successes over the last two years in a range of different
applications. In spite of these successes, there exist concerns that limit the
wide application of LLMs. A key problem is the problem of hallucination.
Hallucination refers to the fact that in addition to correct responses, LLMs
can also generate seemingly correct but factually incorrect responses. This
report aims to present a comprehensive review of the current literature on both
hallucination detection and hallucination mitigation. We hope that this report
can serve as a good reference for both engineers and researchers who are
interested in LLMs and applying them to real world tasks.
Related papers
- A Survey of Hallucination in Large Visual Language Models [48.794850395309076]
The existence of hallucinations has limited the potential and practical effectiveness of LVLM in various fields.
The structure of LVLMs and main causes of hallucination generation are introduced.
The available hallucination evaluation benchmarks for LVLMs are presented.
arXiv Detail & Related papers (2024-10-20T10:58:58Z) - MedHalu: Hallucinations in Responses to Healthcare Queries by Large Language Models [26.464489158584463]
We conduct a pioneering study of hallucinations in LLM-generated responses to real-world healthcare queries from patients.
We propose MedHalu, a carefully crafted first-of-its-kind medical hallucination dataset with a diverse range of health-related topics.
We also introduce MedHaluDetect framework to evaluate capabilities of various LLMs in detecting hallucinations.
arXiv Detail & Related papers (2024-09-29T00:09:01Z) - Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models [70.19081534515371]
Large Language Models (LLMs) have gained widespread adoption in various natural language processing tasks.
They generate unfaithful or inconsistent content that deviates from the input source, leading to severe consequences.
We propose a robust discriminator named RelD to effectively detect hallucination in LLMs' generated answers.
arXiv Detail & Related papers (2024-07-04T18:47:42Z) - Hallucination is Inevitable: An Innate Limitation of Large Language
Models [3.8711997449980844]
We show that it is impossible to eliminate hallucination in large language models.
Since the formal world is a part of the real world which is much more complicated, hallucinations are also inevitable for real world LLMs.
arXiv Detail & Related papers (2024-01-22T10:26:14Z) - The Dawn After the Dark: An Empirical Study on Factuality Hallucination
in Large Language Models [134.6697160940223]
hallucination poses great challenge to trustworthy and reliable deployment of large language models.
Three key questions should be well studied: how to detect hallucinations (detection), why do LLMs hallucinate (source), and what can be done to mitigate them.
This work presents a systematic empirical study on LLM hallucination, focused on the the three aspects of hallucination detection, source and mitigation.
arXiv Detail & Related papers (2024-01-06T12:40:45Z) - A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions [40.79317187623401]
The emergence of large language models (LLMs) has marked a significant breakthrough in natural language processing (NLP)
LLMs are prone to hallucination, generating plausible yet nonfactual content.
This phenomenon raises significant concerns over the reliability of LLMs in real-world information retrieval systems.
arXiv Detail & Related papers (2023-11-09T09:25:37Z) - Siren's Song in the AI Ocean: A Survey on Hallucination in Large
Language Models [116.01843550398183]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of downstream tasks.
LLMs occasionally generate content that diverges from the user input, contradicts previously generated context, or misaligns with established world knowledge.
arXiv Detail & Related papers (2023-09-03T16:56:48Z) - Evaluation and Analysis of Hallucination in Large Vision-Language Models [49.19829480199372]
Large Vision-Language Models (LVLMs) have recently achieved remarkable success.
LVLMs are still plagued by the hallucination problem.
Hallucination refers to the information of LVLMs' responses that does not exist in the visual input.
arXiv Detail & Related papers (2023-08-29T08:51:24Z) - HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large
Language Models [146.87696738011712]
Large language models (LLMs) are prone to generate hallucinations, i.e., content that conflicts with the source or cannot be verified by the factual knowledge.
To understand what types of content and to which extent LLMs are apt to hallucinate, we introduce the Hallucination Evaluation benchmark for Large Language Models (HaluEval)
arXiv Detail & Related papers (2023-05-19T15:36:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.