Chain of Natural Language Inference for Reducing Large Language Model
Ungrounded Hallucinations
- URL: http://arxiv.org/abs/2310.03951v2
- Date: Mon, 9 Oct 2023 18:15:21 GMT
- Title: Chain of Natural Language Inference for Reducing Large Language Model
Ungrounded Hallucinations
- Authors: Deren Lei, Yaxi Li, Mengya Hu, Mingyu Wang, Vincent Yun, Emily Ching,
Eslam Kamal
- Abstract summary: Large language models (LLMs) can generate fluent natural language texts when given relevant documents as background context.
LLMs are prone to generate hallucinations that are not supported by the provided sources.
We propose a hierarchical framework to detect and mitigate such ungrounded hallucination.
- Score: 3.9566468090516067
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) can generate fluent natural language texts when
given relevant documents as background context. This ability has attracted
considerable interest in developing industry applications of LLMs. However,
LLMs are prone to generate hallucinations that are not supported by the
provided sources. In this paper, we propose a hierarchical framework to detect
and mitigate such ungrounded hallucination. Our framework uses Chain of Natural
Language Inference (CoNLI) for hallucination detection and hallucination
reduction via post-editing. Our approach achieves state-of-the-art performance
on hallucination detection and enhances text quality through rewrite, using
LLMs without any fine-tuning or domain-specific prompt engineering. We show
that this simple plug-and-play framework can serve as an effective choice for
hallucination detection and reduction, achieving competitive performance across
various contexts.
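To make the detect-then-rewrite idea above concrete, below is a minimal sketch of a sentence-level NLI check followed by a post-editing step, assuming pluggable `nli_entails` and `rewrite` components; the function names, the naive sentence splitting, and the dummy components are illustrative assumptions, and the hierarchical entity-level checks of the actual CoNLI framework are omitted.

```python
from typing import Callable, List

def detect_and_rewrite(
    source: str,
    response: str,
    nli_entails: Callable[[str, str], float],  # assumed scorer: entailment probability of a claim given the source
    rewrite: Callable[[str, List[str]], str],  # assumed LLM post-editor given the flagged sentences
    threshold: float = 0.5,
) -> str:
    """Minimal sketch of a sentence-level detect-then-rewrite loop (illustrative only)."""
    # Split the response into candidate claims (naive split for illustration).
    sentences = [s.strip() for s in response.split(".") if s.strip()]
    # Flag sentences that the source does not entail.
    unsupported = [s for s in sentences if nli_entails(source, s) < threshold]
    # Return unchanged if grounded; otherwise hand the flagged sentences to the post-editor.
    return response if not unsupported else rewrite(response, unsupported)


# Toy usage with dummy components; real usage would plug in an NLI model and an LLM.
dummy_nli = lambda premise, claim: 1.0 if claim in premise else 0.0
dummy_rewrite = lambda text, flagged: ". ".join(
    s.strip() for s in text.split(".") if s.strip() and s.strip() not in flagged
) + "."

src = "The meeting was moved to Friday. Attendance is optional."
resp = "The meeting was moved to Friday. The CEO will attend."
print(detect_and_rewrite(src, resp, dummy_nli, dummy_rewrite))
# -> "The meeting was moved to Friday."
```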
Related papers
- Investigating the Role of Prompting and External Tools in Hallucination Rates of Large Language Models [0.0]
Large Language Models (LLMs) are powerful computational models trained on extensive corpora of human-readable text, enabling them to perform general-purpose language understanding and generation.
Despite these successes, LLMs often produce inaccuracies, commonly referred to as hallucinations.
This paper provides an empirical evaluation of different prompting strategies and frameworks aimed at reducing hallucinations in LLMs.
arXiv Detail & Related papers (2024-10-25T08:34:53Z)
- ETF: An Entity Tracing Framework for Hallucination Detection in Code Summaries [29.561699707926056]
Large language models (LLMs) are prone to hallucination: outputs that stray from intended meanings.
We introduce a first-of-its-kind dataset with ~10K samples, curated specifically for hallucination detection in code summarization.
arXiv Detail & Related papers (2024-10-17T19:38:55Z)
- Mitigating Multilingual Hallucination in Large Vision-Language Models [35.75851356840673]
We propose a two-stage Multilingual Hallucination Removal (MHR) framework for Large Vision-Language Models (LVLMs).
Instead of relying on the intricate manual annotations of multilingual resources, we propose a novel cross-lingual alignment method.
Our framework delivers an average increase of 19.0% in accuracy across 13 different languages.
arXiv Detail & Related papers (2024-08-01T13:34:35Z)
- Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models [70.19081534515371]
Large Language Models (LLMs) have gained widespread adoption in various natural language processing tasks.
They can generate unfaithful or inconsistent content that deviates from the input source, which can lead to severe consequences.
We propose a robust discriminator named RelD to effectively detect hallucination in LLMs' generated answers.
arXiv Detail & Related papers (2024-07-04T18:47:42Z)
- Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback [48.065569871444275]
We propose detecting and mitigating hallucinations in Large Vision Language Models (LVLMs) via fine-grained AI feedback.
We generate a small hallucination-annotation dataset using proprietary models.
Then, we propose a detect-then-rewrite pipeline to automatically construct a preference dataset for training a hallucination-mitigation model (a small sketch of the preference-pair construction follows this entry).
arXiv Detail & Related papers (2024-04-22T14:46:10Z)
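As referenced in the entry above, a detect-then-rewrite pipeline can be turned into preference data by pairing each original (hallucinated) response with its corrected rewrite. The record schema and the `detect`/`rewrite` callables below are illustrative assumptions, not the paper's exact pipeline.

```python
from typing import Callable, Dict, List

def build_preference_records(
    samples: List[Dict[str, str]],            # assumed schema: each sample has "prompt" and "response" keys
    detect: Callable[[str, str], bool],       # assumed hallucination detector over (prompt, response)
    rewrite: Callable[[str, str], str],       # assumed rewriter that removes hallucinated content
) -> List[Dict[str, str]]:
    """Pair hallucinated responses (rejected) with their rewrites (chosen) for preference training."""
    records = []
    for s in samples:
        if detect(s["prompt"], s["response"]):
            records.append({
                "prompt": s["prompt"],
                "rejected": s["response"],                      # original, hallucinated output
                "chosen": rewrite(s["prompt"], s["response"]),  # corrected output
            })
    return records
```

Records in this chosen/rejected form could then feed standard preference-optimization trainers.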
- Hallucination Diversity-Aware Active Learning for Text Summarization [46.00645048690819]
Large Language Models (LLMs) have shown a propensity to generate hallucinated outputs, i.e., texts that are factually incorrect or unsupported.
Existing methods for alleviating hallucinations typically require costly human annotations to identify and correct hallucinations in LLM outputs.
We propose the first active learning framework for alleviating LLM hallucinations, reducing the costly human annotation of hallucinations that is otherwise needed.
arXiv Detail & Related papers (2024-04-02T02:30:27Z)
- Alleviating Hallucinations of Large Language Models through Induced Hallucinations [67.35512483340837]
Large language models (LLMs) have been observed to generate responses that include inaccurate or fabricated information.
We propose a simple Induce-then-Contrast Decoding (ICD) strategy to alleviate hallucinations (a minimal decoding sketch follows this entry).
arXiv Detail & Related papers (2023-12-25T12:32:49Z)
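As referenced in the entry above, the contrast step of an induce-then-contrast scheme can be sketched at the logit level: tokens favored by a deliberately hallucination-prone model are down-weighted when decoding with the base model. The mixing weight `alpha` and the greedy token selection are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def contrastive_step(base_logits: np.ndarray,
                     induced_logits: np.ndarray,
                     alpha: float = 1.0) -> int:
    """Pick the next token by penalizing tokens the hallucination-induced model prefers (sketch)."""
    contrasted = (1.0 + alpha) * base_logits - alpha * induced_logits
    return int(np.argmax(contrasted))

# Toy example: token 2 is strongly preferred by the induced (hallucinating) model, so it is demoted.
base = np.array([2.0, 1.5, 2.1])     # the base model alone would pick token 2
induced = np.array([0.1, 0.1, 3.0])  # the induced model strongly prefers token 2
print(contrastive_step(base, induced))  # -> 0
```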
- AutoHall: Automated Hallucination Dataset Generation for Large Language Models [56.92068213969036]
This paper introduces AutoHall, a method for automatically constructing model-specific hallucination datasets from existing fact-checking datasets.
We also propose a zero-resource, black-box hallucination detection method based on self-contradiction (a small detection sketch follows this entry).
arXiv Detail & Related papers (2023-09-30T05:20:02Z)
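As referenced in the entry above, a zero-resource, black-box self-contradiction check can be sketched as: re-sample several answers to the same prompt and flag the original answer if the re-samples contradict it. The sampler, the NLI-style contradiction checker, and the majority threshold below are illustrative assumptions.

```python
from typing import Callable, List

def self_contradiction_flag(
    prompt: str,
    answer: str,
    sample: Callable[[str], str],             # assumed black-box sampler for the same LLM
    contradicts: Callable[[str, str], bool],  # assumed NLI-style contradiction checker
    n_samples: int = 5,
) -> bool:
    """Flag `answer` as a likely hallucination if fresh samples contradict it (sketch)."""
    others: List[str] = [sample(prompt) for _ in range(n_samples)]
    hits = sum(contradicts(answer, o) for o in others)
    # A majority of disagreeing re-samples is treated as self-contradiction; the threshold is arbitrary.
    return hits > n_samples // 2
```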
- HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models [146.87696738011712]
Large language models (LLMs) are prone to generate hallucinations, i.e., content that conflicts with the source or cannot be verified by the factual knowledge.
To understand what types of content, and to what extent, LLMs are apt to hallucinate, we introduce the Hallucination Evaluation benchmark for Large Language Models (HaluEval).
arXiv Detail & Related papers (2023-05-19T15:36:27Z)