Quantifying and Attributing the Hallucination of Large Language Models
via Association Analysis
- URL: http://arxiv.org/abs/2309.05217v1
- Date: Mon, 11 Sep 2023 03:35:00 GMT
- Title: Quantifying and Attributing the Hallucination of Large Language Models
via Association Analysis
- Authors: Li Du, Yequan Wang, Xingrun Xing, Yiqun Yao, Xiang Li, Xin Jiang,
Xuezhi Fang
- Abstract summary: Large language models (LLMs) suffer from the hallucination problem, which threatens their reliability.
Previous works first categorize hallucinations according to phenomenon similarity, then quantify the proportion of model outputs that contain hallucinatory content.
We combine hallucination-level quantification and hallucination-reason investigation through an association analysis, which relates the hallucination rate of LLMs to a set of risk factors.
- Score: 29.043008337391075
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although demonstrating superb performance on various NLP tasks, large
language models (LLMs) still suffer from the hallucination problem, which
threatens the reliability of LLMs. To measure the level of hallucination of
LLMs, previous works first categorize hallucinations according to phenomenon
similarity, then quantify the proportion of model outputs that contain
hallucinatory content. However, such hallucination rates can easily be
distorted by confounders. Moreover, they cannot reflect the reasons for the
hallucination, as similar hallucinatory phenomena may originate from different
sources. To address these issues, we propose to combine hallucination-level
quantification and hallucination-reason investigation through an association
analysis, which relates the hallucination rate of LLMs to a set of risk
factors. In this way, we can observe the hallucination level under each value
of each risk factor and examine the contribution and statistical significance
of each risk factor while excluding the confounding effect of the other
factors.
Additionally, by recognizing the risk factors according to a taxonomy of model
capability, we reveal a set of potential deficiencies in commonsense
memorization, relational reasoning, and instruction following, which may
further provide guidance for the pretraining and supervised fine-tuning process
of LLMs to mitigate hallucination.
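As a minimal illustration of the association analysis described in the abstract, the sketch below relates a binary hallucination label to a few risk factors via logistic regression: a grouped view shows the hallucination rate under each value of a single factor, while the regression coefficients and p-values estimate each factor's contribution with the other factors held fixed. The factor names (commonsense_freq, relation_hops, instruction_followed) and the simulated records are hypothetical stand-ins that only loosely mirror the capability taxonomy above; they are not the paper's actual risk factors, data, or implementation.

```python
# Hypothetical sketch: association analysis between a binary hallucination
# label and a set of illustrative risk factors (not the paper's variables).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500

# One row per model response, annotated with hypothetical risk factors.
df = pd.DataFrame({
    "commonsense_freq":     rng.choice(["rare", "common"], size=n),  # commonsense memorization
    "relation_hops":        rng.integers(1, 4, size=n),              # relational reasoning
    "instruction_followed": rng.integers(0, 2, size=n),              # instruction following
})

# Simulate labels: rarer facts and more reasoning hops raise hallucination risk,
# following an instruction lowers it.
logit_p = (
    -1.5
    + 1.2 * (df["commonsense_freq"] == "rare")
    + 0.6 * (df["relation_hops"] - 1)
    - 0.8 * df["instruction_followed"]
)
df["hallucinated"] = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit_p)))

# Marginal view: hallucination rate under each value of a single risk factor.
print(df.groupby("commonsense_freq")["hallucinated"].mean())

# Multivariate view: the logistic regression reports each factor's contribution
# and p-value while the other factors are held fixed, which is what removes
# their confounding effect.
model = smf.logit(
    "hallucinated ~ C(commonsense_freq) + relation_hops + instruction_followed",
    data=df,
).fit(disp=False)
print(model.summary())
```

In this framing, the per-group means correspond to the hallucination level under each value of a risk factor, and the multivariate fit is what separates a factor's own contribution from the confounding of the others.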
Related papers
- Interpreting and Mitigating Hallucination in MLLMs through Multi-agent Debate [34.17353224636788]
We argue that hallucination in MLLMs is partially due to a lack of slow-thinking and divergent-thinking in these models.
Our approach can not only mitigate hallucinations but also interpret why they occur and detail the specifics of the hallucination.
arXiv Detail & Related papers (2024-07-30T02:41:32Z) - ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models [65.12177400764506]
Large language models (LLMs) exhibit hallucinations in long-form question-answering tasks across various domains and wide applications.
Current hallucination detection and mitigation datasets are limited in domains and sizes.
This paper introduces an iterative self-training framework that simultaneously and progressively scales up the hallucination annotation dataset.
arXiv Detail & Related papers (2024-07-05T17:56:38Z) - Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models [70.19081534515371]
Large Language Models (LLMs) have gained widespread adoption in various natural language processing tasks.
They can generate unfaithful or inconsistent content that deviates from the input source, leading to severe consequences.
We propose a robust discriminator named RelD to effectively detect hallucination in LLMs' generated answers.
arXiv Detail & Related papers (2024-07-04T18:47:42Z) - Confabulation: The Surprising Value of Large Language Model Hallucinations [0.7249731529275342]
We argue that measurable semantic characteristics of LLM confabulations mirror a human propensity to utilize increased narrativity as a cognitive resource for sense-making and communication.
This finding reveals a tension in our usually dismissive understandings of confabulation.
arXiv Detail & Related papers (2024-06-06T15:32:29Z) - Exploring and Evaluating Hallucinations in LLM-Powered Code Generation [14.438161741833687]
Large Language Models (LLMs) produce outputs that deviate from users' intent, exhibit internal inconsistencies, or misalign with factual knowledge.
Existing work mainly focuses on investigating hallucination in the domain of natural language generation.
We conduct a thematic analysis of the LLM-generated code to summarize and categorize the hallucinations present in it.
We propose HalluCode, a benchmark for evaluating the performance of code LLMs in recognizing hallucinations.
arXiv Detail & Related papers (2024-04-01T07:31:45Z) - Retrieve Only When It Needs: Adaptive Retrieval Augmentation for Hallucination Mitigation in Large Language Models [68.91592125175787]
Hallucinations pose a significant challenge for the practical implementation of large language models (LLMs).
We present Rowen, a novel approach that enhances LLMs with a selective retrieval augmentation process tailored to address hallucinations.
arXiv Detail & Related papers (2024-02-16T11:55:40Z) - Alleviating Hallucinations of Large Language Models through Induced
Hallucinations [67.35512483340837]
Large language models (LLMs) have been observed to generate responses that include inaccurate or fabricated information.
We propose a simple Induce-then-Contrast Decoding (ICD) strategy to alleviate hallucinations; a generic contrast-based decoding step is sketched after this list.
arXiv Detail & Related papers (2023-12-25T12:32:49Z) - HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data [102.56792377624927]
Hallucinations inherent in machine-generated data remain under-explored.
We present a novel hallucination detection and elimination framework, HalluciDoctor, based on the cross-checking paradigm.
Our method successfully mitigates 44.6% of hallucinations (relative) and maintains competitive performance compared to LLaVA.
arXiv Detail & Related papers (2023-11-22T04:52:58Z) - A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions [40.79317187623401]
The emergence of large language models (LLMs) has marked a significant breakthrough in natural language processing (NLP).
LLMs are prone to hallucination, generating plausible yet nonfactual content.
This phenomenon raises significant concerns over the reliability of LLMs in real-world information retrieval systems.
arXiv Detail & Related papers (2023-11-09T09:25:37Z) - Siren's Song in the AI Ocean: A Survey on Hallucination in Large
Language Models [116.01843550398183]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of downstream tasks.
LLMs occasionally generate content that diverges from the user input, contradicts previously generated context, or misaligns with established world knowledge.
arXiv Detail & Related papers (2023-09-03T16:56:48Z)
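For the Induce-then-Contrast Decoding (ICD) entry above, the following is a hypothetical sketch of a generic contrast-based greedy decoding step, assuming access to next-token logits from a base model and from a deliberately hallucination-prone ("induced") model. It is illustrative only; the paper's actual ICD formulation may differ.

```python
# Hypothetical sketch of a generic contrast-based decoding step in the spirit
# of Induce-then-Contrast Decoding (ICD); not the paper's exact formulation.
import torch

def contrast_decode_step(base_logits: torch.Tensor,
                         induced_logits: torch.Tensor,
                         alpha: float = 0.5) -> int:
    """Pick the next token by amplifying the base model's logits and penalizing
    tokens favored by a deliberately hallucination-prone ("induced") model."""
    contrasted = (1.0 + alpha) * base_logits - alpha * induced_logits
    return int(torch.argmax(contrasted).item())

# Toy usage with random logits over a small vocabulary.
vocab_size = 8
base = torch.randn(vocab_size)
induced = torch.randn(vocab_size)
print(contrast_decode_step(base, induced))
```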