Related papers: Distinguishing Ignorance from Error in LLM Hallucinations

URL: http://arxiv.org/abs/2410.22071v1
Date: Tue, 29 Oct 2024 14:31:33 GMT
Title: Distinguishing Ignorance from Error in LLM Hallucinations
Authors: Adi Simhi, Jonathan Herzig, Idan Szpektor, Yonatan Belinkov
Abstract summary: We focus on closed-book Question Answering (CBQA), where previous work has not fully addressed the distinction between two possible kinds of hallucinations. We argue that distinguishing these cases is crucial for detecting and mitigating hallucinations.
Score: 43.62904897907926
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) are susceptible to hallucinations: outputs that are ungrounded, factually incorrect, or inconsistent with prior generations. We focus on closed-book Question Answering (CBQA), where previous work has not fully addressed the distinction between two possible kinds of hallucinations, namely, whether the model (1) does not hold the correct answer in its parameters or (2) answers incorrectly despite having the required knowledge. We argue that distinguishing these cases is crucial for detecting and mitigating hallucinations. Specifically, case (2) may be mitigated by intervening in the model's internal computation, as the knowledge resides within the model's parameters. In contrast, in case (1) there is no parametric knowledge to leverage for mitigation, so it should be addressed by resorting to an external knowledge source or abstaining. To help distinguish between the two cases, we introduce Wrong Answer despite having Correct Knowledge (WACK), an approach for constructing model-specific datasets for the second hallucination type. Our probing experiments indicate that the two kinds of hallucinations are represented differently in the model's inner states. Next, we show that datasets constructed using WACK exhibit variations across models, demonstrating that even when models share knowledge of certain facts, they still vary in the specific examples that lead to hallucinations. Finally, we show that training a probe on our WACK datasets leads to better detection of case (2) hallucinations than using the common generic one-size-fits-all datasets. The code is available at https://github.com/technion-cs-nlp/hallucination-mitigation.

Related papers

Trust Me, I'm Wrong: High-Certainty Hallucinations in LLMs [45.13670875211498]
Large Language Models (LLMs) often generate outputs that lack grounding in real-world facts, a phenomenon known as hallucinations. We show that models can hallucinate with high certainty even when they have the correct knowledge.
arXiv Detail & Related papers (2025-02-18T15:46:31Z)
Investigating and Mitigating Object Hallucinations in Pretrained Vision-Language (CLIP) Models [22.42712853647949]
We present an in-depth investigation into the object hallucination problem specifically within the CLIP model. We unveil that even in isolation, the CLIP model is prone to object hallucinations, suggesting that the hallucination problem is not solely due to the interaction between vision and language modalities. We show that the enhanced model can be employed as a visual encoder, effectively alleviating the object hallucination issue in LVLMs.
arXiv Detail & Related papers (2024-10-04T06:24:49Z)
Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability [83.0884072598828]
Hallucinations come in many forms, and there is no universally accepted definition. We focus on studying only those hallucinations where a correct answer appears verbatim in the training set. We find that for a fixed dataset, larger and longer-trained LMs hallucinate less. While we see that detector size improves performance on a fixed LM's outputs, we find an inverse relationship between the scale of the LM and the detectability of its hallucinations.
arXiv Detail & Related papers (2024-08-14T23:34:28Z)
Knowledge Overshadowing Causes Amalgamated Hallucination in Large Language Models [65.32990889402927]
We coin this phenomenon "knowledge overshadowing". We show that the hallucination rate grows with both the imbalance ratio and the length of the dominant condition description. We propose to utilize overshadowing conditions as a signal to catch hallucination before it is produced.
arXiv Detail & Related papers (2024-07-10T20:37:42Z)
VideoHallucer: Evaluating Intrinsic and Extrinsic Hallucinations in Large Video-Language Models [59.05674402770661]
This work introduces VideoHallucer, the first comprehensive benchmark for hallucination detection in large video-language models (LVLMs). VideoHallucer categorizes hallucinations into two main types: intrinsic and extrinsic, offering further subcategories for detailed analysis.
arXiv Detail & Related papers (2024-06-24T06:21:59Z)
Mitigating Large Language Model Hallucination with Faithful Finetuning [46.33663932554782]
Large language models (LLMs) have demonstrated remarkable performance on various natural language processing tasks. However, they are prone to generating fluent yet untruthful responses, known as "hallucinations".
arXiv Detail & Related papers (2024-06-17T07:16:07Z)
On Large Language Models' Hallucination with Regard to Known Facts [74.96789694959894]
Large language models are successful in answering factoid questions but are also prone to hallucination. We investigate the phenomenon of LLMs possessing correct answer knowledge yet still hallucinating from the perspective of inference dynamics. Our study sheds light on understanding the reasons for LLMs' hallucinations on their known facts, and more importantly, on accurately predicting when they are hallucinating.
arXiv Detail & Related papers (2024-03-29T06:48:30Z)
Unfamiliar Finetuning Examples Control How Language Models Hallucinate [75.03210107477157]
Large language models are known to hallucinate when faced with unfamiliar queries. We find that unfamiliar examples in the models' finetuning data are crucial in shaping these errors. Our work further investigates RL finetuning strategies for improving the factuality of long-form model generations.
arXiv Detail & Related papers (2024-03-08T18:28:13Z)
Hallucinations in Neural Automatic Speech Recognition: Identifying Errors and Hallucinatory Models [11.492702369437785]
Hallucinations are semantically unrelated to the source utterance, yet still fluent and coherent. We show that commonly used metrics, such as word error rates, cannot differentiate between hallucinatory and non-hallucinatory models. We devise a framework for identifying hallucinations by analysing their semantic connection with the ground truth and their fluency.
arXiv Detail & Related papers (2024-01-03T06:56:56Z)
On Early Detection of Hallucinations in Factual Question Answering [4.76359068115052]
Hallucinations remain a major impediment towards gaining user trust. In this work, we explore whether the artifacts associated with the model generations can provide hints that the generation will contain hallucinations. Our results show that the distributions of these artifacts tend to differ between hallucinated and non-hallucinated generations.
arXiv Detail & Related papers (2023-12-19T14:35:04Z)
HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data [102.56792377624927]
Hallucinations inherent in machine-generated data remain under-explored. We present a novel hallucination detection and elimination framework, HalluciDoctor, based on the cross-checking paradigm. Our method successfully mitigates 44.6% of hallucinations in relative terms and maintains competitive performance compared to LLaVA.
arXiv Detail & Related papers (2023-11-22T04:52:58Z)
On the Origin of Hallucinations in Conversational Models: Is it the Datasets or the Models? [32.41234580068662]
We conduct a study on existing knowledge-grounded conversational benchmarks and several state-of-the-art models. Standard benchmarks consist of >60% hallucinated responses, leading to models that not only hallucinate but even amplify hallucinations. Our findings raise important questions on the quality of existing datasets and models trained using them.
arXiv Detail & Related papers (2022-04-17T05:15:24Z)

This list is automatically generated from the titles and abstracts of the papers on this site.