HALoGEN: Fantastic LLM Hallucinations and Where to Find Them
- URL: http://arxiv.org/abs/2501.08292v1
- Date: Tue, 14 Jan 2025 18:13:08 GMT
- Title: HALoGEN: Fantastic LLM Hallucinations and Where to Find Them
- Authors: Abhilasha Ravichander, Shrusti Ghela, David Wadden, Yejin Choi,
- Abstract summary: We release HALoGEN, a comprehensive hallucination benchmark consisting of 10,923 prompts for generative models spanning nine domains.
We use this framework to evaluate 150,000 generations from 14 language models, finding that even the best-performing models are riddled with hallucinations.
- Score: 39.678012380996854
- License:
- Abstract: Despite their impressive ability to generate high-quality and fluent text, generative large language models (LLMs) also produce hallucinations: statements that are misaligned with established world knowledge or provided input context. However, measuring hallucination can be challenging, as having humans verify model generations on-the-fly is both expensive and time-consuming. In this work, we release HALoGEN, a comprehensive hallucination benchmark consisting of: (1) 10,923 prompts for generative models spanning nine domains including programming, scientific attribution, and summarization, and (2) automatic high-precision verifiers for each use case that decompose LLM generations into atomic units, and verify each unit against a high-quality knowledge source. We use this framework to evaluate ~150,000 generations from 14 language models, finding that even the best-performing models are riddled with hallucinations (sometimes up to 86% of generated atomic facts depending on the domain). We further define a novel error classification for LLM hallucinations based on whether they likely stem from incorrect recollection of training data (Type A errors), or incorrect knowledge in training data (Type B errors), or are fabrication (Type C errors). We hope our framework provides a foundation to enable the principled study of why generative models hallucinate, and advances the development of trustworthy large language models.
Related papers
- LLMs Will Always Hallucinate, and We Need to Live With This [1.3810901729134184]
This work argues that hallucinations in language models are not just occasional errors but an inevitable feature of these systems.
It is, therefore, impossible to eliminate them through architectural improvements, dataset enhancements, or fact-checking mechanisms.
arXiv Detail & Related papers (2024-09-09T16:01:58Z) - Knowledge Overshadowing Causes Amalgamated Hallucination in Large Language Models [65.32990889402927]
We coin this phenomenon as knowledge overshadowing''
We show that the hallucination rate grows with both the imbalance ratio and the length of dominant condition description.
We propose to utilize overshadowing conditions as a signal to catch hallucination before it is produced.
arXiv Detail & Related papers (2024-07-10T20:37:42Z) - Unfamiliar Finetuning Examples Control How Language Models Hallucinate [75.03210107477157]
Large language models are known to hallucinate when faced with unfamiliar queries.
We find that unfamiliar examples in the models' finetuning data are crucial in shaping these errors.
Our work further investigates RL finetuning strategies for improving the factuality of long-form model generations.
arXiv Detail & Related papers (2024-03-08T18:28:13Z) - Alleviating Hallucinations of Large Language Models through Induced
Hallucinations [67.35512483340837]
Large language models (LLMs) have been observed to generate responses that include inaccurate or fabricated information.
We propose a simple textitInduce-then-Contrast Decoding (ICD) strategy to alleviate hallucinations.
arXiv Detail & Related papers (2023-12-25T12:32:49Z) - HALO: An Ontology for Representing and Categorizing Hallucinations in Large Language Models [2.9312156642007294]
Hallucination Ontology (HALO) is written in OWL and supports six different types of hallucinations known to arise in large language models (LLMs)
We publish a dataset containing hallucinations that we inductively gathered across multiple independent Web sources, and show that HALO can be successfully used to model this dataset and answer competency questions.
arXiv Detail & Related papers (2023-12-08T17:57:20Z) - Calibrated Language Models Must Hallucinate [11.891340760198798]
Recent language models generate false but plausible-sounding text with surprising frequency.
This work shows that there is an inherent statistical lower-bound on the rate that pretrained language models hallucinate certain types of facts.
For "arbitrary" facts whose veracity cannot be determined from the training data, we show that hallucinations must occur at a certain rate for language models.
arXiv Detail & Related papers (2023-11-24T18:29:50Z) - AutoHall: Automated Hallucination Dataset Generation for Large Language Models [56.92068213969036]
This paper introduces a method for automatically constructing model-specific hallucination datasets based on existing fact-checking datasets called AutoHall.
We also propose a zero-resource and black-box hallucination detection method based on self-contradiction.
arXiv Detail & Related papers (2023-09-30T05:20:02Z) - Reducing Hallucinations in Neural Machine Translation with Feature
Attribution [54.46113444757899]
We present a case study focusing on model understanding and regularisation to reduce hallucinations in NMT.
We first use feature attribution methods to study the behaviour of an NMT model that produces hallucinations.
We then leverage these methods to propose a novel loss function that substantially helps reduce hallucinations and does not require retraining the model from scratch.
arXiv Detail & Related papers (2022-11-17T20:33:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.