Detecting Hallucinated Content in Conditional Neural Sequence Generation
- URL: http://arxiv.org/abs/2011.02593v3
- Date: Wed, 2 Jun 2021 20:26:55 GMT
- Title: Detecting Hallucinated Content in Conditional Neural Sequence Generation
- Authors: Chunting Zhou, Graham Neubig, Jiatao Gu, Mona Diab, Paco Guzman, Luke
Zettlemoyer, Marjan Ghazvininejad
- Abstract summary: We propose a task to predict whether each token in the output sequence is hallucinated (not contained in the input).
We also introduce a method for learning to detect hallucinations using pretrained language models fine-tuned on synthetic data.
- Score: 165.68948078624499
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural sequence models can generate highly fluent sentences, but recent
studies have also shown that they are prone to hallucinating additional
content not supported by the input. These fluent but wrong outputs are
particularly problematic because users cannot tell that they are being
presented with incorrect content. To detect these errors, we propose a
task to predict whether each token in the output sequence is hallucinated (not
contained in the input) and collect new manually annotated evaluation sets for
this task. We also introduce a method for learning to detect hallucinations
using pretrained language models fine-tuned on synthetic data that includes
automatically inserted hallucinations. Experiments on machine translation (MT)
and abstractive summarization demonstrate that our proposed approach
consistently outperforms strong baselines on all benchmark datasets. We further
demonstrate how to use the token-level hallucination labels to define a
fine-grained loss over the target sequence in low-resource MT and achieve
significant improvements over strong baseline methods. We also apply our method
to word-level quality estimation for MT and show its effectiveness in both
supervised and unsupervised settings. Code and data are available at
https://github.com/violet-zct/fairseq-detect-hallucination.
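As a rough illustration of the token-level detection task described in the abstract, the sketch below treats hallucination detection as binary token classification with a pretrained encoder. The model name, label convention, and source-target pair encoding are assumptions made for illustration; this is not taken from the released fairseq-detect-hallucination code.

```python
# Minimal sketch: token-level hallucination detection as binary token
# classification with a pretrained encoder (illustrative assumption, not the
# paper's released fairseq implementation).
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

MODEL_NAME = "xlm-roberta-base"  # assumed pretrained LM; label 1 = hallucinated

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForTokenClassification.from_pretrained(MODEL_NAME, num_labels=2)

def predict_hallucination_labels(source: str, target: str):
    """Return (token, label) pairs; label 1 means 'predicted hallucinated'."""
    # Encode source and output as a sentence pair so the classifier can
    # condition on the input when judging each output token.
    enc = tokenizer(source, target, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**enc).logits  # shape: (1, seq_len, 2)
    preds = logits.argmax(-1).squeeze(0).tolist()
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    # For simplicity this returns every encoder position; a full implementation
    # would keep only the positions belonging to the target segment.
    return list(zip(tokens, preds))

if __name__ == "__main__":
    # The classification head is randomly initialized here, so predictions are
    # meaningless until the model is fine-tuned on synthetic data with
    # automatically inserted hallucinations, as the abstract describes.
    print(predict_hallucination_labels(
        "Der Hund schläft im Garten.",
        "The dog sleeps in the garden next to a red car."))
```

After such fine-tuning, per-token predictions of this kind could supply the fine-grained labels that the abstract uses for loss weighting in low-resource MT and for word-level quality estimation.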
Related papers
- Mitigating Hallucinations in Large Vision-Language Models via Summary-Guided Decoding [14.701135083174918]
Large Vision-Language Models (LVLMs) generate detailed and coherent responses from visual inputs.
They are prone to generating hallucinations due to an over-reliance on language priors.
We propose a novel method, Summary-Guided Decoding (SGD).
arXiv Detail & Related papers (2024-10-17T08:24:27Z)
- Pre-Training Multimodal Hallucination Detectors with Corrupted Grounding Data [4.636499986218049]
Multimodal language models can exhibit hallucinations in their outputs, which limits their reliability.
We propose an approach to improve the sample efficiency of these models by creating corrupted grounding data.
arXiv Detail & Related papers (2024-08-30T20:11:00Z)
- Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback [48.065569871444275]
We propose detecting and mitigating hallucinations in Large Vision Language Models (LVLMs) via fine-grained AI feedback.
We generate a small hallucination annotation dataset using proprietary models.
Then, we propose a detect-then-rewrite pipeline to automatically construct a preference dataset for training a hallucination-mitigating model.
arXiv Detail & Related papers (2024-04-22T14:46:10Z)
- A New Benchmark and Reverse Validation Method for Passage-level Hallucination Detection [63.56136319976554]
Large Language Models (LLMs) generate hallucinations, which can cause significant damage when deployed for mission-critical tasks.
We propose a self-check approach based on reverse validation to detect factual errors automatically in a zero-resource fashion.
We empirically evaluate our method and existing zero-resource detection methods on two datasets.
arXiv Detail & Related papers (2023-10-10T10:14:59Z)
- Reducing Hallucinations in Neural Machine Translation with Feature Attribution [54.46113444757899]
We present a case study focusing on model understanding and regularisation to reduce hallucinations in NMT.
We first use feature attribution methods to study the behaviour of an NMT model that produces hallucinations.
We then leverage these methods to propose a novel loss function that substantially helps reduce hallucinations and does not require retraining the model from scratch.
arXiv Detail & Related papers (2022-11-17T20:33:56Z)
- Mutual Information Alleviates Hallucinations in Abstractive Summarization [73.48162198041884]
We find a simple criterion under which models are significantly more likely to assign more probability to hallucinated content during generation: high model uncertainty.
This finding offers a potential explanation for hallucinations: when uncertain about a continuation, models default to favoring text with high marginal probability.
We propose a decoding strategy that, when the model exhibits uncertainty, switches to optimizing for the pointwise mutual information of the source and target token rather than purely the probability of the target token (a minimal sketch of this switching criterion appears after this list).
arXiv Detail & Related papers (2022-10-24T13:30:54Z)
- A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation [50.55448707570669]
We propose a novel token-level, reference-free hallucination detection task and an associated annotated dataset named HaDes.
To create this dataset, we first perturb a large number of text segments extracted from English language Wikipedia, and then verify these with crowd-sourced annotations.
arXiv Detail & Related papers (2021-04-18T04:09:48Z)
- Controlling Hallucinations at Word Level in Data-to-Text Generation [10.59137381324694]
State-of-the-art neural models include misleading statements in their outputs.
We propose a Multi-Branch Decoder which is able to leverage word-level labels to learn the relevant parts of each training instance.
Our model is able to reduce and control hallucinations, while keeping fluency and coherence in generated texts.
arXiv Detail & Related papers (2021-02-04T18:58:28Z)
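The "Mutual Information Alleviates Hallucinations in Abstractive Summarization" entry above describes an uncertainty-triggered switch to a pointwise mutual information (PMI) objective. The sketch below illustrates that idea under stated assumptions: it presumes access to next-token logits from a conditional model p(y_t | y_<t, x) and an unconditional language model p(y_t | y_<t), and uses an entropy threshold as the uncertainty test, which is not necessarily that paper's exact formulation.

```python
# Illustrative sketch (not the cited paper's exact method): score next tokens
# with PMI instead of plain conditional log-probability when the conditional
# model is uncertain.
import torch

def decoding_scores(cond_logits, uncond_logits, entropy_threshold=3.0):
    """cond_logits / uncond_logits: (vocab,) next-token logits from the
    conditional model p(y_t | y_<t, x) and an unconditional LM p(y_t | y_<t)."""
    cond_logp = torch.log_softmax(cond_logits, dim=-1)
    uncond_logp = torch.log_softmax(uncond_logits, dim=-1)

    # Uncertainty measured here (assumption) as the entropy of p(y_t | y_<t, x).
    entropy = -(cond_logp.exp() * cond_logp).sum()

    if entropy > entropy_threshold:
        # High uncertainty: PMI score log p(y_t|y_<t, x) - log p(y_t|y_<t),
        # which penalizes tokens that are likely regardless of the source.
        return cond_logp - uncond_logp
    # Otherwise keep the ordinary conditional log-probability.
    return cond_logp
```

At decode time, the returned scores would replace the plain log-probabilities when ranking candidate next tokens at the positions where the uncertainty test fires.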
This list is automatically generated from the titles and abstracts of the papers on this site.