The Curious Case of Hallucinations in Neural Machine Translation
- URL: http://arxiv.org/abs/2104.06683v1
- Date: Wed, 14 Apr 2021 08:09:57 GMT
- Title: The Curious Case of Hallucinations in Neural Machine Translation
- Authors: Vikas Raunak, Arul Menezes and Marcin Junczys-Dowmunt
- Abstract summary: Hallucinations in Neural Machine Translation lie at an extreme end on the spectrum of NMT pathologies.
We consider hallucinations under corpus-level noise (without any source perturbation) and demonstrate that two prominent types of natural hallucinations could be generated and explained through specific corpus-level noise patterns.
We elucidate the phenomenon of hallucination amplification in popular data-generation processes such as Backtranslation and sequence-level Knowledge Distillation.
- Score: 5.3180458405676205
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we study hallucinations in Neural Machine Translation (NMT),
which lie at an extreme end on the spectrum of NMT pathologies. Firstly, we
connect the phenomenon of hallucinations under source perturbation to the
Long-Tail theory of Feldman (2020), and present an empirically validated
hypothesis that explains hallucinations under source perturbation. Secondly, we
consider hallucinations under corpus-level noise (without any source
perturbation) and demonstrate that two prominent types of natural
hallucinations (detached and oscillatory outputs) could be generated and
explained through specific corpus-level noise patterns. Finally, we elucidate
the phenomenon of hallucination amplification in popular data-generation
processes such as Backtranslation and sequence-level Knowledge Distillation.
Related papers
- Knowledge Overshadowing Causes Amalgamated Hallucination in Large Language Models [65.32990889402927]
We coin this phenomenon "knowledge overshadowing".
We show that the hallucination rate grows with both the imbalance ratio and the length of dominant condition description.
We propose to utilize overshadowing conditions as a signal to catch hallucination before it is produced.
arXiv Detail & Related papers (2024-07-10T20:37:42Z)
- On Large Language Models' Hallucination with Regard to Known Facts [74.96789694959894]
Large language models are successful in answering factoid questions but are also prone to hallucination.
We investigate the phenomenon of LLMs possessing correct answer knowledge yet still hallucinating from the perspective of inference dynamics.
Our study sheds light on the reasons for LLMs' hallucinations on their known facts and, more importantly, on accurately predicting when they are hallucinating.
arXiv Detail & Related papers (2024-03-29T06:48:30Z) - Mechanistic Understanding and Mitigation of Language Model Non-Factual Hallucinations [42.46721214112836]
State-of-the-art language models (LMs) sometimes generate non-factual hallucinations that misalign with world knowledge.
We create diagnostic datasets with subject-relation queries and adapt interpretability methods to trace hallucinations through internal model representations.
arXiv Detail & Related papers (2024-03-27T00:23:03Z) - Hallucinations in Neural Automatic Speech Recognition: Identifying
Errors and Hallucinatory Models [11.492702369437785]
Hallucinations are semantically unrelated to the source utterance, yet still fluent and coherent.
We show that commonly used metrics, such as word error rates, cannot differentiate between hallucinatory and non-hallucinatory models.
We devise a framework for identifying hallucinations by analysing their semantic connection with the ground truth and their fluency.
arXiv Detail & Related papers (2024-01-03T06:56:56Z) - On Early Detection of Hallucinations in Factual Question Answering [4.76359068115052]
Hallucinations remain a major impediment to gaining user trust.
In this work, we explore if the artifacts associated with the model generations can provide hints that the generation will contain hallucinations.
Our results show that the distributions of these artifacts tend to differ between hallucinated and non-hallucinated generations.
arXiv Detail & Related papers (2023-12-19T14:35:04Z) - Understanding and Detecting Hallucinations in Neural Machine Translation
via Model Introspection [28.445196622710164]
We first identify internal model symptoms of hallucinations by analyzing the relative token contributions to the generation in contrastive hallucinated vs. non-hallucinated outputs generated via source perturbations.
We then show that these symptoms are reliable indicators of natural hallucinations, by using them to design a lightweight hallucination detector.
arXiv Detail & Related papers (2023-01-18T20:43:13Z) - Reducing Hallucinations in Neural Machine Translation with Feature
Attribution [54.46113444757899]
We present a case study focusing on model understanding and regularisation to reduce hallucinations in NMT.
We first use feature attribution methods to study the behaviour of an NMT model that produces hallucinations.
We then leverage these methods to propose a novel loss function that substantially helps reduce hallucinations and does not require retraining the model from scratch.
arXiv Detail & Related papers (2022-11-17T20:33:56Z) - Probing Causes of Hallucinations in Neural Machine Translations [51.418245676894465]
We propose to use probing methods to investigate the causes of hallucinations from the perspective of model architecture.
We find that hallucination is often accompanied by a deficient encoder, especially deficient embeddings, and by vulnerable cross-attentions.
arXiv Detail & Related papers (2022-06-25T01:57:22Z) - On Hallucination and Predictive Uncertainty in Conditional Language
Generation [76.18783678114325]
Higher predictive uncertainty corresponds to a higher chance of hallucination.
Epistemic uncertainty is more indicative of hallucination than aleatoric or total uncertainties.
The proposed beam search variant makes it possible to trade performance on standard metrics for less hallucination.
arXiv Detail & Related papers (2021-03-28T00:32:27Z)