Reducing Hallucinations in Neural Machine Translation with Feature
Attribution
- URL: http://arxiv.org/abs/2211.09878v2
- Date: Wed, 14 Jun 2023 19:36:10 GMT
- Title: Reducing Hallucinations in Neural Machine Translation with Feature
Attribution
- Authors: Jo\"el Tang, Marina Fomicheva, Lucia Specia
- Abstract summary: We present a case study focusing on model understanding and regularisation to reduce hallucinations in NMT.
We first use feature attribution methods to study the behaviour of an NMT model that produces hallucinations.
We then leverage these methods to propose a novel loss function that substantially helps reduce hallucinations and does not require retraining the model from scratch.
- Score: 54.46113444757899
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural conditional language generation models achieve state-of-the-art performance in Neural Machine Translation (NMT) but are highly dependent on the quality of the parallel training data. When trained on low-quality data, these models are prone to various error types, including hallucinations, i.e. outputs that are fluent but unrelated to the source sentences. These errors are particularly dangerous because, on the surface, the translation can be perceived as correct, especially if the reader does not understand the source language. We present a case study focusing on model understanding and regularisation to reduce hallucinations in NMT. We first use feature attribution methods to study the behaviour of an NMT model that produces hallucinations. We then leverage these methods to propose a novel loss function that substantially helps reduce hallucinations and does not require retraining the model from scratch.
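To make the feature-attribution idea concrete, here is a minimal, self-contained sketch of input-times-gradient attribution on a toy encoder-decoder: it measures how much each source token contributes to a generated target token, the kind of source-contribution signal a hallucinating model tends to lack. The toy GRU model, vocabulary sizes, and token IDs are illustrative assumptions, not the paper's actual architecture or training setup.

```python
# Minimal sketch: input-times-gradient feature attribution for a toy NMT model.
# Everything here (model, sizes, token IDs) is illustrative, not the paper's setup.
import torch
import torch.nn as nn

torch.manual_seed(0)
SRC_VOCAB, TGT_VOCAB, DIM = 100, 100, 32

class ToyNMT(nn.Module):
    """A deliberately tiny encoder-decoder stand-in for a real NMT model."""
    def __init__(self):
        super().__init__()
        self.src_emb = nn.Embedding(SRC_VOCAB, DIM)
        self.tgt_emb = nn.Embedding(TGT_VOCAB, DIM)
        self.encoder = nn.GRU(DIM, DIM, batch_first=True)
        self.decoder = nn.GRU(DIM, DIM, batch_first=True)
        self.out = nn.Linear(DIM, TGT_VOCAB)

    def forward(self, src_embedded, tgt_ids):
        _, h = self.encoder(src_embedded)              # encode the source
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), h)
        return self.out(dec_out)                       # logits per target position

model = ToyNMT()
src_ids = torch.tensor([[5, 17, 42, 8]])               # hypothetical source token IDs
tgt_ids = torch.tensor([[3, 11, 7]])                   # hypothetical target prefix

# Embed the source with gradient tracking so attributions can flow back to it.
src_embedded = model.src_emb(src_ids).detach().requires_grad_(True)
logits = model(src_embedded, tgt_ids)

# Attribute the score of the most likely token at the last target position.
pred_id = logits[0, -1].argmax()
logits[0, -1, pred_id].backward()

# Input-times-gradient attribution, aggregated over the embedding dimension.
# A uniformly low source contribution is one symptom of a hallucinated output.
attribution = (src_embedded.grad * src_embedded).sum(dim=-1).abs().squeeze(0)
print({f"src_token_{i}": round(score.item(), 4) for i, score in enumerate(attribution)})
```

In practice, attribution toolkits such as Captum provide gradient-based methods of this kind out of the box, so a sketch like the above would normally not be written by hand.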
Related papers
- Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability [83.0884072598828]
Hallucinations come in many forms, and there is no universally accepted definition.
We focus on studying only those hallucinations where a correct answer appears verbatim in the training set.
We find that for a fixed dataset, larger and longer-trained LMs hallucinate less.
While detector size improves performance on a fixed LM's outputs, we find an inverse relationship between the scale of the LM and the detectability of its hallucinations.
arXiv Detail & Related papers (2024-08-14T23:34:28Z)
- Knowledge Overshadowing Causes Amalgamated Hallucination in Large Language Models [65.32990889402927]
We coin this phenomenon "knowledge overshadowing".
We show that the hallucination rate grows with both the imbalance ratio and the length of the dominant condition description.
We propose to utilize overshadowing conditions as a signal to catch hallucination before it is produced.
arXiv Detail & Related papers (2024-07-10T20:37:42Z)
- Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback [48.065569871444275]
We propose detecting and mitigating hallucinations in Large Vision Language Models (LVLMs) via fine-grained AI feedback.
We generate a small-scale hallucination annotation dataset using proprietary models.
We then propose a detect-then-rewrite pipeline to automatically construct a preference dataset for training a hallucination-mitigating model.
arXiv Detail & Related papers (2024-04-22T14:46:10Z)
- Hallucinations in Neural Automatic Speech Recognition: Identifying Errors and Hallucinatory Models [11.492702369437785]
Hallucinations are semantically unrelated to the source utterance, yet still fluent and coherent.
We show that commonly used metrics, such as word error rates, cannot differentiate between hallucinatory and non-hallucinatory models.
We devise a framework for identifying hallucinations by analysing their semantic connection with the ground truth and their fluency.
arXiv Detail & Related papers (2024-01-03T06:56:56Z)
- Calibrated Language Models Must Hallucinate [11.891340760198798]
Recent language models generate false but plausible-sounding text with surprising frequency.
This work shows that there is an inherent statistical lower-bound on the rate that pretrained language models hallucinate certain types of facts.
For "arbitrary" facts whose veracity cannot be determined from the training data, we show that hallucinations must occur at a certain rate for language models.
arXiv Detail & Related papers (2023-11-24T18:29:50Z)
- Detecting and Mitigating Hallucinations in Machine Translation: Model Internal Workings Alone Do Well, Sentence Similarity Even Better [11.84762742895239]
We propose a method that evaluates the percentage of the source contribution to a generated translation.
This method improves detection accuracy for the most severe hallucinations by a factor of 2 and alleviates hallucinations at test time on par with the previous best approach.
Next, if we move away from internal model characteristics and allow external tools, we show that using sentence similarity from cross-lingual embeddings further improves these results (a minimal similarity-scoring sketch appears after this list).
arXiv Detail & Related papers (2022-12-16T17:24:49Z)
- Looking for a Needle in a Haystack: A Comprehensive Study of Hallucinations in Neural Machine Translation [17.102338932907294]
We lay the foundations for the study of NMT hallucinations.
We propose DeHallucinator, a simple method for alleviating hallucinations at test time.
arXiv Detail & Related papers (2022-08-10T12:44:13Z)
- Probing Causes of Hallucinations in Neural Machine Translations [51.418245676894465]
We propose to use probing methods to investigate the causes of hallucinations from the perspective of model architecture.
We find that hallucination is often accompanied by a deficient encoder, especially its embeddings, and vulnerable cross-attentions.
arXiv Detail & Related papers (2022-06-25T01:57:22Z)
- Detecting Hallucinated Content in Conditional Neural Sequence Generation [165.68948078624499]
We propose a task to predict whether each token in the output sequence is hallucinated (not contained in the input).
We also introduce a method for learning to detect hallucinations using pretrained language models fine-tuned on synthetic data.
arXiv Detail & Related papers (2020-11-05T00:18:53Z)
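Several of the related papers above score hallucinations by comparing the source and the translation directly, for instance via sentence similarity from cross-lingual embeddings. Below is a minimal sketch of that check using the publicly available LaBSE model through the sentence-transformers library; the example sentence pair and the 0.5 threshold are illustrative assumptions, not values taken from any of the papers.

```python
# Minimal sketch: cross-lingual sentence similarity as a hallucination signal.
# Assumes the sentence-transformers package; sentences and threshold are illustrative.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/LaBSE")

source = "Das Wetter ist heute schön."                   # German source sentence
translation = "The committee approved the new budget."   # suspicious English output

# Encode both sentences into the shared multilingual embedding space.
embeddings = model.encode([source, translation], convert_to_tensor=True)
similarity = util.cos_sim(embeddings[0], embeddings[1]).item()

# A low cross-lingual similarity flags the translation as a likely hallucination;
# 0.5 is an arbitrary illustrative cut-off, not a value from the papers.
if similarity < 0.5:
    print(f"Possible hallucination (similarity={similarity:.2f})")
else:
    print(f"Translation looks adequate (similarity={similarity:.2f})")
```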