Hallucinations in Large Multilingual Translation Models
- URL: http://arxiv.org/abs/2303.16104v1
- Date: Tue, 28 Mar 2023 16:17:59 GMT
- Title: Hallucinations in Large Multilingual Translation Models
- Authors: Nuno M. Guerreiro, Duarte Alves, Jonas Waldendorf, Barry Haddow,
Alexandra Birch, Pierre Colombo, André F. T. Martins
- Abstract summary: Large-scale multilingual machine translation systems have demonstrated remarkable ability to translate directly between numerous languages.
When deployed in the wild, these models may generate hallucinated translations which have the potential to severely undermine user trust and raise safety concerns.
Existing research on hallucinations has primarily focused on small bilingual models trained on high-resource languages.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large-scale multilingual machine translation systems have demonstrated
remarkable ability to translate directly between numerous languages, making
them increasingly appealing for real-world applications. However, when deployed
in the wild, these models may generate hallucinated translations which have the
potential to severely undermine user trust and raise safety concerns. Existing
research on hallucinations has primarily focused on small bilingual models
trained on high-resource languages, leaving a gap in our understanding of
hallucinations in massively multilingual models across diverse translation
scenarios. In this work, we fill this gap by conducting a comprehensive
analysis on both the M2M family of conventional neural machine translation
models and ChatGPT, a general-purpose large language model (LLM) that can be
prompted for translation. Our investigation covers a broad spectrum of
conditions, spanning over 100 translation directions across various resource
levels and going beyond English-centric language pairs. We provide key insights
regarding the prevalence, properties, and mitigation of hallucinations, paving
the way towards more responsible and reliable machine translation systems.
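In this context, a hallucination is a translation that is detached from, or only loosely related to, the source. One symptom the literature frequently flags is oscillatory output, i.e. degenerate n-gram loops. The sketch below implements that repeated-n-gram heuristic in minimal form; the threshold and helper names are illustrative assumptions, not this paper's exact detection pipeline.

```python
# A minimal sketch of a repeated-n-gram heuristic for flagging oscillatory
# hallucinations (degenerate looping output). The threshold and helper names
# are illustrative assumptions, not the paper's exact detector.
from collections import Counter

def top_ngram_count(tokens: list[str], n: int = 4) -> int:
    """Count of the single most frequent n-gram in the output."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return max(Counter(ngrams).values()) if ngrams else 0

def looks_oscillatory(translation: str, n: int = 4, threshold: int = 3) -> bool:
    # Flag outputs in which some n-gram repeats suspiciously often.
    return top_ngram_count(translation.split(), n) >= threshold

print(looks_oscillatory("the cat sat on the mat"))                     # False
print(looks_oscillatory("so so so so so so so so so so so so so so"))  # True
```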
Related papers
- Multilingual Hallucination Gaps in Large Language Models (arXiv 2024-10-23)
We study the phenomenon of hallucinations across multiple languages in free-form text generation, focusing on multilingual hallucination gaps.
These gaps reflect differences in the frequency of hallucinated answers depending on the prompt and language used.
Our results reveal variations in hallucination rates, especially between high- and low-resource languages.
- Mitigating Multilingual Hallucination in Large Vision-Language Models (arXiv 2024-08-01)
We propose a two-stage Multilingual Hallucination Removal (MHR) framework for Large Vision-Language Models (LVLMs).
Instead of relying on the intricate manual annotations of multilingual resources, we propose a novel cross-lingual alignment method.
Our framework delivers an average increase of 19.0% in accuracy across 13 different languages.
- Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models (arXiv 2024-06-23)
Large language models (LLMs) are typically multilingual due to pretraining on diverse multilingual corpora.
But can these models relate corresponding concepts across languages, effectively being crosslingual?
This study evaluates six state-of-the-art LLMs on inherently crosslingual tasks.
- Mitigating Hallucinations and Off-target Machine Translation with Source-Contrastive and Language-Contrastive Decoding (arXiv 2023-09-13)
We introduce two related methods to mitigate failure cases with a modified decoding objective (a toy sketch of the source-contrastive idea appears after this list).
Experiments on the massively multilingual models M2M-100 (418M) and SMaLL-100 show that these methods suppress hallucinations and off-target translations.
- Searching for Needles in a Haystack: On the Role of Incidental Bilingualism in PaLM's Translation Capability (arXiv 2023-05-17)
We investigate the role of incidental bilingualism in large language models.
We show that PaLM is exposed to over 30 million translation pairs across at least 44 languages.
We show that its presence has a substantial impact on translation capabilities, although this impact diminishes with model scale.
- Informative Language Representation Learning for Massively Multilingual Neural Machine Translation (arXiv 2022-09-04)
In a multilingual neural machine translation model, an artificial language token is usually used to guide translation into the desired target language (the tagging convention is sketched after this list).
Recent studies show that prepending language tokens sometimes fails to steer multilingual neural machine translation models toward the right translation direction.
We propose two methods, language embedding embodiment and language-aware multi-head attention, to learn informative language representations that channel translation in the right direction.
- Towards the Next 1000 Languages in Multilingual Machine Translation: Exploring the Synergy Between Supervised and Self-Supervised Learning (arXiv 2022-01-09)
We present a pragmatic approach towards building a multilingual machine translation model that covers hundreds of languages.
We use a mixture of supervised and self-supervised objectives, depending on the data availability for different language pairs (a schematic of this mixing follows the list).
We demonstrate that the synergy between these two training paradigms enables the model to produce high-quality translations in the zero-resource setting.
- Bridging Linguistic Typology and Multilingual Machine Translation with Multi-View Language Representations (arXiv 2020-04-30)
We use singular vector canonical correlation analysis (SVCCA) to study what kind of information is induced from each source (a minimal SVCCA sketch appears after this list).
We observe that our representations embed typology and strengthen correlations with language relationships.
We then take advantage of our multi-view language vector space for multilingual machine translation, where we achieve competitive overall translation accuracy.
- Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation (arXiv 2020-04-24)
Massively multilingual models for neural machine translation (NMT) are theoretically attractive, but often underperform bilingual models and deliver poor zero-shot translations.
We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics.
We propose random online backtranslation to enforce the translation of unseen training language pairs (sketched in schematic form after this list).
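As referenced above, here is a toy sketch of the idea behind source-contrastive decoding: a hypothesis is reranked by how much more probable it is under the true source than under a random contrastive source, which penalizes source-agnostic (hallucinated) outputs. The SCORES table fakes a model's log-probabilities purely for illustration; the real method applies the contrast during beam search.

```python
# Toy sketch of source-contrastive reranking. SCORES fakes a model's
# log P(hypothesis | source); a real implementation would query the NMT model.
FAITHFUL, HALLUC = "the house is big", "la la la la la"
SRC, RANDOM_SRC = "das Haus ist groß", "ein ganz anderer Satz"

SCORES = {
    (SRC, FAITHFUL): -1.0,
    (RANDOM_SRC, FAITHFUL): -9.0,  # faithful output needs its real source
    (SRC, HALLUC): -0.5,           # a fluent hallucination can score high...
    (RANDOM_SRC, HALLUC): -0.6,    # ...but is nearly source-agnostic
}

def contrastive_score(src: str, contrast_src: str, hyp: str, lam: float = 0.5) -> float:
    # Reward dependence on the true source; penalize source-agnostic outputs.
    return SCORES[(src, hyp)] - lam * SCORES[(contrast_src, hyp)]

print(max([FAITHFUL, HALLUC], key=lambda h: SCORES[(SRC, h)]))
# plain log-prob picks the hallucination
print(max([FAITHFUL, HALLUC], key=lambda h: contrastive_score(SRC, RANDOM_SRC, h)))
# the contrastive objective picks the faithful translation
```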
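The language-token convention discussed in the informative-language-representation entry fits in a few lines. The `<2xx>` tag format is one common convention (an assumption here); the paper's motivation is precisely that this lone tag is sometimes too weak a signal to fix the translation direction.

```python
# Minimal sketch of the artificial target-language token used by multilingual
# NMT models: a tag prepended to the source tells one shared model which
# language to emit. The "<2xx>" format is one common convention.
def tag_source(source_tokens: list[str], target_lang: str) -> list[str]:
    return [f"<2{target_lang}>"] + source_tokens

print(tag_source("the house is big".split(), "fr"))
# ['<2fr>', 'the', 'house', 'is', 'big']
```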
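The supervised/self-supervised mixing in the next-1000-languages entry can be pictured as per-example objective selection: parallel pairs get the supervised translation objective, while monolingual-only text gets a denoising (reconstruction) objective. The span-masking corruption below is a generic MASS/BART-style placeholder, not the paper's exact recipe.

```python
# Schematic mixing of objectives by data availability. The corruption helper
# is a generic span-masking placeholder standing in for the paper's
# self-supervised task.
import random

def corrupt(tokens: list[str], mask_ratio: float = 0.35) -> list[str]:
    """Mask a random contiguous span; the model must reconstruct the original."""
    n = max(1, int(len(tokens) * mask_ratio))
    start = random.randrange(len(tokens) - n + 1)
    return tokens[:start] + ["<mask>"] * n + tokens[start + n:]

def make_example(record: dict) -> tuple[list[str], list[str]]:
    if record.get("tgt"):                        # parallel data available
        return record["src"], record["tgt"]      # supervised translation
    return corrupt(record["mono"]), record["mono"]  # self-supervised denoising

print(make_example({"src": ["hallo"], "tgt": ["hello"]}))
print(make_example({"mono": "a b c d e f".split()}))
```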
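For the multi-view language representation entry, a minimal SVCCA sketch: each view is reduced via SVD, then the canonical correlations between the reduced views are computed (here via QR and SVD) and averaged. The random matrices stand in for two views of per-language vectors, e.g. typological features versus learned NMT embeddings.

```python
# Minimal SVCCA: SVD-reduce each representation matrix, then compute canonical
# correlations between the reduced views. Inputs are random stand-ins.
import numpy as np

def svcca(X: np.ndarray, Y: np.ndarray, k: int = 10) -> float:
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    Ux, Sx, _ = np.linalg.svd(X, full_matrices=False)
    Uy, Sy, _ = np.linalg.svd(Y, full_matrices=False)
    Xr, Yr = Ux[:, :k] * Sx[:k], Uy[:, :k] * Sy[:k]     # top-k directions
    Qx, _ = np.linalg.qr(Xr)
    Qy, _ = np.linalg.qr(Yr)
    corrs = np.linalg.svd(Qx.T @ Qy, compute_uv=False)  # canonical correlations
    return float(corrs.mean())

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 64))  # e.g. typology-based language vectors
Y = rng.standard_normal((100, 32))  # e.g. NMT-learned language embeddings
print(svcca(X, Y))
```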
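Finally, random online backtranslation from the zero-shot entry, in schematic form: the target side of a training pair is translated by the current model into a randomly sampled language, and the resulting synthetic pair trains an otherwise unseen (zero-shot) direction. The model interface below is a hypothetical stub, not the authors' code.

```python
# Schematic random online backtranslation (ROBT). StubModel is a hypothetical
# placeholder; a real implementation would call the NMT model being trained.
import random

LANGS = ["de", "fr", "zh", "sw"]

class StubModel:
    def translate(self, text: str, to_lang: str) -> str:
        return f"[{to_lang}] {text}"  # fake translation for illustration
    def train_on(self, src: str, tgt: str, src_lang: str, tgt_lang: str) -> None:
        print(f"train {src_lang}->{tgt_lang}: {src!r} -> {tgt!r}")

def robt_step(model, tgt_sentence: str, tgt_lang: str) -> None:
    # Sample an auxiliary language and back-translate the target side online.
    aux = random.choice([l for l in LANGS if l != tgt_lang])
    synthetic_src = model.translate(tgt_sentence, to_lang=aux)
    # Train on the synthetic pair: aux -> tgt_lang, a possibly unseen direction.
    model.train_on(synthetic_src, tgt_sentence, src_lang=aux, tgt_lang=tgt_lang)

robt_step(StubModel(), "das Haus ist groß", "de")
```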
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.