Towards Reliable Neural Machine Translation with Consistency-Aware
Meta-Learning
- URL: http://arxiv.org/abs/2303.10966v2
- Date: Tue, 19 Sep 2023 07:37:31 GMT
- Title: Towards Reliable Neural Machine Translation with Consistency-Aware
Meta-Learning
- Authors: Rongxiang Weng, Qiang Wang, Wensen Cheng, Changfeng Zhu, Min Zhang
- Abstract summary: Current neural machine translation (NMT) systems suffer from a lack of reliability.
We present a consistency-aware meta-learning (CAML) framework derived from the model-agnostic meta-learning (MAML) algorithm to address it.
We conduct experiments on the NIST Chinese to English task, three WMT translation tasks, and the TED M2O task.
- Score: 24.64700139151659
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural machine translation (NMT) has achieved remarkable success in producing
high-quality translations. However, current NMT systems suffer from a lack of
reliability, as their outputs are often affected by lexical or syntactic
changes in the inputs, resulting in large variations in quality. This limitation
hinders the practicality and trustworthiness of NMT. A contributing factor to
this problem is that NMT models trained with the one-to-one paradigm struggle
to handle the source diversity phenomenon, where inputs with the same meaning
can be expressed differently. In this work, we treat this problem as a bilevel
optimization problem and present a consistency-aware meta-learning (CAML)
framework derived from the model-agnostic meta-learning (MAML) algorithm to
address it. Specifically, the NMT model with CAML (named CoNMT) first learns a
consistent meta representation of semantically equivalent sentences in the
outer loop. Subsequently, a mapping from the meta representation to the output
sentence is learned in the inner loop, allowing the NMT model to translate
semantically equivalent sentences to the same target sentence. We conduct
experiments on the NIST Chinese to English task, three WMT translation tasks,
and the TED M2O task. The results demonstrate that CoNMT effectively improves
overall translation quality and reliably handles diverse inputs.
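To make the bilevel optimization concrete, the following is a minimal, first-order sketch of one CAML-style training step in PyTorch. The toy model, the mean-pooled sentence representation, the MSE consistency term, and the names ToyNMT and caml_step are illustrative assumptions rather than the authors' CoNMT implementation; the sketch only shows how an outer consistency objective over semantically equivalent sources can be combined with an inner step that learns the mapping from the shared representation to the target sentence.

```python
# A minimal, first-order sketch of a consistency-aware meta-learning step.
# ToyNMT, caml_step, the mean-pooled representation, and the MSE consistency
# term are illustrative assumptions, not the authors' CoNMT code.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyNMT(nn.Module):
    """Toy encoder-decoder standing in for an NMT model."""
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab)

    def encode(self, src):
        # Sentence-level "meta representation" of the source (mean-pooled states).
        h, _ = self.encoder(self.embed(src))
        return h.mean(dim=1)

    def decode_loss(self, rep, tgt):
        # Mapping from the meta representation to the output sentence.
        dec_in = self.embed(tgt[:, :-1]) + rep.unsqueeze(1)
        h, _ = self.decoder(dec_in)
        logits = self.out(h)
        return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                               tgt[:, 1:].reshape(-1))

def caml_step(model, opt, src_a, src_b, tgt, inner_lr=1e-3):
    """One first-order CAML-style update on a pair of semantically
    equivalent sources (src_a, src_b) that share the target tgt."""
    # Outer objective: pull the representations of the two paraphrases together.
    consistency = F.mse_loss(model.encode(src_a), model.encode(src_b))

    # Inner loop: adapt a copy of the model on the translation objective.
    fast = copy.deepcopy(model)
    fast_params = list(fast.parameters())
    inner_loss = fast.decode_loss(fast.encode(src_a), tgt)
    grads = torch.autograd.grad(inner_loss, fast_params)
    with torch.no_grad():
        for p, g in zip(fast_params, grads):
            p -= inner_lr * g

    # Outer translation loss of the adapted model on the other paraphrase.
    outer_loss = fast.decode_loss(fast.encode(src_b), tgt)
    outer_grads = torch.autograd.grad(outer_loss, fast_params)

    # First-order meta update: consistency gradient plus the adapted model's
    # translation gradient, applied to the original parameters.
    opt.zero_grad()
    consistency.backward()
    with torch.no_grad():
        for p, g in zip(model.parameters(), outer_grads):
            p.grad = g.clone() if p.grad is None else p.grad + g
    opt.step()
    return (outer_loss + consistency).item()

# Example usage on random token ids (hypothetical data):
# model = ToyNMT(); opt = torch.optim.Adam(model.parameters(), lr=1e-4)
# src_a, src_b = torch.randint(0, 1000, (8, 12)), torch.randint(0, 1000, (8, 12))
# tgt = torch.randint(0, 1000, (8, 10))
# loss = caml_step(model, opt, src_a, src_b, tgt)
```

The deepcopy-based update is a first-order simplification of the second-order MAML gradient; it is kept here only to keep the sketch short and self-contained.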
Related papers
- TasTe: Teaching Large Language Models to Translate through Self-Reflection [82.83958470745381]
Large language models (LLMs) have exhibited remarkable performance in various natural language processing tasks.
We propose the TasTe framework, which stands for translating through self-reflection.
The evaluation results in four language directions on the WMT22 benchmark reveal the effectiveness of our approach compared to existing methods.
arXiv Detail & Related papers (2024-06-12T17:21:21Z)
- An Empirical study of Unsupervised Neural Machine Translation: analyzing NMT output, model's behavior and sentences' contribution [5.691028372215281]
Unsupervised Neural Machine Translation (UNMT) focuses on improving NMT results under the assumption that there is no human-translated parallel data.
We focus on three very diverse languages, French, Gujarati, and Kazakh, and train bilingual NMT models, to and from English, with various levels of supervision.
arXiv Detail & Related papers (2023-12-19T20:35:08Z)
- Improving Machine Translation with Large Language Models: A Preliminary Study with Cooperative Decoding [73.32763904267186]
Large Language Models (LLMs) present the potential for achieving superior translation quality.
We propose Cooperative Decoding (CoDec), which treats NMT systems as a pretranslation model and MT-oriented LLMs as a supplemental solution.
arXiv Detail & Related papers (2023-11-06T03:41:57Z)
- Integrating Pre-trained Language Model into Neural Machine Translation [0.0]
The scarcity of high-quality bilingual parallel data poses a major challenge to improving NMT performance.
Recent studies have been exploring the use of contextual information from pre-trained language models (PLMs) to address this problem.
This study proposes a PLM-integrated NMT model to overcome the identified problems.
arXiv Detail & Related papers (2023-10-30T16:00:13Z)
- Towards Effective Disambiguation for Machine Translation with Large Language Models [65.80775710657672]
We study the capabilities of large language models to translate "ambiguous sentences".
Experiments show that our methods can match or outperform state-of-the-art systems such as DeepL and NLLB in four out of five language directions.
arXiv Detail & Related papers (2023-09-20T22:22:52Z)
- Exploiting Language Relatedness in Machine Translation Through Domain Adaptation Techniques [3.257358540764261]
We present a novel approach that uses a scaled similarity score between sentences, especially for related languages, based on a 5-gram KenLM language model.
Our approach yields gains of 2 BLEU points with the multi-domain approach, 3 BLEU points with fine-tuning for NMT, and 2 BLEU points with iterative back-translation.
arXiv Detail & Related papers (2023-03-03T09:07:30Z)
- Tackling Ambiguity with Images: Improved Multimodal Machine Translation and Contrastive Evaluation [72.6667341525552]
We present a new MMT approach based on a strong text-only MT model, which uses neural adapters and a novel guided self-attention mechanism.
We also introduce CoMMuTE, a Contrastive Multimodal Translation Evaluation set of ambiguous sentences and their possible translations.
Our approach obtains competitive results compared to strong text-only models on standard English-to-French, English-to-German and English-to-Czech benchmarks.
arXiv Detail & Related papers (2022-12-20T10:18:18Z)
- Language Modeling, Lexical Translation, Reordering: The Training Process of NMT through the Lens of Classical SMT [64.1841519527504]
Neural machine translation uses a single neural network to model the entire translation process.
Despite neural machine translation being the de facto standard, it is still not clear how NMT models acquire different competences over the course of training.
arXiv Detail & Related papers (2021-09-03T09:38:50Z)
- Prevent the Language Model from being Overconfident in Neural Machine Translation [21.203435303812043]
We propose a Margin-based Token-level Objective (MTO) and a Margin-based Sentence-level Objective (MSO) to maximize the margin, preventing the LM from being overconfident.
Experiments on WMT14 English-to-German, WMT19 Chinese-to-English, and WMT14 English-to-French translation tasks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2021-05-24T05:34:09Z)
- Explicit Reordering for Neural Machine Translation [50.70683739103066]
In Transformer-based neural machine translation (NMT), the positional encoding mechanism helps the self-attention networks to learn the source representation with order dependency.
We propose a novel reordering method to explicitly model this reordering information for the Transformer-based NMT.
The empirical results on the WMT14 English-to-German, WAT ASPEC Japanese-to-English, and WMT17 Chinese-to-English translation tasks show the effectiveness of the proposed approach.
arXiv Detail & Related papers (2020-04-08T05:28:46Z)