Dual Past and Future for Neural Machine Translation
- URL: http://arxiv.org/abs/2007.07728v2
- Date: Fri, 17 Jul 2020 03:51:43 GMT
- Title: Dual Past and Future for Neural Machine Translation
- Authors: Jianhao Yan, Fandong Meng, Jie Zhou
- Abstract summary: We present a novel dual framework that leverages both source-to-target and target-to-source NMT models to provide a more direct and accurate supervision signal for the Past and Future modules.
Experimental results demonstrate that our proposed method significantly improves the adequacy of NMT predictions and surpasses previous methods in two well-studied translation tasks.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Though Neural Machine Translation (NMT) has achieved remarkable
successes in recent years, it still suffers from the inadequate-translation
problem. Previous studies show that explicitly modeling the Past and Future
contents of the source sentence benefits translation performance. However, it
remains unclear whether the commonly used heuristic objective is good enough
to guide the Past and Future modules. In this paper, we present a novel dual
framework that leverages both source-to-target and target-to-source NMT models
to provide a more direct and accurate supervision signal for the Past and
Future modules. Experimental results demonstrate that our proposed method
significantly improves the adequacy of NMT predictions and surpasses previous
methods on two well-studied translation tasks.
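To make the Past/Future mechanism concrete, below is a minimal PyTorch sketch of how such content trackers and a dual-style supervision term could be wired up. This is an illustration under assumptions, not the paper's released code: `PastFutureTracker`, `dual_supervision_loss`, and the `reverse_targets` tensor are hypothetical names, and in the actual framework the supervision signal would come from a trained target-to-source NMT model rather than random data.

```python
# Hypothetical sketch of Past/Future content trackers for an NMT decoder.
# Assumes a decoder that exposes per-step attention contexts; all names
# here are illustrative, not the paper's actual implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PastFutureTracker(nn.Module):
    """Tracks translated (Past) and untranslated (Future) source content
    across decoding steps, in the spirit of Past/Future modeling for NMT."""

    def __init__(self, hidden_size: int):
        super().__init__()
        # Past accumulates consumed source content; Future drains it.
        self.past_rnn = nn.GRUCell(hidden_size, hidden_size)
        self.future_rnn = nn.GRUCell(hidden_size, hidden_size)

    def forward(self, contexts: torch.Tensor, src_summary: torch.Tensor):
        """contexts: (T, B, H) per-step attention contexts from the decoder.
        src_summary: (B, H) encoding of the full source sentence."""
        past = torch.zeros_like(src_summary)  # nothing translated yet
        future = src_summary                  # everything still untranslated
        past_states, future_states = [], []
        for c_t in contexts:                  # one decoding step at a time
            past = self.past_rnn(c_t, past)
            future = self.future_rnn(c_t, future)
            past_states.append(past)
            future_states.append(future)
        return torch.stack(past_states), torch.stack(future_states)


def dual_supervision_loss(future_states: torch.Tensor,
                          reverse_targets: torch.Tensor) -> torch.Tensor:
    """Regress each Future state toward an encoding of the source content
    still untranslated at that step (a stand-in for the more direct signal
    a target-to-source model would provide)."""
    return F.mse_loss(future_states, reverse_targets)


# Toy usage: T=5 decoding steps, batch of 2, hidden size 8.
tracker = PastFutureTracker(hidden_size=8)
contexts = torch.randn(5, 2, 8)
src_summary = torch.randn(2, 8)
past, future = tracker(contexts, src_summary)
loss = dual_supervision_loss(future, torch.randn(5, 2, 8))
```

The intuition behind this design, as the abstract suggests, is that the Future state starts as the full source encoding and should drain toward empty as decoding proceeds; a per-step regression target derived from a reverse model can supervise that trajectory more directly than a heuristic objective.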
Related papers
- TasTe: Teaching Large Language Models to Translate through Self-Reflection
Large language models (LLMs) have exhibited remarkable performance in various natural language processing tasks.
We propose the TasTe framework, which stands for translating through self-reflection.
The evaluation results in four language directions on the WMT22 benchmark reveal the effectiveness of our approach compared to existing methods.
arXiv Detail & Related papers (2024-06-12T17:21:21Z)
- An Empirical study of Unsupervised Neural Machine Translation: analyzing NMT output, model's behavior and sentences' contribution
Unsupervised Neural Machine Translation (UNMT) focuses on improving NMT results under the assumption there is no human translated parallel data.
We focus on three very diverse languages, French, Gujarati, and Kazakh, and train bilingual NMT models, to and from English, with various levels of supervision.
arXiv Detail & Related papers (2023-12-19T20:35:08Z)
- Extending Multilingual Machine Translation through Imitation Learning
Imit-MNMT treats the task as an imitation learning process, which mimics the behavior of an expert.
We show that our approach significantly improves the translation performance between the new and the original languages.
We also demonstrate that our approach is capable of solving copy and off-target problems.
arXiv Detail & Related papers (2023-11-14T21:04:03Z)
- Towards Effective Disambiguation for Machine Translation with Large Language Models
We study the capabilities of large language models to translate "ambiguous sentences".
Experiments show that our methods can match or outperform state-of-the-art systems such as DeepL and NLLB in four out of five language directions.
arXiv Detail & Related papers (2023-09-20T22:22:52Z)
- Source and Target Bidirectional Knowledge Distillation for End-to-end Speech Translation
We focus on sequence-level knowledge distillation (SeqKD) from external text-based NMT models.
We train a bilingual E2E-ST model to predict paraphrased transcriptions as an auxiliary task with a single decoder.
arXiv Detail & Related papers (2021-04-13T19:00:51Z)
- PheMT: A Phenomenon-wise Dataset for Machine Translation Robustness on User-Generated Contents
We present a new dataset, PheMT, for evaluating the robustness of MT systems against specific linguistic phenomena in Japanese-English translation.
Our experiments with the created dataset revealed that not only our in-house models but even widely used off-the-shelf systems are greatly disturbed by the presence of certain phenomena.
arXiv Detail & Related papers (2020-11-04T04:44:47Z)
- Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation
Document-level machine translation incorporates inter-sentential dependencies into the translation of a source sentence.
We propose a new framework to model cross-sentence dependencies by training neural machine translation (NMT) to predict both the target translation and surrounding sentences of a source sentence.
arXiv Detail & Related papers (2020-03-30T03:38:01Z)
- A Comprehensive Survey of Multilingual Neural Machine Translation
We present a survey on multilingual neural machine translation (MNMT).
MNMT is more promising than its statistical machine translation counterpart because end-to-end modeling and distributed representations open new avenues for research on machine translation.
We first categorize various approaches based on their central use-case and then further categorize them based on resource scenarios, underlying modeling principles, core-issues and challenges.
arXiv Detail & Related papers (2020-01-04T19:38:00Z)