Challenges in Context-Aware Neural Machine Translation
- URL: http://arxiv.org/abs/2305.13751v2
- Date: Mon, 23 Oct 2023 21:01:26 GMT
- Title: Challenges in Context-Aware Neural Machine Translation
- Authors: Linghao Jin, Jacqueline He, Jonathan May, Xuezhe Ma
- Abstract summary: Context-aware neural machine translation involves leveraging information beyond sentence-level context to resolve discourse dependencies.
Despite well-reasoned intuitions, most context-aware translation models show only modest improvements over sentence-level systems.
We propose a more realistic setting for document-level translation, called paragraph-to-paragraph (para2para) translation.
- Score: 39.89082986080746
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Context-aware neural machine translation involves leveraging information
beyond sentence-level context to resolve inter-sentential discourse
dependencies and improve document-level translation quality, and has given rise
to a number of recent techniques. However, despite well-reasoned intuitions,
most context-aware translation models show only modest improvements over
sentence-level systems. In this work, we investigate several challenges that
impede progress within this field, relating to discourse phenomena, context
usage, model architectures, and document-level evaluation. To address these
problems, we propose a more realistic setting for document-level translation,
called paragraph-to-paragraph (para2para) translation, and collect a new
dataset of Chinese-English novels to promote future research.
Related papers
- Towards Chapter-to-Chapter Context-Aware Literary Translation via Large Language Models [16.96647110733261]
Discourse phenomena in existing document-level translation datasets are sparse.
Most existing document-level corpora and context-aware machine translation methods rely on an unrealistic assumption of sentence-level alignments.
We propose a more pragmatic and challenging setting for context-aware translation, termed chapter-to-chapter (Ch2Ch) translation.
arXiv Detail & Related papers (2024-07-12T04:18:22Z)
- Context-aware Neural Machine Translation for English-Japanese Business Scene Dialogues [14.043741721036543]
This paper explores how context-awareness can improve the performance of current Neural Machine Translation (NMT) models for English-Japanese business dialogue translation.
We propose novel context tokens encoding extra-sentential information, such as speaker turn and scene type.
We find that models leverage both preceding sentences and extra-sentential context (with CXMI increasing with context size) and we provide a more focused analysis on honorifics translation.
arXiv Detail & Related papers (2023-11-20T18:06:03Z)
- Improving Long Context Document-Level Machine Translation [51.359400776242786]
Document-level context for neural machine translation (NMT) is crucial to improve translation consistency and cohesion.
Many works have been published on the topic of document-level NMT, but most restrict the system to just local context.
We propose a constrained attention variant that focuses the attention on the most relevant parts of the sequence, while simultaneously reducing the memory consumption.
arXiv Detail & Related papers (2023-06-08T13:28:48Z)
- HanoiT: Enhancing Context-aware Translation via Selective Context [95.93730812799798]
Context-aware neural machine translation aims to use the document-level context to improve translation quality.
Irrelevant or trivial words may introduce noise and distract the model from learning the relationship between the current sentence and the auxiliary context.
We propose a novel end-to-end encoder-decoder model with a layer-wise selection mechanism to sift and refine the long document context.
arXiv Detail & Related papers (2023-01-17T12:07:13Z)
- SMDT: Selective Memory-Augmented Neural Document Translation [53.4627288890316]
We propose a Selective Memory-augmented Neural Document Translation model to deal with documents containing a large hypothesis space of context.
We retrieve similar bilingual sentence pairs from the training corpus to augment global context.
We extend the two-stream attention model with a selective mechanism to capture local context and diverse global contexts.
arXiv Detail & Related papers (2022-01-05T14:23:30Z)
- When Does Translation Require Context? A Data-driven, Multilingual Exploration [71.43817945875433]
Proper handling of discourse significantly contributes to the quality of machine translation (MT).
Recent works in context-aware MT attempt to target a small set of discourse phenomena during evaluation.
We develop the Multilingual Discourse-Aware benchmark, a series of taggers that identify and evaluate model performance on discourse phenomena.
arXiv Detail & Related papers (2021-09-15T17:29:30Z)
- Contrastive Learning for Context-aware Neural Machine Translation Using Coreference Information [14.671424999873812]
We propose CorefCL, a novel data augmentation and contrastive learning scheme based on coreference between the source and contextual sentences.
By corrupting automatically detected coreference mentions in the contextual sentence, CorefCL can train the model to be sensitive to coreference inconsistency.
In experiments, our method consistently improved BLEU of compared models on English-German and English-Korean tasks.
arXiv Detail & Related papers (2021-09-13T05:18:47Z)
- Measuring and Increasing Context Usage in Context-Aware Machine Translation [64.5726087590283]
We introduce a new metric, conditional cross-mutual information, to quantify the usage of context by machine translation models.
We then introduce a new, simple training method, context-aware word dropout, to increase the usage of context by context-aware models.
arXiv Detail & Related papers (2021-05-07T19:55:35Z)
- Context-aware Decoder for Neural Machine Translation using a Target-side Document-Level Language Model [12.543106304662059]
We present a method to turn a sentence-level translation model into a context-aware model by incorporating a document-level language model into the decoder.
Our decoder is built using only sentence-level parallel corpora and monolingual corpora.
From a theoretical viewpoint, the core part of this work is the novel representation of contextual information using point-wise mutual information between context and the current sentence.
arXiv Detail & Related papers (2020-10-24T08:06:18Z)