Improving Long Context Document-Level Machine Translation
- URL: http://arxiv.org/abs/2306.05183v1
- Date: Thu, 8 Jun 2023 13:28:48 GMT
- Title: Improving Long Context Document-Level Machine Translation
- Authors: Christian Herold and Hermann Ney
- Abstract summary: Document-level context for neural machine translation (NMT) is crucial to improve translation consistency and cohesion.
Many works have been published on the topic of document-level NMT, but most restrict the system to just local context.
We propose a constrained attention variant that focuses the attention on the most relevant parts of the sequence, while simultaneously reducing the memory consumption.
- Score: 51.359400776242786
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Document-level context for neural machine translation (NMT) is crucial to
improve the translation consistency and cohesion, the translation of ambiguous
inputs, as well as several other linguistic phenomena. Many works have been
published on the topic of document-level NMT, but most restrict the system to
only local context, typically including just the one or two preceding sentences
as additional information. This might be enough to resolve some ambiguous
inputs, but it is probably not sufficient to capture some document-level
information like the topic or style of a conversation. When increasing the
context size beyond just the local context, there are two challenges: (i) the
memory usage increases exponentially and (ii) the translation performance
starts to degrade. We argue that the widely-used attention mechanism is
responsible for both issues. Therefore, we propose a constrained attention
variant that focuses the attention on the most relevant parts of the sequence,
while simultaneously reducing the memory consumption. For evaluation, we
utilize targeted test sets in combination with novel evaluation techniques to
analyze the translations with regard to specific discourse-related phenomena. We
find that our approach is a good compromise between sentence-level NMT and
attending to the full context, especially in low-resource scenarios.
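To make the memory argument concrete, below is a minimal sketch of one constrained-attention variant in this spirit: hard top-k selection, where each query position attends only to its k highest-scoring keys. This is an illustration of the general idea, not the authors' exact mechanism; the function name, the value of k, and the shapes are assumptions.

```python
# Minimal sketch of top-k constrained attention (illustrative, not the
# paper's exact variant). Each query keeps only its k_top highest-scoring
# keys; all other positions are masked out before the softmax.
import torch
import torch.nn.functional as F

def topk_attention(q, k, v, k_top=16):
    """q: (T_q, d); k, v: (T_kv, d). Hypothetical helper for illustration."""
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5   # (T_q, T_kv)
    # smallest score that still makes the per-query top-k
    topk_vals, _ = scores.topk(min(k_top, scores.shape[-1]), dim=-1)
    threshold = topk_vals[..., -1:]
    # mask everything below the threshold (ties at the threshold survive)
    masked = scores.masked_fill(scores < threshold, float("-inf"))
    weights = F.softmax(masked, dim=-1)   # exactly zero outside the top-k
    return weights @ v

q = torch.randn(4, 64)     # a few query positions
kv = torch.randn(500, 64)  # long document-level context
print(topk_attention(q, kv, kv).shape)  # torch.Size([4, 64])
```

Note that this sketch still materializes the full score matrix, so it only demonstrates the selection behavior; realizing the memory savings the abstract refers to would require computing the scores block-wise so that discarded positions are never stored.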
Related papers
- Towards Effective Disambiguation for Machine Translation with Large Language Models [65.80775710657672]
We study the capabilities of large language models to translate "ambiguous sentences".
Experiments show that our methods can match or outperform state-of-the-art systems such as DeepL and NLLB in four out of five language directions.
arXiv Detail & Related papers (2023-09-20T22:22:52Z)
- On Search Strategies for Document-Level Neural Machine Translation [51.359400776242786]
Document-level neural machine translation (NMT) models produce a more consistent output across a document.
In this work, we aim to answer the question of how to best utilize a context-aware translation model in decoding.
arXiv Detail & Related papers (2023-06-08T11:30:43Z)
- Discourse Centric Evaluation of Machine Translation with a Densely Annotated Parallel Corpus [82.07304301996562]
This paper presents a new dataset with rich discourse annotations, built upon the large-scale parallel corpus BWB introduced in Jiang et al.
We investigate the similarities and differences between the discourse structures of source and target languages.
We discover that MT outputs differ fundamentally from human translations in terms of their latent discourse structures.
arXiv Detail & Related papers (2023-05-18T17:36:41Z)
- SMDT: Selective Memory-Augmented Neural Document Translation [53.4627288890316]
We propose a Selective Memory-augmented Neural Document Translation model to deal with documents containing a large hypothesis space of context.
We retrieve similar bilingual sentence pairs from the training corpus to augment global context (a sketch of this retrieval step appears after this list).
We extend the two-stream attention model with a selective mechanism to capture local context and diverse global contexts.
arXiv Detail & Related papers (2022-01-05T14:23:30Z)
- When Does Translation Require Context? A Data-driven, Multilingual Exploration [71.43817945875433]
Proper handling of discourse significantly contributes to the quality of machine translation (MT).
Recent works in context-aware MT attempt to target a small set of discourse phenomena during evaluation.
We develop the Multilingual Discourse-Aware benchmark, a series of taggers that identify and evaluate model performance on discourse phenomena.
arXiv Detail & Related papers (2021-09-15T17:29:30Z)
- Contrastive Learning for Context-aware Neural Machine Translation Using Coreference Information [14.671424999873812]
We propose CorefCL, a novel data augmentation and contrastive learning scheme based on coreference between the source and contextual sentences.
By corrupting automatically detected coreference mentions in the contextual sentence, CorefCL can train the model to be sensitive to coreference inconsistency (a sketch of this corruption step appears after this list).
In experiments, our method consistently improved the BLEU scores of the compared models on English-German and English-Korean tasks.
arXiv Detail & Related papers (2021-09-13T05:18:47Z)
- Document-aligned Japanese-English Conversation Parallel Corpus [4.793904440030568]
Sentence-level (SL) machine translation (MT) has reached acceptable quality for many high-resource languages, but document-level (DL) MT has not.
We present a document-aligned Japanese-English conversation corpus, including balanced, high-quality business conversation data for tuning and testing.
We train MT models using our corpus to demonstrate how using context leads to improvements.
arXiv Detail & Related papers (2020-12-11T06:03:33Z)
- Diving Deep into Context-Aware Neural Machine Translation [36.17847243492193]
This paper analyzes the performance of document-level NMT models on four diverse domains.
We find that there is no single best approach to document-level NMT, but rather that different architectures come out on top on different tasks.
arXiv Detail & Related papers (2020-10-19T13:23:12Z)
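For the SMDT entry above, the following is a hedged sketch of the retrieval step its summary describes: fetching bilingual sentence pairs similar to the current source sentence from the training corpus to serve as extra global context. Jaccard word overlap stands in for whatever retriever the paper actually uses, and the toy corpus is invented for illustration.

```python
# Illustrative retrieval of similar bilingual sentence pairs (the SMDT
# summary names the idea; the similarity measure here is an assumption).

def jaccard(a, b):
    """Word-overlap similarity between two sentences."""
    a, b = set(a.lower().split()), set(b.lower().split())
    return len(a & b) / max(len(a | b), 1)

def retrieve_context(src, bitext, n=2):
    """Return the n training pairs whose source side is most similar to src."""
    ranked = sorted(bitext, key=lambda pair: jaccard(src, pair[0]), reverse=True)
    return ranked[:n]

bitext = [  # toy parallel corpus, invented for the example
    ("the meeting starts at noon", "das Meeting beginnt am Mittag"),
    ("please review the contract", "bitte pruefen Sie den Vertrag"),
    ("the contract meeting is tomorrow", "das Vertragsmeeting ist morgen"),
]
print(retrieve_context("when does the contract meeting start", bitext))
```

The retrieved pairs would then be passed to the translation model as additional global context alongside the local one.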
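And for the CorefCL entry, a minimal sketch of the corruption step it describes: replacing a coreference mention in the context sentence to create a negative example that a contrastive loss can penalize. Real CorefCL detects mentions automatically; here the mention span and the distractor list are supplied by hand, and the helper name is made up.

```python
# Illustrative coreference corruption for contrastive learning (the span
# and distractors are hand-supplied assumptions; CorefCL detects mentions
# automatically).
import random

def corrupt_mention(context_tokens, mention_span, distractors):
    """Swap the tokens of one coreference mention for a random distractor."""
    start, end = mention_span
    return context_tokens[:start] + [random.choice(distractors)] + context_tokens[end:]

context = "the director said she would sign".split()
negative = corrupt_mention(context, (3, 4), distractors=["he", "they"])
print(" ".join(negative))  # e.g. "the director said he would sign"
```

The original context sentence serves as the positive example and the corrupted one as the negative, so the model is rewarded for distinguishing consistent from inconsistent coreference.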