Escaping the sentence-level paradigm in machine translation
- URL: http://arxiv.org/abs/2304.12959v2
- Date: Thu, 16 May 2024 13:32:14 GMT
- Title: Escaping the sentence-level paradigm in machine translation
- Authors: Matt Post, Marcin Junczys-Dowmunt
- Abstract summary: Much work in document-context machine translation exists, but for various reasons has been unable to catch hold.
In contrast to work on specialized architectures, we show that the standard Transformer architecture is sufficient.
We propose generative variants of existing contrastive metrics that are better able to discriminate among document systems.
- Score: 9.676755606927435
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: It is well-known that document context is vital for resolving a range of translation ambiguities, and in fact the document setting is the most natural setting for nearly all translation. It is therefore unfortunate that machine translation -- both research and production -- largely remains stuck in a decades-old sentence-level translation paradigm. It is also an increasingly glaring problem in light of competitive pressure from large language models, which are natively document-based. Much work in document-context machine translation exists, but for various reasons has been unable to catch hold. This paper suggests a path out of this rut by addressing three impediments at once: what architectures should we use? where do we get document-level information for training them? and how do we know whether they are any good? In contrast to work on specialized architectures, we show that the standard Transformer architecture is sufficient, provided it has enough capacity. Next, we address the training data issue by taking document samples from back-translated data only, where the data is not only more readily available, but is also of higher quality compared to parallel document data, which may contain machine translation output. Finally, we propose generative variants of existing contrastive metrics that are better able to discriminate among document systems. Results in four large-data language pairs (DE$\rightarrow$EN, EN$\rightarrow$DE, EN$\rightarrow$FR, and EN$\rightarrow$RU) establish the success of these three pieces together in improving document-level performance.
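To make the third piece concrete: a contrastive metric asks whether a model scores a correct document-dependent translation above a deliberately corrupted one, while a generative variant checks what the model actually produces when decoding freely. Below is a minimal, framework-agnostic sketch of that distinction; `score_fn`, `generate_fn`, and the keyword rule are illustrative stand-ins, not the paper's evaluation code.

```python
# Illustrative sketch only: contrastive vs. generative use of a
# document-level test item. score_fn and generate_fn are stand-ins
# for a real model's forced-scoring and decoding interfaces.

def contrastive_accuracy(items, score_fn):
    """Pass if the model assigns a higher score to the correct target
    than to the contrastive (corrupted) one, given document context."""
    hits = sum(
        score_fn(ctx, src, good) > score_fn(ctx, src, bad)
        for ctx, src, good, bad in items
    )
    return hits / len(items)

def generative_accuracy(items, generate_fn, keyword_of):
    """Generative variant: decode freely, then check whether the output
    realizes the disambiguating word (e.g., the correct pronoun)."""
    hits = 0
    for ctx, src, good, bad in items:
        hyp = generate_fn(ctx, src)
        if keyword_of(good) in hyp and keyword_of(bad) not in hyp:
            hits += 1
    return hits / len(items)

# Toy usage with stubs: German "Er" refers back to "der Vogel", so the
# correct English pronoun is "It".
items = [("Der Vogel sang.", "Er war glücklich.", "It was happy.", "He was happy.")]
score_fn = lambda ctx, src, tgt: 1.0 if tgt.startswith("It") else 0.0
generate_fn = lambda ctx, src: "It was happy."
keyword_of = lambda tgt: tgt.split()[0]  # "It" vs. "He"
print(contrastive_accuracy(items, score_fn))                # 1.0
print(generative_accuracy(items, generate_fn, keyword_of))  # 1.0
```

The generative form evaluates the system on its actual output rather than on forced scores of two fixed candidates, which is the behavior the abstract credits with better discriminating among document systems.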
Related papers
- Contextual Document Embeddings [77.22328616983417]
We propose two complementary methods for contextualized document embeddings.
First, an alternative contrastive learning objective that explicitly incorporates the document neighbors into the intra-batch contextual loss.
Second, a new contextual architecture that explicitly encodes neighbor document information into the encoded representation.
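As an illustration of the first idea, here is a hedged sketch of an in-batch contrastive (InfoNCE-style) loss in which each batch is deliberately assembled from neighboring documents, so the off-diagonal negatives are hard ones; the names and batching policy are assumptions, not the paper's implementation.

```python
# Hedged sketch: intra-batch contrastive loss where the in-batch
# negatives come from a document's retrieved neighbors, forcing the
# encoder to separate a document from its nearest neighbors rather
# than from random text.
import torch
import torch.nn.functional as F

def neighbor_contrastive_loss(queries, positives, temperature=0.05):
    """queries, positives: [batch, dim] embeddings, where each batch is
    assembled from one document plus its neighbors. With such batching,
    the off-diagonal (negative) terms are hard negatives."""
    q = F.normalize(queries, dim=-1)
    p = F.normalize(positives, dim=-1)
    logits = q @ p.t() / temperature      # [batch, batch] similarities
    labels = torch.arange(q.size(0))      # positives on the diagonal
    return F.cross_entropy(logits, labels)

# Toy usage: a batch of 4 neighboring documents, 8-dim embeddings.
q = torch.randn(4, 8)
loss = neighbor_contrastive_loss(q, q + 0.01 * torch.randn(4, 8))
print(loss.item())
```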
arXiv Detail & Related papers (2024-10-03T14:33:34Z)
- Recovering document annotations for sentence-level bitext [18.862295675088056]
We reconstruct document-level information for three datasets, covering German, French, Spanish, Italian, Polish, and Portuguese.
We introduce a document-level filtering technique as an alternative to traditional bitext filtering.
Lastly, we train models on these longer contexts and demonstrate improvement in document-level translation without degradation of sentence-level translation.
arXiv Detail & Related papers (2024-06-06T08:58:14Z)
- Document-Level Language Models for Machine Translation [37.106125892770315]
We build context-aware translation systems utilizing document-level monolingual data instead of document-level parallel data.
We improve existing approaches by leveraging recent advancements in model combination.
In most scenarios, back-translation gives even better results, at the cost of having to re-train the translation system.
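One standard form of such model combination is shallow fusion, where the sentence-level MT score is interpolated log-linearly with a document-level LM at each decoding step. The sketch below illustrates that general scheme under assumed names; the paper's exact combination method may differ.

```python
# Illustrative shallow-fusion step, not the paper's exact method.
import math

def fused_step_scores(mt_logprobs, lm_logprobs, lm_weight=0.3):
    """Combine per-token log-probs from the sentence-level MT model
    with a document-level LM that has read the target document so far."""
    return {
        tok: mt_logprobs[tok] + lm_weight * lm_logprobs.get(tok, -math.inf)
        for tok in mt_logprobs
    }

# Toy usage: the LM has read earlier sentences of the document, so it
# prefers the coreferent pronoun "it" over "he".
mt = {"it": math.log(0.40), "he": math.log(0.45), "she": math.log(0.15)}
lm = {"it": math.log(0.80), "he": math.log(0.10), "she": math.log(0.10)}
scores = fused_step_scores(mt, lm)
print(max(scores, key=scores.get))  # "it": document context flips the choice
```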
arXiv Detail & Related papers (2023-10-18T20:10:07Z)
- On Search Strategies for Document-Level Neural Machine Translation [51.359400776242786]
Document-level neural machine translation (NMT) models produce a more consistent output across a document.
In this work, we aim to answer the question of how best to utilize a context-aware translation model in decoding.
arXiv Detail & Related papers (2023-06-08T11:30:43Z)
- HanoiT: Enhancing Context-aware Translation via Selective Context [95.93730812799798]
Context-aware neural machine translation aims to use the document-level context to improve translation quality.
Irrelevant or trivial words in the context may introduce noise and distract the model from learning the relationship between the current sentence and the auxiliary context.
We propose a novel end-to-end encoder-decoder model with a layer-wise selection mechanism to sift and refine the long document context.
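As a rough illustration of per-layer context selection (not HanoiT's actual mechanism), the sketch below scores context token states with a learned scorer and keeps only the top-k for the layer to attend over.

```python
# Hedged sketch of a layer-wise context-selection step: score the
# context token states, keep the top-k, and let the current sentence
# attend only to those, discarding irrelevant context words.
import torch

def select_context(context_states, scorer, k):
    """context_states: [ctx_len, dim]; scorer: linear layer producing a
    relevance score per context token. Returns the k highest-scoring
    token states for this layer to attend over."""
    scores = scorer(context_states).squeeze(-1)   # [ctx_len]
    top = torch.topk(scores, k=min(k, scores.size(0))).indices
    return context_states[top]

# Toy usage: 10 context tokens, 16-dim states, keep 4 per layer.
states = torch.randn(10, 16)
scorer = torch.nn.Linear(16, 1)
kept = select_context(states, scorer, k=4)
print(kept.shape)  # torch.Size([4, 16])
```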
arXiv Detail & Related papers (2023-01-17T12:07:13Z)
- Learn To Remember: Transformer with Recurrent Memory for Document-Level Machine Translation [14.135048254120615]
We introduce a recurrent memory unit into the vanilla Transformer, which supports information exchange between the current sentence and the previous context.
We conduct experiments on three popular datasets for document-level machine translation; our model achieves an average improvement of 0.91 s-BLEU over the sentence-level baseline.
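To illustrate the general recurrent-memory idea (the paper's unit and its placement inside the Transformer may differ), the sketch below carries a fixed-size memory state across sentences with a GRU cell: each sentence reads the memory and then updates it.

```python
# Hedged sketch of a recurrent memory carried across sentences. The
# fusion and update wiring is illustrative, not the paper's design.
import torch

dim = 32
memory = torch.zeros(1, dim)             # document memory, persists across sentences
cell = torch.nn.GRUCell(dim, dim)        # memory update function
project = torch.nn.Linear(2 * dim, dim)  # fuse sentence encoding with memory

for step in range(3):                    # three sentences of one document
    sent_enc = torch.randn(1, dim)       # stand-in for a Transformer sentence encoding
    fused = project(torch.cat([sent_enc, memory], dim=-1))  # sentence reads the memory
    memory = cell(sent_enc, memory)      # memory absorbs the sentence
    print(step, fused.shape, memory.norm().item())
```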
arXiv Detail & Related papers (2022-05-03T14:55:53Z)
- Diving Deep into Context-Aware Neural Machine Translation [36.17847243492193]
This paper analyzes the performance of document-level NMT models on four diverse domains.
We find that there is no single best approach to document-level NMT, but rather that different architectures come out on top on different tasks.
arXiv Detail & Related papers (2020-10-19T13:23:12Z)
- Rethinking Document-level Neural Machine Translation [73.42052953710605]
We try to answer the question: Is the capacity of current models strong enough for document-level translation?
We observe that the original Transformer with appropriate training techniques can achieve strong results for document translation, even with inputs of up to 2,000 words.
arXiv Detail & Related papers (2020-10-18T11:18:29Z)
- Document-level Neural Machine Translation with Document Embeddings [82.4684444847092]
This work focuses on exploiting detailed document-level context through multiple forms of document embeddings.
The proposed document-aware NMT is implemented to enhance the Transformer baseline by introducing both global and local document-level clues on the source end.
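A hedged sketch of what source-side global and local document clues can look like: pool one embedding per sentence, form a whole-document (global) clue and a neighboring-sentence (local) clue, and add both to the current sentence's encoder states. The pooling and fusion choices here are illustrative assumptions, not the paper's architecture.

```python
# Illustrative global + local document clues injected on the source end.
import torch

def add_document_clues(sent_states, doc_sent_embs, idx, window=1):
    """sent_states: [len, dim] encoder states of sentence `idx`;
    doc_sent_embs: [n_sents, dim] one pooled embedding per sentence."""
    global_clue = doc_sent_embs.mean(dim=0)                      # whole document
    lo, hi = max(0, idx - window), min(doc_sent_embs.size(0), idx + window + 1)
    local_clue = doc_sent_embs[lo:hi].mean(dim=0)                # neighboring sentences
    return sent_states + global_clue + local_clue                # broadcast over tokens

# Toy usage: a 5-sentence document, 16-dim states, current sentence 2.
doc = torch.randn(5, 16)
states = torch.randn(7, 16)
print(add_document_clues(states, doc, idx=2).shape)  # torch.Size([7, 16])
```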
arXiv Detail & Related papers (2020-09-16T19:43:29Z)
- Towards Making the Most of Context in Neural Machine Translation [112.9845226123306]
We argue that previous research did not make clear use of the global context.
We propose a new document-level NMT framework that deliberately models the local context of each sentence.
arXiv Detail & Related papers (2020-02-19T03:30:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.