Using Context in Neural Machine Translation Training Objectives
- URL: http://arxiv.org/abs/2005.01483v1
- Date: Mon, 4 May 2020 13:42:30 GMT
- Title: Using Context in Neural Machine Translation Training Objectives
- Authors: Danielle Saunders, Felix Stahlberg, Bill Byrne
- Abstract summary: We present Neural Machine Translation (NMT) training using document-level metrics with batch-level documents.
We demonstrate that training with document-level metrics is more robust than training with sequence-level metrics.
- Score: 23.176247496139574
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present Neural Machine Translation (NMT) training using document-level metrics with batch-level documents. Previous sequence-objective approaches to NMT training focus exclusively on sentence-level metrics like sentence BLEU which do not correspond to the desired evaluation metric, typically document BLEU. Meanwhile research into document-level NMT training focuses on data or model architecture rather than training procedure. We find that each of these lines of research has a clear space in it for the other, and propose merging them with a scheme that allows a document-level evaluation metric to be used in the NMT training objective.
We first sample pseudo-documents from sentence samples. We then approximate the expected document BLEU gradient with Monte Carlo sampling for use as a cost function in Minimum Risk Training (MRT). This two-level sampling procedure gives NMT performance gains over sequence MRT and maximum-likelihood training. We demonstrate that training is more robust for document-level metrics than with sequence metrics. We further demonstrate improvements on NMT with TER and Grammatical Error Correction (GEC) using GLEU, both metrics used at the document level for evaluations.
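To make the two-level sampling procedure above concrete, below is a minimal sketch of a document-level MRT loss: draw candidate translations per sentence, assemble pseudo-documents from those samples, score each pseudo-document with document BLEU, and take the expected cost under the renormalised sample distribution. The `model.sample` helper, the smoothing factor `alpha`, and the sample counts are illustrative assumptions, not the paper's released implementation.

```python
import random

import torch
import sacrebleu


def doc_mrt_loss(model, src_sentences, ref_sentences,
                 n_samples=4, n_docs=8, alpha=0.005):
    """Sketch of document-level Minimum Risk Training via two-level sampling.

    Assumes a hypothetical model.sample(src, n) that returns a list of
    (hypothesis_string, log_prob_tensor) pairs for one source sentence.
    """
    # Level 1: sample candidate translations per sentence in the batch,
    # treating the batch as one pseudo-document.
    per_sentence = [model.sample(src, n=n_samples) for src in src_sentences]

    doc_scores, doc_costs = [], []
    # Level 2: assemble pseudo-documents by picking one candidate per
    # sentence, then score each pseudo-document with document BLEU.
    for _ in range(n_docs):
        picks = [random.randrange(n_samples) for _ in per_sentence]
        hyps = [cands[i][0] for cands, i in zip(per_sentence, picks)]
        log_p = sum(cands[i][1] for cands, i in zip(per_sentence, picks))
        bleu = sacrebleu.corpus_bleu(hyps, [ref_sentences]).score / 100.0
        doc_scores.append(alpha * log_p)  # smoothed model score for this doc
        doc_costs.append(1.0 - bleu)      # cost: 1 - document BLEU

    # Monte Carlo estimate of the expected cost: renormalise over the
    # sampled pseudo-documents and weight each cost by its probability.
    q = torch.softmax(torch.stack(doc_scores), dim=0)
    return (q * torch.tensor(doc_costs)).sum()
```

Because the costs are treated as constants, gradients reach the model only through the sample probabilities, which is the usual Monte Carlo approximation of the expected-risk gradient in MRT.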
Related papers
- Instruction-Tuned LLMs Succeed in Document-Level MT Without Fine-Tuning -- But BLEU Turns a Blind Eye [15.987448306012167]
Large language models (LLMs) have excelled in various NLP tasks, including machine translation (MT).
This work investigates the inherent capability of instruction-tuned LLMs for document-level translation (docMT).
arXiv Detail & Related papers (2024-10-28T11:49:58Z)
- Towards Zero-Shot Multimodal Machine Translation [64.9141931372384]
We propose a method to bypass the need for fully supervised data to train multimodal machine translation systems.
Our method, called ZeroMMT, consists in adapting a strong text-only machine translation (MT) model by training it on a mixture of two objectives.
To prove that our method generalizes to languages with no fully supervised training data available, we extend the CoMMuTE evaluation dataset to three new languages: Arabic, Russian and Chinese.
arXiv Detail & Related papers (2024-07-18T15:20:31Z)
- Importance-Aware Data Augmentation for Document-Level Neural Machine Translation [51.74178767827934]
Document-level neural machine translation (DocNMT) aims to generate translations that are both coherent and cohesive.
Due to its longer input length and limited availability of training data, DocNMT often faces the challenge of data sparsity.
We propose a novel Importance-Aware Data Augmentation (IADA) algorithm for DocNMT that augments the training data based on token importance information estimated by the norm of hidden states and training gradients.
arXiv Detail & Related papers (2024-01-27T09:27:47Z)
- On Search Strategies for Document-Level Neural Machine Translation [51.359400776242786]
Document-level neural machine translation (NMT) models produce a more consistent output across a document.
In this work, we aim to answer the question of how best to utilize a context-aware translation model in decoding.
arXiv Detail & Related papers (2023-06-08T11:30:43Z)
- Exploring Paracrawl for Document-level Neural Machine Translation [21.923881766940088]
Document-level neural machine translation (NMT) has outperformed sentence-level NMT on a number of datasets.
We show that document-level NMT models trained with only parallel paragraphs from Paracrawl can be used to translate real documents.
arXiv Detail & Related papers (2023-04-20T11:21:34Z)
- Embarrassingly Easy Document-Level MT Metrics: How to Convert Any Pretrained Metric Into a Document-Level Metric [15.646714712131148]
We present a method for extending pretrained metrics to incorporate context at the document level.
We show that the extended metrics outperform their sentence-level counterparts in about 85% of the tested conditions.
Our experimental results support our initial hypothesis and show that a simple extension of the metrics permits them to take advantage of context.
arXiv Detail & Related papers (2022-09-27T19:42:22Z)
- Document-level Neural Machine Translation with Document Embeddings [82.4684444847092]
This work focuses on exploiting detailed document-level context in terms of multiple forms of document embeddings.
The proposed document-aware NMT is implemented to enhance the Transformer baseline by introducing both global and local document-level clues on the source end.
arXiv Detail & Related papers (2020-09-16T19:43:29Z)
- Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation [59.191079800436114]
Document-level machine translation incorporates inter-sentential dependencies into the translation of a source sentence.
We propose a new framework to model cross-sentence dependencies by training neural machine translation (NMT) to predict both the target translation and surrounding sentences of a source sentence.
arXiv Detail & Related papers (2020-03-30T03:38:01Z)
- Capturing document context inside sentence-level neural machine translation models with self-training [5.129814362802968]
Document-level neural machine translation has received less attention and lags behind its sentence-level counterpart.
We propose an approach that doesn't require training a specialized model on parallel document-level corpora.
Our approach reinforces the choices made by the model, thus making it more likely that the same choices will be made in other sentences in the document.
arXiv Detail & Related papers (2020-03-11T12:36:17Z)
- Towards Making the Most of Context in Neural Machine Translation [112.9845226123306]
We argue that previous research did not make clear use of the global context.
We propose a new document-level NMT framework that deliberately models the local context of each sentence.
arXiv Detail & Related papers (2020-02-19T03:30:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.