Using Context in Neural Machine Translation Training Objectives
- URL: http://arxiv.org/abs/2005.01483v1
- Date: Mon, 4 May 2020 13:42:30 GMT
- Title: Using Context in Neural Machine Translation Training Objectives
- Authors: Danielle Saunders, Felix Stahlberg, Bill Byrne
- Abstract summary: We present Neural Machine Translation (NMT) training using document-level metrics with batch-level documents.
We demonstrate that training with document-level metrics is more robust than training with sequence-level metrics.
- Score: 23.176247496139574
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present Neural Machine Translation (NMT) training using document-level metrics with batch-level documents. Previous sequence-objective approaches to NMT training focus exclusively on sentence-level metrics like sentence BLEU which do not correspond to the desired evaluation metric, typically document BLEU. Meanwhile research into document-level NMT training focuses on data or model architecture rather than training procedure. We find that each of these lines of research has a clear space in it for the other, and propose merging them with a scheme that allows a document-level evaluation metric to be used in the NMT training objective.
We first sample pseudo-documents from sentence samples. We then approximate the expected document BLEU gradient with Monte Carlo sampling for use as a cost function in Minimum Risk Training (MRT). This two-level sampling procedure gives NMT performance gains over sequence MRT and maximum-likelihood training. We demonstrate that training is more robust for document-level metrics than with sequence metrics. We further demonstrate improvements on NMT with TER and Grammatical Error Correction (GEC) using GLEU, both metrics used at the document level for evaluations.
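To make the two-level sampling procedure above concrete, below is a minimal sketch of a document-level MRT loss: draw candidate translations per sentence, assemble pseudo-documents from those samples, score each pseudo-document with document BLEU, and take the expected cost under the renormalised sample distribution. The `model.sample` helper, the smoothing factor `alpha`, and the sample counts are illustrative assumptions, not the paper's released implementation.

```python
import random

import torch
import sacrebleu


def doc_mrt_loss(model, src_sentences, ref_sentences,
                 n_samples=4, n_docs=8, alpha=0.005):
    """Sketch of document-level Minimum Risk Training via two-level sampling.

    Assumes a hypothetical model.sample(src, n) that returns a list of
    (hypothesis_string, log_prob_tensor) pairs for one source sentence.
    """
    # Level 1: sample candidate translations per sentence in the batch,
    # treating the batch as one pseudo-document.
    per_sentence = [model.sample(src, n=n_samples) for src in src_sentences]

    doc_scores, doc_costs = [], []
    # Level 2: assemble pseudo-documents by picking one candidate per
    # sentence, then score each pseudo-document with document BLEU.
    for _ in range(n_docs):
        picks = [random.randrange(n_samples) for _ in per_sentence]
        hyps = [cands[i][0] for cands, i in zip(per_sentence, picks)]
        log_p = sum(cands[i][1] for cands, i in zip(per_sentence, picks))
        bleu = sacrebleu.corpus_bleu(hyps, [ref_sentences]).score / 100.0
        doc_scores.append(alpha * log_p)  # smoothed model score for this doc
        doc_costs.append(1.0 - bleu)      # cost: 1 - document BLEU

    # Monte Carlo estimate of the expected cost: renormalise over the
    # sampled pseudo-documents and weight each cost by its probability.
    q = torch.softmax(torch.stack(doc_scores), dim=0)
    return (q * torch.tensor(doc_costs)).sum()
```

Because the costs are treated as constants, gradients reach the model only through the sample probabilities, which is the usual Monte Carlo approximation of the expected-risk gradient in MRT.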
Related papers
- Instruction-Tuned LLMs Succeed in Document-Level MT Without Fine-Tuning -- But BLEU Turns a Blind Eye [15.987448306012167]
Large language models (LLMs) have excelled in various NLP tasks, including machine translation (MT).
This work investigates the inherent capability of instruction-tuned LLMs for document-level translation (docMT).
arXiv Detail & Related papers (2024-10-28T11:49:58Z)
- Towards Zero-Shot Multimodal Machine Translation [64.9141931372384]
We propose a method to bypass the need for fully supervised data to train multimodal machine translation systems.
Our method, called ZeroMMT, consists in adapting a strong text-only machine translation (MT) model by training it on a mixture of two objectives.
To prove that our method generalizes to languages with no fully supervised training data available, we extend the CoMMuTE evaluation dataset to three new languages: Arabic, Russian and Chinese.
arXiv Detail & Related papers (2024-07-18T15:20:31Z)
- Importance-Aware Data Augmentation for Document-Level Neural Machine Translation [51.74178767827934]
Document-level neural machine translation (DocNMT) aims to generate translations that are both coherent and cohesive.
Due to its longer input length and limited availability of training data, DocNMT often faces the challenge of data sparsity.
We propose a novel Importance-Aware Data Augmentation (IADA) algorithm for DocNMT that augments the training data based on token importance information estimated by the norm of hidden states and training gradients.
arXiv Detail & Related papers (2024-01-27T09:27:47Z)
- On Search Strategies for Document-Level Neural Machine Translation [51.359400776242786]
Document-level neural machine translation (NMT) models produce a more consistent output across a document.
In this work, we aim to answer the question of how best to utilize a context-aware translation model in decoding.
arXiv Detail & Related papers (2023-06-08T11:30:43Z)
- Exploring Paracrawl for Document-level Neural Machine Translation [21.923881766940088]
Document-level neural machine translation (NMT) has outperformed sentence-level NMT on a number of datasets.
We show that document-level NMT models trained with only parallel paragraphs from Paracrawl can be used to translate real documents.
arXiv Detail & Related papers (2023-04-20T11:21:34Z)
- Embarrassingly Easy Document-Level MT Metrics: How to Convert Any Pretrained Metric Into a Document-Level Metric [15.646714712131148]
We present a method for extending pretrained metrics to incorporate context at the document level.
We show that the extended metrics outperform their sentence-level counterparts in about 85% of the tested conditions.
Our experimental results support our initial hypothesis and show that a simple extension of the metrics permits them to take advantage of context.
arXiv Detail & Related papers (2022-09-27T19:42:22Z)
- Document-level Neural Machine Translation with Document Embeddings [82.4684444847092]
This work focuses on exploiting detailed document-level context in terms of multiple forms of document embeddings.
The proposed document-aware NMT is implemented to enhance the Transformer baseline by introducing both global and local document-level clues on the source end.
arXiv Detail & Related papers (2020-09-16T19:43:29Z)
- Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation [59.191079800436114]
Document-level machine translation incorporates inter-sentential dependencies into the translation of a source sentence.
We propose a new framework to model cross-sentence dependencies by training neural machine translation (NMT) to predict both the target translation and surrounding sentences of a source sentence.
arXiv Detail & Related papers (2020-03-30T03:38:01Z)
- Capturing document context inside sentence-level neural machine translation models with self-training [5.129814362802968]
Document-level neural machine translation has received less attention and lags behind its sentence-level counterpart.
We propose an approach that doesn't require training a specialized model on parallel document-level corpora.
Our approach reinforces the choices made by the model, thus making it more likely that the same choices will be made in other sentences in the document.
arXiv Detail & Related papers (2020-03-11T12:36:17Z)
- Towards Making the Most of Context in Neural Machine Translation [112.9845226123306]
We argue that previous research did not make clear use of the global context.
We propose a new document-level NMT framework that deliberately models the local context of each sentence.
arXiv Detail & Related papers (2020-02-19T03:30:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.