Learning Contextualized Sentence Representations for Document-Level
Neural Machine Translation
- URL: http://arxiv.org/abs/2003.13205v1
- Date: Mon, 30 Mar 2020 03:38:01 GMT
- Title: Learning Contextualized Sentence Representations for Document-Level
Neural Machine Translation
- Authors: Pei Zhang, Xu Zhang, Wei Chen, Jian Yu, Yanfeng Wang, Deyi Xiong
- Abstract summary: Document-level machine translation incorporates inter-sentential dependencies into the translation of a source sentence.
We propose a new framework to model cross-sentence dependencies by training neural machine translation (NMT) to predict both the target translation and surrounding sentences of a source sentence.
- Score: 59.191079800436114
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Document-level machine translation incorporates inter-sentential dependencies
into the translation of a source sentence. In this paper, we propose a new
framework to model cross-sentence dependencies by training neural machine
translation (NMT) to predict both the target translation and surrounding
sentences of a source sentence. By enforcing the NMT model to predict source
context, we want the model to learn "contextualized" source sentence
representations that capture document-level dependencies on the source side. We
further propose two different methods to learn and integrate such
contextualized sentence embeddings into NMT: a joint training method that
jointly trains an NMT model with the source context prediction model and a
pre-training & fine-tuning method that pretrains the source context prediction
model on a large-scale monolingual document corpus and then fine-tunes it with
the NMT model. Experiments on Chinese-English and English-German translation
show that both methods can substantially improve the translation quality over a
strong document-level Transformer baseline.
Related papers
- A Case Study on Context-Aware Neural Machine Translation with Multi-Task Learning [49.62044186504516]
In document-level neural machine translation (DocNMT), multi-encoder approaches are common in encoding context and source sentences.
Recent studies have shown that the context encoder generates noise and makes the model robust to the choice of context.
This paper further investigates this observation by explicitly modelling context encoding through multi-task learning (MTL) to make the model sensitive to the choice of context.
arXiv Detail & Related papers (2024-07-03T12:50:49Z) - SMDT: Selective Memory-Augmented Neural Document Translation [53.4627288890316]
We propose a Selective Memory-augmented Neural Document Translation model to deal with documents containing large hypothesis space of context.
We retrieve similar bilingual sentence pairs from the training corpus to augment global context.
We extend the two-stream attention model with selective mechanism to capture local context and diverse global contexts.
arXiv Detail & Related papers (2022-01-05T14:23:30Z) - Contrastive Learning for Context-aware Neural Machine TranslationUsing
Coreference Information [14.671424999873812]
We propose CorefCL, a novel data augmentation and contrastive learning scheme based on coreference between the source and contextual sentences.
By corrupting automatically detected coreference mentions in the contextual sentence, CorefCL can train the model to be sensitive to coreference inconsistency.
In experiments, our method consistently improved BLEU of compared models on English-German and English-Korean tasks.
arXiv Detail & Related papers (2021-09-13T05:18:47Z) - Language Modeling, Lexical Translation, Reordering: The Training Process
of NMT through the Lens of Classical SMT [64.1841519527504]
neural machine translation uses a single neural network to model the entire translation process.
Despite neural machine translation being de-facto standard, it is still not clear how NMT models acquire different competences over the course of training.
arXiv Detail & Related papers (2021-09-03T09:38:50Z) - Self-supervised and Supervised Joint Training for Resource-rich Machine
Translation [30.502625878505732]
Self-supervised pre-training of text representations has been successfully applied to low-resource Neural Machine Translation (NMT)
We propose a joint training approach, $F$-XEnDec, to combine self-supervised and supervised learning to optimize NMT models.
arXiv Detail & Related papers (2021-06-08T02:35:40Z) - Context-Adaptive Document-Level Neural Machine Translation [1.52292571922932]
We introduce a data-adaptive method that enables the model to adopt the necessary and useful context.
Experiments demonstrate the proposed approach can significantly improve the performance over the previous methods with a gain up to 1.99 BLEU points.
arXiv Detail & Related papers (2021-04-16T17:43:58Z) - Source and Target Bidirectional Knowledge Distillation for End-to-end
Speech Translation [88.78138830698173]
We focus on sequence-level knowledge distillation (SeqKD) from external text-based NMT models.
We train a bilingual E2E-ST model to predict paraphrased transcriptions as an auxiliary task with a single decoder.
arXiv Detail & Related papers (2021-04-13T19:00:51Z) - Capturing document context inside sentence-level neural machine
translation models with self-training [5.129814362802968]
Document-level neural machine translation has received less attention and lags behind its sentence-level counterpart.
We propose an approach that doesn't require training a specialized model on parallel document-level corpora.
Our approach reinforces the choices made by the model, thus making it more likely that the same choices will be made in other sentences in the document.
arXiv Detail & Related papers (2020-03-11T12:36:17Z) - Towards Making the Most of Context in Neural Machine Translation [112.9845226123306]
We argue that previous research did not make a clear use of the global context.
We propose a new document-level NMT framework that deliberately models the local context of each sentence.
arXiv Detail & Related papers (2020-02-19T03:30:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.