Diverse Pretrained Context Encodings Improve Document Translation
- URL: http://arxiv.org/abs/2106.03717v1
- Date: Mon, 7 Jun 2021 15:28:01 GMT
- Title: Diverse Pretrained Context Encodings Improve Document Translation
- Authors: Domenic Donato, Lei Yu, Chris Dyer
- Abstract summary: We propose a new architecture for adapting a sentence-level sequence-to-sequence transformer by incorporating multiple pretrained document context signals.
Our best multi-context model consistently outperforms the best existing context-aware transformers.
- Score: 31.03899564183553
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a new architecture for adapting a sentence-level
sequence-to-sequence transformer by incorporating multiple pretrained document
context signals and assess the impact on translation performance of (1)
different pretraining approaches for generating these signals, (2) the quantity
of parallel data for which document context is available, and (3) conditioning
on source, target, or source and target contexts. Experiments on the NIST
Chinese-English, and IWSLT and WMT English-German tasks support four general
conclusions: that using pretrained context representations markedly improves
sample efficiency, that adequate parallel data resources are crucial for
learning to use document context, that jointly conditioning on multiple context
representations outperforms any single representation, and that source context
is more valuable for translation performance than target side context. Our best
multi-context model consistently outperforms the best existing context-aware
transformers.
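Below is a minimal sketch of how such multi-context conditioning could be wired up, assuming one cross-attention block per pretrained context signal and a learned gate that weights each signal's contribution; all module and parameter names are illustrative assumptions, not the authors' implementation.
```python
# Illustrative sketch only (assumed design, not the authors' code): fuse several
# pretrained document-context encodings into a sentence-level transformer via
# one cross-attention block per context signal, combined with learned gates.
import torch
import torch.nn as nn


class MultiContextFusion(nn.Module):
    def __init__(self, d_model: int = 512, n_heads: int = 8, n_contexts: int = 2):
        super().__init__()
        self.cross_attns = nn.ModuleList(
            [nn.MultiheadAttention(d_model, n_heads, batch_first=True)
             for _ in range(n_contexts)]
        )
        # One learnable scalar gate per context signal (sigmoid-squashed below).
        self.gates = nn.Parameter(torch.zeros(n_contexts))
        self.norm = nn.LayerNorm(d_model)

    def forward(self, states, context_memories):
        # states: (batch, seq_len, d_model) hidden states of the sentence-level model
        # context_memories: list of (batch, ctx_len_i, d_model) tensors, e.g. frozen
        #   outputs of different pretrained document encoders (source and/or target side)
        fused = states
        for gate, attn, memory in zip(self.gates, self.cross_attns, context_memories):
            attended, _ = attn(query=states, key=memory, value=memory)
            fused = fused + torch.sigmoid(gate) * attended
        return self.norm(fused)


if __name__ == "__main__":
    fusion = MultiContextFusion(d_model=512, n_heads=8, n_contexts=2)
    states = torch.randn(4, 20, 512)                                # current sentence states
    contexts = [torch.randn(4, 60, 512), torch.randn(4, 45, 512)]   # two pretrained context signals
    print(fusion(states, contexts).shape)                           # torch.Size([4, 20, 512])
```
The per-signal gates are just one simple way to let the model weight several context signals jointly, in line with the abstract's finding that conditioning on multiple representations outperforms any single one.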
Related papers
- Sequence Shortening for Context-Aware Machine Translation [5.803309695504831]
We show that a special case of the multi-encoder architecture achieves higher accuracy on contrastive datasets.
We introduce two novel methods, Latent Grouping and Latent Selecting, in which the network learns to group tokens or to select the tokens to be cached as context.
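A toy sketch of the shortening-and-caching idea follows, using a hard top-k selection as a simplified stand-in for the paper's learned Latent Grouping/Selecting; all names here are assumptions.
```python
# Toy sketch (assumed, simplified): shorten a sentence's token states before
# caching them as context by keeping only the k highest-scoring states.
import torch
import torch.nn as nn


class TopKContextCache(nn.Module):
    def __init__(self, d_model: int = 512, k: int = 8):
        super().__init__()
        self.scorer = nn.Linear(d_model, 1)  # per-token relevance score
        self.k = k

    def forward(self, token_states: torch.Tensor) -> torch.Tensor:
        # token_states: (batch, seq_len, d_model) encoder outputs of the previous sentence
        scores = self.scorer(token_states).squeeze(-1)            # (batch, seq_len)
        k = min(self.k, token_states.size(1))
        top = scores.topk(k, dim=1).indices                       # (batch, k)
        idx = top.unsqueeze(-1).expand(-1, -1, token_states.size(-1))
        selected = token_states.gather(1, idx)                    # (batch, k, d_model)
        # Weight by sigmoid scores so the scorer receives gradient; the hard top-k
        # choice itself stays non-differentiable in this simplified sketch.
        weights = torch.sigmoid(scores.gather(1, top)).unsqueeze(-1)
        return selected * weights                                 # shortened context to cache


if __name__ == "__main__":
    cache = TopKContextCache(d_model=512, k=8)
    prev_sentence_states = torch.randn(2, 37, 512)
    print(cache(prev_sentence_states).shape)  # torch.Size([2, 8, 512])
```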
arXiv Detail & Related papers (2024-02-02T13:55:37Z)
- Shiftable Context: Addressing Training-Inference Context Mismatch in Simultaneous Speech Translation [0.17188280334580192]
Transformer models using segment-based processing have been an effective architecture for simultaneous speech translation.
We propose Shiftable Context to ensure consistent segment and context sizes are maintained throughout training and inference.
arXiv Detail & Related papers (2023-07-03T22:11:51Z)
- Dual-Alignment Pre-training for Cross-lingual Sentence Embedding [79.98111074307657]
We propose a dual-alignment pre-training (DAP) framework for cross-lingual sentence embedding.
We introduce a novel representation translation learning (RTL) task, where the model learns to use one-side contextualized token representation to reconstruct its translation counterpart.
Our approach significantly improves cross-lingual sentence embedding quality.
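A rough, simplified sketch of what an RTL-style objective could look like is given below; the query-based head and vocabulary size are assumptions, not the DAP implementation.
```python
# Rough sketch (assumed, simplified): an RTL-style objective in which one side's
# contextualized token representations are used to reconstruct the tokens of the
# translation counterpart.
import torch
import torch.nn as nn


class RTLHead(nn.Module):
    def __init__(self, d_model: int = 768, vocab_size: int = 32000, max_len: int = 64):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, num_heads=8, batch_first=True)
        self.queries = nn.Parameter(torch.randn(max_len, d_model) * 0.02)  # one query per counterpart position
        self.proj = nn.Linear(d_model, vocab_size)

    def forward(self, one_side_states, counterpart_ids):
        # one_side_states: (batch, src_len, d_model) contextualized states of one side only
        # counterpart_ids: (batch, tgt_len) token IDs of the translation counterpart
        batch, tgt_len = counterpart_ids.shape
        q = self.queries[:tgt_len].unsqueeze(0).expand(batch, -1, -1)
        dec, _ = self.attn(q, one_side_states, one_side_states)   # attend over the one side
        logits = self.proj(dec)                                    # (batch, tgt_len, vocab)
        return nn.functional.cross_entropy(
            logits.reshape(-1, logits.size(-1)), counterpart_ids.reshape(-1)
        )


if __name__ == "__main__":
    head = RTLHead()
    loss = head(torch.randn(2, 30, 768), torch.randint(0, 32000, (2, 25)))
    print(float(loss))
```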
arXiv Detail & Related papers (2023-05-16T03:53:30Z)
- Learning to Generalize to More: Continuous Semantic Augmentation for Neural Machine Translation [50.54059385277964]
We present a novel data augmentation paradigm termed Continuous Semantic Augmentation (CsaNMT).
CsaNMT augments each training instance with an adjacency region that covers adequate variants of the literal expression with the same meaning.
arXiv Detail & Related papers (2022-04-14T08:16:28Z)
- SMDT: Selective Memory-Augmented Neural Document Translation [53.4627288890316]
We propose a Selective Memory-augmented Neural Document Translation model to deal with documents whose context spans a large hypothesis space.
We retrieve similar bilingual sentence pairs from the training corpus to augment the global context.
We extend the two-stream attention model with a selective mechanism to capture the local context and diverse global contexts.
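In a very reduced form, the retrieval step could look like the sketch below: precompute embeddings for the training-corpus source sentences, then fetch the most similar bilingual pairs as extra global context. The embedding source and cosine similarity are assumptions, not SMDT's actual pipeline.
```python
# Very simplified sketch (assumed): retrieve similar bilingual sentence pairs
# from the training corpus by cosine similarity over precomputed source-sentence
# embeddings, to serve as additional global context for the current sentence.
import torch


def retrieve_similar_pairs(query_emb, corpus_embs, bitext, k=3):
    # query_emb:   (d,) embedding of the sentence being translated
    # corpus_embs: (n, d) embeddings of the training-corpus source sentences
    # bitext:      list of n (source, target) sentence pairs aligned with corpus_embs
    q = torch.nn.functional.normalize(query_emb, dim=0)
    c = torch.nn.functional.normalize(corpus_embs, dim=1)
    sims = c @ q                                    # (n,) cosine similarities
    top = sims.topk(min(k, len(bitext))).indices
    return [bitext[i] for i in top.tolist()]


if __name__ == "__main__":
    d, n = 256, 1000
    corpus = torch.randn(n, d)
    pairs = [(f"src sentence {i}", f"tgt sentence {i}") for i in range(n)]
    print(retrieve_similar_pairs(torch.randn(d), corpus, pairs, k=3))
```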
arXiv Detail & Related papers (2022-01-05T14:23:30Z)
- Context-Adaptive Document-Level Neural Machine Translation [1.52292571922932]
We introduce a data-adaptive method that enables the model to adopt the necessary and useful context.
Experiments demonstrate that the proposed approach significantly improves performance over previous methods, with a gain of up to 1.99 BLEU points.
arXiv Detail & Related papers (2021-04-16T17:43:58Z)
- Divide and Rule: Training Context-Aware Multi-Encoder Translation Models with Little Resources [20.057692375546356]
Multi-encoder models aim to improve translation quality by encoding document-level contextual information alongside the current sentence.
We show that training these additional context-encoding parameters requires a large amount of data, since the contextual training signal is sparse.
We propose an efficient alternative, based on splitting sentence pairs, that enriches the training signal of a set of parallel sentences.
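One plausible reading of the splitting recipe, sketched below as an assumption rather than the authors' exact procedure, is to split a window of consecutive parallel sentences at every position so that each pair becomes a context-augmented training example.
```python
# Plausible sketch (assumed recipe): split a window of consecutive parallel
# sentences at every position so that each sentence pair becomes a
# context-augmented training example, enriching the contextual training signal.

def split_into_examples(src_sents, tgt_sents):
    """Yield (src_context, src_current, tgt_context, tgt_current) tuples."""
    assert len(src_sents) == len(tgt_sents)
    for i in range(len(src_sents)):
        yield (src_sents[:i], src_sents[i], tgt_sents[:i], tgt_sents[i])


if __name__ == "__main__":
    src = ["S1.", "S2.", "S3."]
    tgt = ["T1.", "T2.", "T3."]
    for example in split_into_examples(src, tgt):
        print(example)
```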
arXiv Detail & Related papers (2021-03-31T15:15:32Z)
- Meta Back-translation [111.87397401837286]
We propose a novel method to generate pseudo-parallel data from a pre-trained back-translation model.
Our method is a meta-learning algorithm which adapts a pre-trained back-translation model so that the pseudo-parallel data it generates would train a forward-translation model to do well on a validation set.
arXiv Detail & Related papers (2021-02-15T20:58:32Z)
- Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation [59.191079800436114]
Document-level machine translation incorporates inter-sentential dependencies into the translation of a source sentence.
We propose a new framework to model cross-sentence dependencies by training neural machine translation (NMT) to predict both the target translation and surrounding sentences of a source sentence.
arXiv Detail & Related papers (2020-03-30T03:38:01Z)
- Towards Making the Most of Context in Neural Machine Translation [112.9845226123306]
We argue that previous research did not make clear use of the global context.
We propose a new document-level NMT framework that deliberately models the local context of each sentence with awareness of the global context of the document.
arXiv Detail & Related papers (2020-02-19T03:30:00Z)