Contrastive Learning for Context-aware Neural Machine Translation Using
Coreference Information
- URL: http://arxiv.org/abs/2109.05712v1
- Date: Mon, 13 Sep 2021 05:18:47 GMT
- Title: Contrastive Learning for Context-aware Neural Machine Translation Using
Coreference Information
- Authors: Yongkeun Hwang, Hyungu Yun, Kyomin Jung
- Abstract summary: We propose CorefCL, a novel data augmentation and contrastive learning scheme based on coreference between the source and contextual sentences.
By corrupting automatically detected coreference mentions in the contextual sentence, CorefCL can train the model to be sensitive to coreference inconsistency.
In experiments, our method consistently improved the BLEU scores of the compared models on English-German and English-Korean tasks.
- Score: 14.671424999873812
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Context-aware neural machine translation (NMT) incorporates contextual
information from surrounding text, which can improve the translation quality of
document-level machine translation. Many existing works on context-aware NMT
have focused on developing new model architectures for incorporating additional
contexts and have shown some promising results. However, most existing works
rely on cross-entropy loss, resulting in limited use of contextual information.
In this paper, we propose CorefCL, a novel data augmentation and contrastive
learning scheme based on coreference between the source and contextual
sentences. By corrupting automatically detected coreference mentions in the
contextual sentence, CorefCL can train the model to be sensitive to coreference
inconsistency. We experimented with our method on common context-aware NMT
models and two document-level translation tasks. In the experiments, our method
consistently improved the BLEU scores of the compared models on English-German and
English-Korean tasks. We also show that our method significantly improves
coreference resolution in the English-German contrastive test suite.
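The two ingredients of the abstract above, corrupting detected coreference mentions in the context sentence and training the model to prefer the uncorrupted context, can be sketched roughly as follows. This is an illustrative sketch only: the function names, the mask-based corruption, and the hinge-style loss are assumptions, not the authors' exact implementation.

```python
def corrupt_mentions(context_tokens, mention_spans, mask_token="<mask>"):
    """Create a negative example by masking automatically detected
    coreference mentions in the context sentence. (The paper's
    corruption scheme may also include deletion or substitution.)"""
    corrupted = list(context_tokens)
    for start, end in mention_spans:  # spans are [start, end) token indices
        for i in range(start, end):
            corrupted[i] = mask_token
    return corrupted

def contrastive_margin_loss(score_original, score_corrupted, margin=1.0):
    """Hinge-style contrastive objective: the model should score the
    reference translation higher given the original context than given
    the coreference-corrupted one, by at least `margin`."""
    return max(0.0, margin - (score_original - score_corrupted))
```

For example, masking the mention span `(0, 1)` in `["She", "bought", "a", "car"]` yields `["<mask>", "bought", "a", "car"]`; if the model already scores the original context well above the corrupted one, the loss is zero.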
Related papers
- A Case Study on Context-Aware Neural Machine Translation with Multi-Task Learning [49.62044186504516]
In document-level neural machine translation (DocNMT), multi-encoder approaches are common in encoding context and source sentences.
Recent studies have shown that the context encoder generates noise and makes the model robust to the choice of context.
This paper further investigates this observation by explicitly modelling context encoding through multi-task learning (MTL) to make the model sensitive to the choice of context.
arXiv Detail & Related papers (2024-07-03T12:50:49Z)
- On Search Strategies for Document-Level Neural Machine Translation [51.359400776242786]
Document-level neural machine translation (NMT) models produce a more consistent output across a document.
In this work, we aim to answer the question of how best to utilize a context-aware translation model in decoding.
arXiv Detail & Related papers (2023-06-08T11:30:43Z)
- MTCue: Learning Zero-Shot Control of Extra-Textual Attributes by Leveraging Unstructured Context in Neural Machine Translation [3.703767478524629]
This work introduces MTCue, a novel neural machine translation (NMT) framework that interprets all context (including discrete variables) as text.
MTCue learns an abstract representation of context, enabling transferability across different data settings.
MTCue significantly outperforms a "tagging" baseline at translating English text.
arXiv Detail & Related papers (2023-05-25T10:06:08Z)
- HanoiT: Enhancing Context-aware Translation via Selective Context [95.93730812799798]
Context-aware neural machine translation aims to use the document-level context to improve translation quality.
Irrelevant or trivial words may introduce noise and distract the model from learning the relationship between the current sentence and the auxiliary context.
We propose a novel end-to-end encoder-decoder model with a layer-wise selection mechanism to sift and refine the long document context.
arXiv Detail & Related papers (2023-01-17T12:07:13Z)
- Measuring and Increasing Context Usage in Context-Aware Machine Translation [64.5726087590283]
We introduce a new metric, conditional cross-mutual information, to quantify the usage of context by machine translation models.
We then introduce a new, simple training method, context-aware word dropout, to increase the usage of context by context-aware models.
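The word-dropout idea above can be sketched in a few lines: randomly replacing source tokens with a placeholder pushes the model to recover the missing information from the document context. This is an illustrative sketch only; the paper's exact scheme (e.g. which tokens are eligible for dropout) may differ.

```python
import random

def context_aware_word_dropout(source_tokens, p=0.1, unk_token="<unk>", rng=None):
    """Replace each source-side token with `unk_token` with probability p,
    so the model must rely on the surrounding context to translate it.
    (Hypothetical helper illustrating the training-time augmentation.)"""
    rng = rng or random.Random()
    return [unk_token if rng.random() < p else tok for tok in source_tokens]
```

With `p=0.0` the sentence is unchanged, and with `p=1.0` every token is masked; in practice a small `p` is chosen so most of the sentence remains intact.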
arXiv Detail & Related papers (2021-05-07T19:55:35Z)
- Divide and Rule: Training Context-Aware Multi-Encoder Translation Models with Little Resources [20.057692375546356]
Multi-encoder models aim to improve translation quality by encoding document-level contextual information alongside the current sentence.
We show that training these parameters requires a large amount of data, since the contextual training signal is sparse.
We propose an efficient alternative, based on splitting sentence pairs, that enriches the training signal of a set of parallel sentences.
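The splitting idea can be sketched as follows: segments of one split sentence pair become separate training examples, with the preceding source segments serving as context. This is an illustrative sketch under the assumption of 1:1 aligned segments; the paper's actual splitting procedure may differ.

```python
def make_contextual_examples(src_segments, tgt_segments):
    """Turn the k aligned segments of one split sentence pair into k
    (context, source, target) training examples, where each example's
    context is the concatenation of the preceding source segments.
    (Hypothetical helper; a simplification of the splitting recipe.)"""
    examples = []
    for i in range(len(src_segments)):
        context = " ".join(src_segments[:i])
        examples.append((context, src_segments[i], tgt_segments[i]))
    return examples
```

A pair split into two segments thus yields one example with empty context and one whose context is the first segment, densifying the contextual training signal.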
arXiv Detail & Related papers (2021-03-31T15:15:32Z)
- Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation [59.191079800436114]
Document-level machine translation incorporates inter-sentential dependencies into the translation of a source sentence.
We propose a new framework to model cross-sentence dependencies by training neural machine translation (NMT) to predict both the target translation and surrounding sentences of a source sentence.
arXiv Detail & Related papers (2020-03-30T03:38:01Z)
- Towards Making the Most of Context in Neural Machine Translation [112.9845226123306]
We argue that previous research did not make a clear use of the global context.
We propose a new document-level NMT framework that deliberately models the local context of each sentence.
arXiv Detail & Related papers (2020-02-19T03:30:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.