HanoiT: Enhancing Context-aware Translation via Selective Context
- URL: http://arxiv.org/abs/2301.06825v1
- Date: Tue, 17 Jan 2023 12:07:13 GMT
- Title: HanoiT: Enhancing Context-aware Translation via Selective Context
- Authors: Jian Yang, Yuwei Yin, Shuming Ma, Liqun Yang, Hongcheng Guo, Haoyang
Huang, Dongdong Zhang, Yutao Zeng, Zhoujun Li, Furu Wei
- Abstract summary: Context-aware neural machine translation aims to use the document-level context to improve translation quality.
Irrelevant or trivial words may introduce noise and distract the model from learning the relationship between the current sentence and the auxiliary context.
We propose a novel end-to-end encoder-decoder model with a layer-wise selection mechanism to sift and refine the long document context.
- Score: 95.93730812799798
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Context-aware neural machine translation aims to use document-level
context to improve translation quality. However, not all words in the context
are helpful: irrelevant or trivial words may introduce noise and distract the
model from learning the relationship between the current sentence and the
auxiliary context. To mitigate this problem, we propose a novel end-to-end
encoder-decoder model with a layer-wise selection mechanism that sifts and
refines the long document context. To verify the effectiveness of our method,
we conduct extensive experiments and additional quantitative analysis on four
document-level machine translation benchmarks. The experimental results
demonstrate that, via the soft selection mechanism, our model significantly
outperforms previous models on all datasets.
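The abstract describes the selection mechanism only at a high level. The sketch below is one plausible reading of layer-wise soft selection: each encoder layer scores every context token and softly down-weights unhelpful ones before attention. All names, shapes, and the gating formulation are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SelectiveEncoderLayer(nn.Module):
    """Hypothetical reading of a layer-wise soft selection mechanism:
    each layer scores every context token and softly down-weights the
    unhelpful ones before self-attention. Illustrative only."""

    def __init__(self, d_model: int = 512, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(d_model, 1)   # assumed per-token selection score
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor, ctx_mask: torch.Tensor) -> torch.Tensor:
        # x: (batch, src_len + ctx_len, d_model); ctx_mask marks context positions.
        score = torch.sigmoid(self.gate(x))                  # soft selection in [0, 1]
        keep = torch.where(ctx_mask.unsqueeze(-1), score,
                           torch.ones_like(score))           # only scale context tokens
        x = x * keep                                         # "sift" the context
        out, _ = self.attn(x, x, x, need_weights=False)
        return self.norm(x + out)                            # "refine" via attention

layer = SelectiveEncoderLayer()
x = torch.randn(2, 10, 512)
mask = torch.arange(10).expand(2, 10) < 6   # first 6 positions are context
print(layer(x, mask).shape)                 # torch.Size([2, 10, 512])
```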
Related papers
- A Case Study on Context-Aware Neural Machine Translation with Multi-Task Learning [49.62044186504516]
In document-level neural machine translation (DocNMT), multi-encoder approaches, which encode the context and source sentences separately, are common.
Recent studies have shown that the context encoder effectively generates noise, making the model robust to, rather than dependent on, the choice of context.
This paper investigates this observation further by explicitly modelling context encoding through multi-task learning (MTL), so as to make the model sensitive to the choice of context.
arXiv Detail & Related papers (2024-07-03T12:50:49Z)
- On Search Strategies for Document-Level Neural Machine Translation [51.359400776242786]
Document-level neural machine translation (NMT) models produce a more consistent output across a document.
In this work, we aim to answer the question of how best to use a context-aware translation model in decoding.
arXiv Detail & Related papers (2023-06-08T11:30:43Z)
- Challenges in Context-Aware Neural Machine Translation [39.89082986080746]
Context-aware neural machine translation involves leveraging information beyond sentence-level context to resolve discourse dependencies.
Despite well-reasoned intuitions, most context-aware translation models show only modest improvements over sentence-level systems.
We propose a more realistic setting for document-level translation, called paragraph-to-paragraph (para2para) translation.
arXiv Detail & Related papers (2023-05-23T07:08:18Z)
- SMDT: Selective Memory-Augmented Neural Document Translation [53.4627288890316]
We propose a Selective Memory-augmented Neural Document Translation model to handle documents whose context induces a large hypothesis space.
We retrieve similar bilingual sentence pairs from the training corpus to augment the global context.
We extend the two-stream attention model with a selective mechanism to capture the local context and diverse global contexts.
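As a rough illustration of the retrieval step only, the toy sketch below ranks training pairs by surface similarity to the current source sentence; the function name and the similarity measure are assumptions, and the real system presumably uses a much stronger retriever.

```python
from difflib import SequenceMatcher

def retrieve_context(source: str, corpus: list[tuple[str, str]], k: int = 2):
    """Toy stand-in for a retrieval step: rank training pairs by surface
    similarity to the current source sentence and return the top-k as
    auxiliary global context. Illustrative only."""
    scored = [(SequenceMatcher(None, source, src).ratio(), (src, tgt))
              for src, tgt in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [pair for _, pair in scored[:k]]

corpus = [("the cat sat", "die Katze sass"), ("a dog ran", "ein Hund lief")]
print(retrieve_context("the cat slept", corpus, k=1))
```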
arXiv Detail & Related papers (2022-01-05T14:23:30Z)
- Contrastive Learning for Context-aware Neural Machine Translation Using Coreference Information [14.671424999873812]
We propose CorefCL, a novel data augmentation and contrastive learning scheme based on coreference between the source and contextual sentences.
By corrupting automatically detected coreference mentions in the contextual sentence, CorefCL can train the model to be sensitive to coreference inconsistency.
In experiments, our method consistently improved the BLEU scores of the compared models on English-German and English-Korean tasks.
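A minimal sketch of the augmentation idea, assuming mention spans come from an external coreference resolver; all names here are hypothetical:

```python
import random

def corrupt_coref(context: str, mentions: list[str], vocab: list[str]) -> str:
    """Sketch of CorefCL-style augmentation: swap one detected coreference
    mention in the context sentence for a random distractor, yielding a
    coreference-inconsistent negative example for contrastive learning."""
    target = random.choice(mentions)
    distractor = random.choice([w for w in vocab if w != target])
    return context.replace(target, distractor, 1)

# Positive example: original context; negative example: corrupted context.
ctx = "Mary lost her keys ."
print(corrupt_coref(ctx, mentions=["Mary", "her"], vocab=["John", "his", "its"]))
```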
arXiv Detail & Related papers (2021-09-13T05:18:47Z)
- Measuring and Increasing Context Usage in Context-Aware Machine Translation [64.5726087590283]
We introduce a new metric, conditional cross-mutual information, to quantify the usage of context by machine translation models.
We then introduce a new, simple training method, context-aware word dropout, to increase the usage of context by context-aware models.
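A small sketch of both ideas, under the assumption that CXMI is estimated as the average gain in target log-probability when context is supplied, and that the word dropout simply masks random tokens; names and details are illustrative:

```python
import random

def cxmi(logp_with_ctx: list[float], logp_no_ctx: list[float]) -> float:
    """Conditional cross-mutual information, estimated as the average gain
    in target log-probability when context is provided:
    CXMI ~ H(Y|X) - H(Y|X, C). Inputs are per-token log-probs from the
    same model scored with and without document context."""
    n = len(logp_with_ctx)
    return sum(lw - ln for lw, ln in zip(logp_with_ctx, logp_no_ctx)) / n

def context_word_dropout(tokens: list[str], p: float = 0.1) -> list[str]:
    """Context-aware word dropout (sketch): mask random current-sentence
    tokens so the model is pushed to recover them from the context."""
    return [t if random.random() > p else "<mask>" for t in tokens]

print(cxmi([-1.0, -0.5], [-1.4, -0.9]))             # positive => context helps
print(context_word_dropout("the cat sat on the mat".split(), p=0.3))
```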
arXiv Detail & Related papers (2021-05-07T19:55:35Z)
- Context-Adaptive Document-Level Neural Machine Translation [1.52292571922932]
We introduce a data-adaptive method that enables the model to adaptively adopt the necessary and useful context.
Experiments demonstrate that the proposed approach significantly improves performance over previous methods, with gains of up to 1.99 BLEU points.
arXiv Detail & Related papers (2021-04-16T17:43:58Z) - Context-aware Decoder for Neural Machine Translation using a Target-side
Document-Level Language Model [12.543106304662059]
We present a method to turn a sentence-level translation model into a context-aware model by incorporating a document-level language model into the decoder.
Our decoder is built using only sentence-level parallel corpora and monolingual corpora.
From a theoretical viewpoint, the core of this work is the novel representation of contextual information via point-wise mutual information (PMI) between the context and the current sentence.
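Read this way, decoding plausibly adds the PMI term to the sentence-level translation score; a minimal sketch under that assumption (the weighting and all names are hypothetical):

```python
def context_aware_score(logp_mt: float, logp_lm_ctx: float, logp_lm: float,
                        weight: float = 1.0) -> float:
    """Sketch of PMI-based rescoring for a candidate target y:
    PMI(context; y) = log p_LM(y | context) - log p_LM(y), added to the
    sentence-level translation score. The weighting is an assumption."""
    pmi = logp_lm_ctx - logp_lm
    return logp_mt + weight * pmi

# A candidate that the document-level LM finds more likely given context wins.
print(context_aware_score(logp_mt=-2.0, logp_lm_ctx=-1.0, logp_lm=-1.8))
```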
arXiv Detail & Related papers (2020-10-24T08:06:18Z)
- Rethinking Document-level Neural Machine Translation [73.42052953710605]
We try to answer the question: Is the capacity of current models strong enough for document-level translation?
We observe that the original Transformer, with appropriate training techniques, can achieve strong results for document translation, even with document lengths of 2,000 words.
arXiv Detail & Related papers (2020-10-18T11:18:29Z)
- Lexically Cohesive Neural Machine Translation with Copy Mechanism [21.43163704217968]
We incorporate a copy mechanism into a context-aware neural machine translation model, allowing it to copy words from previous outputs.
We conduct experiments on Japanese-to-English translation using an evaluation dataset for discourse translation.
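A pointer-generator-style gate is one standard way to realize such a copy mechanism; the sketch below assumes that formulation and is not the paper's exact model:

```python
import torch

def copy_mix(p_gen_vocab: torch.Tensor, attn_prev: torch.Tensor,
             prev_ids: torch.Tensor, gate: torch.Tensor) -> torch.Tensor:
    """Pointer-generator-style sketch: blend the decoder's vocabulary
    distribution with a copy distribution over tokens from previous
    outputs, so repeated terms are translated consistently.
    gate in [0, 1] chooses between generating and copying."""
    p_copy = torch.zeros_like(p_gen_vocab)
    p_copy.scatter_add_(-1, prev_ids, attn_prev)   # project attention onto vocab
    return gate * p_gen_vocab + (1.0 - gate) * p_copy

vocab_size = 6
p_gen = torch.full((1, vocab_size), 1.0 / vocab_size)
attn = torch.tensor([[0.7, 0.3]])                  # attention over 2 previous tokens
prev = torch.tensor([[2, 4]])                      # their vocabulary ids
print(copy_mix(p_gen, attn, prev, gate=torch.tensor(0.5)))
```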
arXiv Detail & Related papers (2020-10-11T08:39:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.