Document-level Neural Machine Translation with Document Embeddings
- URL: http://arxiv.org/abs/2009.08775v1
- Date: Wed, 16 Sep 2020 19:43:29 GMT
- Title: Document-level Neural Machine Translation with Document Embeddings
- Authors: Shu Jiang, Hai Zhao, Zuchao Li, Bao-Liang Lu
- Abstract summary: This work focuses on exploiting detailed document-level context in terms of multiple forms of document embeddings.
The proposed document-aware NMT enhances the Transformer baseline by introducing both global and local document-level clues on the source end.
- Score: 82.4684444847092
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Standard neural machine translation (NMT) rests on the assumption
that sentences can be translated independently of document-level context. Most
existing document-level NMT methods settle for a shallow use of brief
document-level information, while this work focuses on exploiting detailed
document-level context in the form of multiple kinds of document embeddings,
which can model deeper and richer document-level context. The proposed
document-aware NMT enhances the Transformer baseline by introducing both
global and local document-level clues on the source end. Experiments show that
the proposed method significantly improves translation performance over
strong baselines and other related studies.
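The abstract names the key idea (document embeddings fused into the Transformer's source side) without implementation detail. Below is a minimal PyTorch sketch of one way a global document-level clue could be injected into the source token embeddings; the module and every name in it are illustrative assumptions, not the authors' actual architecture.

```python
# Minimal sketch (not the authors' architecture): inject a "global"
# document embedding into a Transformer encoder's input embeddings.
import torch
import torch.nn as nn

class DocAwareEmbedding(nn.Module):
    """Fuses a document-level context vector into every source token embedding."""

    def __init__(self, vocab_size: int, d_model: int):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.proj = nn.Linear(d_model, d_model)      # map doc embedding into model space
        self.gate = nn.Linear(2 * d_model, d_model)  # learned per-position mixing gate

    def forward(self, src_tokens: torch.Tensor, doc_embedding: torch.Tensor):
        # src_tokens: (batch, src_len); doc_embedding: (batch, d_model),
        # e.g. a mean over sentence encodings of the whole source document.
        x = self.tok(src_tokens)                     # (batch, src_len, d_model)
        d = self.proj(doc_embedding).unsqueeze(1).expand_as(x)
        g = torch.sigmoid(self.gate(torch.cat([x, d], dim=-1)))
        return g * x + (1.0 - g) * d                 # gated fusion of token and document clues
```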
Related papers
- On Search Strategies for Document-Level Neural Machine Translation [51.359400776242786]
Document-level neural machine translation (NMT) models produce a more consistent output across a document.
In this work, we aim to answer the question of how best to utilize a context-aware translation model in decoding.
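As a concrete illustration of one decoding strategy such a study compares, here is a hedged sketch of sequential document decoding with a rolling context window; `model.translate` and its context arguments are hypothetical stand-ins, not an API from the paper.

```python
# Hypothetical interface: `model.translate` and its context arguments are
# stand-ins, not an API from the paper.
def translate_document(model, src_sentences, window=2):
    """Translate sentences in order, conditioning on a rolling context window."""
    hyps = []
    for i, sent in enumerate(src_sentences):
        src_ctx = src_sentences[max(0, i - window):i]  # previous source sentences
        tgt_ctx = hyps[max(0, i - window):i]           # the model's own previous outputs
        hyps.append(model.translate(sent, src_context=src_ctx, tgt_context=tgt_ctx))
    return hyps
```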
arXiv Detail & Related papers (2023-06-08T11:30:43Z)
- Discourse Cohesion Evaluation for Document-Level Neural Machine Translation [36.96887050831173]
It is well known that translations generated by an excellent document-level neural machine translation (NMT) model are consistent and coherent.
Existing sentence-level evaluation metrics like BLEU can hardly reflect the model's performance at the document level.
We propose a new test suite that considers four forms of cohesion to measure the cohesiveness of document translations.
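For intuition only, here is a toy consistency check in the spirit of such cohesion test suites; it is not one of the paper's four measures. It scores how consistently a repeated source term is rendered across a document translation.

```python
# Toy illustration only, NOT the paper's actual cohesion measures.
from collections import Counter

def translation_consistency(term_translations):
    """Fraction of occurrences covered by the most frequent rendering of a
    repeated source term (1.0 = fully consistent across the document)."""
    if not term_translations:
        return 1.0
    counts = Counter(term_translations)
    return counts.most_common(1)[0][1] / len(term_translations)

# e.g. translation_consistency(["contract", "contract", "agreement"]) -> 2/3
```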
arXiv Detail & Related papers (2022-08-19T01:56:00Z)
- SMDT: Selective Memory-Augmented Neural Document Translation [53.4627288890316]
We propose a Selective Memory-augmented Neural Document Translation model to deal with documents whose context forms a large hypothesis space.
We retrieve similar bilingual sentence pairs from the training corpus to augment the global context, as in the sketch below.
We extend the two-stream attention model with a selective mechanism to capture local context and diverse global contexts.
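A sketch of the retrieval step alone, assuming simple TF-IDF cosine similarity; the paper's actual retrieval component may use a different similarity measure.

```python
# Retrieval sketch assuming TF-IDF cosine similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def retrieve_similar_pairs(query_src, train_pairs, k=2):
    """train_pairs: list of (src, tgt) sentence pairs from the training corpus."""
    sources = [src for src, _ in train_pairs]
    vec = TfidfVectorizer().fit(sources + [query_src])
    sims = cosine_similarity(vec.transform([query_src]), vec.transform(sources))[0]
    top = sims.argsort()[::-1][:k]                 # indices of the k most similar sources
    return [train_pairs[i] for i in top]
```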
arXiv Detail & Related papers (2022-01-05T14:23:30Z)
- A Comparison of Approaches to Document-level Machine Translation [34.2276281264886]
This paper presents a systematic comparison of selected approaches to document-level machine translation on benchmarks with document-level phenomena evaluation suites.
We find that a simple method based purely on back-translating monolingual document-level data performs as well as much more elaborate alternatives.
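A hedged sketch of that simple method: translate target-language monolingual documents back into the source language with a reverse model, then train on the resulting synthetic document pairs. `reverse_model.translate` is a hypothetical interface.

```python
# Sketch under assumed interfaces: `reverse_model.translate` is hypothetical.
def back_translate_documents(reverse_model, tgt_documents):
    """tgt_documents: list of documents, each a list of target-language sentences."""
    synthetic_pairs = []
    for doc in tgt_documents:
        src_doc = [reverse_model.translate(s) for s in doc]  # back-translate sentence by sentence
        synthetic_pairs.append((src_doc, doc))               # (synthetic source, real target)
    return synthetic_pairs
```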
arXiv Detail & Related papers (2021-01-26T19:21:09Z)
- Diving Deep into Context-Aware Neural Machine Translation [36.17847243492193]
This paper analyzes the performance of document-level NMT models on four diverse domains.
We find that there is no single best approach to document-level NMT, but rather that different architectures come out on top on different tasks.
arXiv Detail & Related papers (2020-10-19T13:23:12Z)
- SPECTER: Document-level Representation Learning using Citation-informed Transformers [51.048515757909215]
SPECTER generates document-level embeddings of scientific documents by pretraining a Transformer language model.
We introduce SciDocs, a new evaluation benchmark consisting of seven document-level tasks ranging from citation prediction to document classification and recommendation.
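For reference, the publicly released `allenai/specter` checkpoint on Hugging Face can be used roughly as follows (a usage sketch; preprocessing details may differ from the paper): a paper is encoded as title [SEP] abstract, and the [CLS] vector is taken as the document embedding.

```python
# Usage sketch for the public allenai/specter checkpoint on Hugging Face.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("allenai/specter")
model = AutoModel.from_pretrained("allenai/specter")

title = "Document-level Neural Machine Translation with Document Embeddings"
abstract = "Standard neural machine translation (NMT) rests on the assumption ..."
text = title + tokenizer.sep_token + abstract       # encode as: title [SEP] abstract

inputs = tokenizer([text], padding=True, truncation=True,
                   max_length=512, return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)
doc_embedding = out.last_hidden_state[:, 0, :]      # (1, hidden): the [CLS] vector
```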
arXiv Detail & Related papers (2020-04-15T16:05:51Z)
- Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation [59.191079800436114]
Document-level machine translation incorporates inter-sentential dependencies into the translation of a source sentence.
We propose a new framework to model cross-sentence dependencies by training a neural machine translation (NMT) model to predict both the target translation and the surrounding sentences of a source sentence.
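A minimal sketch of the multi-task objective this implies, with illustrative names and weighting (not the authors' code): the translation loss is combined with auxiliary losses for predicting the neighboring sentences.

```python
# Illustrative multi-task loss; names and the weighting are assumptions.
import torch.nn.functional as F

def joint_loss(mt_logits, mt_ids, prev_logits, prev_ids,
               next_logits, next_ids, alpha=0.5):
    # each *_logits: (num_tokens, vocab); each *_ids: (num_tokens,)
    translation = F.cross_entropy(mt_logits, mt_ids)
    context = F.cross_entropy(prev_logits, prev_ids) + \
              F.cross_entropy(next_logits, next_ids)
    return translation + alpha * context  # alpha weights the auxiliary objectives
```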
arXiv Detail & Related papers (2020-03-30T03:38:01Z)
- Towards Making the Most of Context in Neural Machine Translation [112.9845226123306]
We argue that previous research did not make clear use of the global context.
We propose a new document-level NMT framework that deliberately models the local context of each sentence.
arXiv Detail & Related papers (2020-02-19T03:30:00Z)