Document Graph for Neural Machine Translation
- URL: http://arxiv.org/abs/2012.03477v2
- Date: Tue, 8 Dec 2020 07:03:52 GMT
- Title: Document Graph for Neural Machine Translation
- Authors: Mingzhou Xu, Liangyou Li, Derek F. Wong, Qun Liu, Lidia S. Chao
- Abstract summary: We show that a document can be represented as a graph that connects relevant contexts regardless of their distances.
Experiments on various NMT benchmarks, including IWSLT English-French, Chinese-English, WMT English-German and OpenSubtitles English-Russian, demonstrate that using document graphs can significantly improve the translation quality.
- Score: 42.13593962963306
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous works have shown that contextual information can improve the
performance of neural machine translation (NMT). However, most existing
document-level NMT methods fail to leverage contexts beyond a small set of
previous sentences. How to make use of the whole document as global context is
still a challenge. To address this issue, we hypothesize that a document can be
represented as a graph that connects relevant contexts regardless of their
distances. We employ several types of relations, including adjacency, syntactic
dependency, lexical consistency, and coreference, to construct the document
graph. Then, we incorporate both source and target graphs into the conventional
Transformer architecture with graph convolutional networks. Experiments on
various NMT benchmarks, including IWSLT English-French, Chinese-English, WMT
English-German and OpenSubtitles English-Russian, demonstrate that using
document graphs can significantly improve the translation quality.
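The construction step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it builds a graph over sentence nodes using only two of the four relation types (adjacency and lexical consistency, via shared content words), since syntactic-dependency and coreference edges require external parsing and coreference tools. The single unweighted propagation step stands in for a full parameterized GCN layer; all function names and the toy stopword list are assumptions.

```python
# Hypothetical sketch: document graph over sentences + one GCN-style
# propagation step. Edge types here: adjacency (consecutive sentences)
# and lexical consistency (sentences sharing a content word).

def build_document_graph(sentences, stopwords=frozenset({"the", "a", "of", "is"})):
    """Return a symmetric adjacency matrix over sentence nodes (1 = edge)."""
    n = len(sentences)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        adj[i][i] = 1                          # self-loop, standard for GCNs
        if i + 1 < n:                          # adjacency relation
            adj[i][i + 1] = adj[i + 1][i] = 1
    content = [set(s) - stopwords for s in sentences]
    for i in range(n):
        for j in range(i + 2, n):              # lexical-consistency relation
            if content[i] & content[j]:        # any shared content word
                adj[i][j] = adj[j][i] = 1
    return adj

def gcn_layer(adj, feats):
    """One unweighted propagation step: mean of neighbour features."""
    out = []
    for row in adj:
        neigh = [feats[j] for j, e in enumerate(row) if e]
        out.append([sum(dim) / len(neigh) for dim in zip(*neigh)])
    return out
```

In the paper's full model, such propagated node representations are fused into the Transformer encoder and decoder for both the source and target graphs; this sketch only shows how distant but related sentences become direct neighbours in the graph.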
Related papers
- Pretraining Language Models with Text-Attributed Heterogeneous Graphs [28.579509154284448]
We present a new pretraining framework for Language Models (LMs) that explicitly considers the topological and heterogeneous information in Text-Attributed Heterogeneous Graphs (TAHGs)
We propose a topology-aware pretraining task to predict nodes involved in the context graph by jointly optimizing an LM and an auxiliary heterogeneous graph neural network.
We conduct link prediction and node classification tasks on three datasets from various domains.
arXiv Detail & Related papers (2023-10-19T08:41:21Z) - On Search Strategies for Document-Level Neural Machine Translation [51.359400776242786]
Document-level neural machine translation (NMT) models produce a more consistent output across a document.
In this work, we aim to answer the question how to best utilize a context-aware translation model in decoding.
arXiv Detail & Related papers (2023-06-08T11:30:43Z) - Word Grounded Graph Convolutional Network [24.6338889954789]
Graph Convolutional Networks (GCNs) have shown strong performance in learning text representations for various tasks such as text classification.
We propose to transform the document graph into a word graph, to decouple data samples and a GCN model by using a document-independent graph.
The proposed Word-level Graph (WGraph) can not only implicitly learn word representations from commonly-used word co-occurrences in corpora, but also incorporate extra global semantic dependencies.
arXiv Detail & Related papers (2023-05-10T19:56:55Z) - HanoiT: Enhancing Context-aware Translation via Selective Context [95.93730812799798]
Context-aware neural machine translation aims to use the document-level context to improve translation quality.
The irrelevant or trivial words may bring some noise and distract the model from learning the relationship between the current sentence and the auxiliary context.
We propose a novel end-to-end encoder-decoder model with a layer-wise selection mechanism to sift and refine the long document context.
arXiv Detail & Related papers (2023-01-17T12:07:13Z) - Contrastive Learning for Context-aware Neural Machine Translation Using Coreference Information [14.671424999873812]
We propose CorefCL, a novel data augmentation and contrastive learning scheme based on coreference between the source and contextual sentences.
By corrupting automatically detected coreference mentions in the contextual sentence, CorefCL can train the model to be sensitive to coreference inconsistency.
In experiments, our method consistently improved the BLEU scores of the compared models on English-German and English-Korean tasks.
arXiv Detail & Related papers (2021-09-13T05:18:47Z) - Diving Deep into Context-Aware Neural Machine Translation [36.17847243492193]
This paper analyzes the performance of document-level NMT models on four diverse domains.
We find that there is no single best approach to document-level NMT, but rather that different architectures come out on top on different tasks.
arXiv Detail & Related papers (2020-10-19T13:23:12Z) - Document-level Neural Machine Translation with Document Embeddings [82.4684444847092]
This work focuses on exploiting detailed document-level context in terms of multiple forms of document embeddings.
The proposed document-aware NMT is implemented to enhance the Transformer baseline by introducing both global and local document-level clues on the source end.
arXiv Detail & Related papers (2020-09-16T19:43:29Z) - Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation [59.191079800436114]
Document-level machine translation incorporates inter-sentential dependencies into the translation of a source sentence.
We propose a new framework to model cross-sentence dependencies by training neural machine translation (NMT) to predict both the target translation and surrounding sentences of a source sentence.
arXiv Detail & Related papers (2020-03-30T03:38:01Z) - Towards Making the Most of Context in Neural Machine Translation [112.9845226123306]
We argue that previous research did not make clear use of the global context.
We propose a new document-level NMT framework that deliberately models the local context of each sentence.
arXiv Detail & Related papers (2020-02-19T03:30:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.