Document Graph for Neural Machine Translation
- URL: http://arxiv.org/abs/2012.03477v2
- Date: Tue, 8 Dec 2020 07:03:52 GMT
- Title: Document Graph for Neural Machine Translation
- Authors: Mingzhou Xu, Liangyou Li, Derek F. Wong, Qun Liu, Lidia S. Chao
- Abstract summary: We show that a document can be represented as a graph that connects relevant contexts regardless of their distances.
Experiments on various NMT benchmarks, including IWSLT English-French, Chinese-English, WMT English-German and OpenSubtitles English-Russian, demonstrate that using document graphs can significantly improve the translation quality.
- Score: 42.13593962963306
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous works have shown that contextual information can improve the
performance of neural machine translation (NMT). However, most existing
document-level NMT methods fail to leverage contexts beyond a small set of
previous sentences. How to make use of the whole document as global context is
still a challenge. To address this issue, we hypothesize that a document can be
represented as a graph that connects relevant contexts regardless of their
distances. We employ several types of relations, including adjacency, syntactic
dependency, lexical consistency, and coreference, to construct the document
graph. Then, we incorporate both source and target graphs into the conventional
Transformer architecture with graph convolutional networks. Experiments on
various NMT benchmarks, including IWSLT English-French, Chinese-English, WMT
English-German and OpenSubtitles English-Russian, demonstrate that using
document graphs can significantly improve the translation quality.
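The construction step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it builds a graph over sentence nodes using only two of the four relation types (adjacency and lexical consistency, via shared content words), since syntactic-dependency and coreference edges require external parsing and coreference tools. The single unweighted propagation step stands in for a full parameterized GCN layer; all function names and the toy stopword list are assumptions.

```python
# Hypothetical sketch: document graph over sentences + one GCN-style
# propagation step. Edge types here: adjacency (consecutive sentences)
# and lexical consistency (sentences sharing a content word).

def build_document_graph(sentences, stopwords=frozenset({"the", "a", "of", "is"})):
    """Return a symmetric adjacency matrix over sentence nodes (1 = edge)."""
    n = len(sentences)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        adj[i][i] = 1                          # self-loop, standard for GCNs
        if i + 1 < n:                          # adjacency relation
            adj[i][i + 1] = adj[i + 1][i] = 1
    content = [set(s) - stopwords for s in sentences]
    for i in range(n):
        for j in range(i + 2, n):              # lexical-consistency relation
            if content[i] & content[j]:        # any shared content word
                adj[i][j] = adj[j][i] = 1
    return adj

def gcn_layer(adj, feats):
    """One unweighted propagation step: mean of neighbour features."""
    out = []
    for row in adj:
        neigh = [feats[j] for j, e in enumerate(row) if e]
        out.append([sum(dim) / len(neigh) for dim in zip(*neigh)])
    return out
```

In the paper's full model, such propagated node representations are fused into the Transformer encoder and decoder for both the source and target graphs; this sketch only shows how distant but related sentences become direct neighbours in the graph.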
Related papers
- Pretraining Language Models with Text-Attributed Heterogeneous Graphs [28.579509154284448]
We present a new pretraining framework for Language Models (LMs) that explicitly considers the topological and heterogeneous information in Text-Attributed Heterogeneous Graphs (TAHGs)
We propose a topology-aware pretraining task to predict nodes involved in the context graph by jointly optimizing an LM and an auxiliary heterogeneous graph neural network.
We conduct link prediction and node classification tasks on three datasets from various domains.
arXiv Detail & Related papers (2023-10-19T08:41:21Z) - On Search Strategies for Document-Level Neural Machine Translation [51.359400776242786]
Document-level neural machine translation (NMT) models produce a more consistent output across a document.
In this work, we aim to answer the question how to best utilize a context-aware translation model in decoding.
arXiv Detail & Related papers (2023-06-08T11:30:43Z) - Word Grounded Graph Convolutional Network [24.6338889954789]
Graph Convolutional Networks (GCNs) have shown strong performance in learning text representations for various tasks such as text classification.
We propose to transform the document graph into a word graph, to decouple data samples and a GCN model by using a document-independent graph.
The proposed Word-level Graph (WGraph) can not only implicitly learn word representations from commonly-used word co-occurrences in corpora, but also incorporate extra global semantic dependencies.
arXiv Detail & Related papers (2023-05-10T19:56:55Z) - HanoiT: Enhancing Context-aware Translation via Selective Context [95.93730812799798]
Context-aware neural machine translation aims to use the document-level context to improve translation quality.
The irrelevant or trivial words may bring some noise and distract the model from learning the relationship between the current sentence and the auxiliary context.
We propose a novel end-to-end encoder-decoder model with a layer-wise selection mechanism to sift and refine the long document context.
arXiv Detail & Related papers (2023-01-17T12:07:13Z) - Contrastive Learning for Context-aware Neural Machine Translation Using Coreference Information [14.671424999873812]
We propose CorefCL, a novel data augmentation and contrastive learning scheme based on coreference between the source and contextual sentences.
By corrupting automatically detected coreference mentions in the contextual sentence, CorefCL can train the model to be sensitive to coreference inconsistency.
In experiments, our method consistently improved the BLEU scores of the compared models on English-German and English-Korean tasks.
arXiv Detail & Related papers (2021-09-13T05:18:47Z) - Diving Deep into Context-Aware Neural Machine Translation [36.17847243492193]
This paper analyzes the performance of document-level NMT models on four diverse domains.
We find that there is no single best approach to document-level NMT, but rather that different architectures come out on top on different tasks.
arXiv Detail & Related papers (2020-10-19T13:23:12Z) - Document-level Neural Machine Translation with Document Embeddings [82.4684444847092]
This work focuses on exploiting detailed document-level context in terms of multiple forms of document embeddings.
The proposed document-aware NMT is implemented to enhance the Transformer baseline by introducing both global and local document-level clues on the source end.
arXiv Detail & Related papers (2020-09-16T19:43:29Z) - Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation [59.191079800436114]
Document-level machine translation incorporates inter-sentential dependencies into the translation of a source sentence.
We propose a new framework to model cross-sentence dependencies by training neural machine translation (NMT) to predict both the target translation and surrounding sentences of a source sentence.
arXiv Detail & Related papers (2020-03-30T03:38:01Z) - Towards Making the Most of Context in Neural Machine Translation [112.9845226123306]
We argue that previous research did not make clear use of the global context.
We propose a new document-level NMT framework that deliberately models the local context of each sentence.
arXiv Detail & Related papers (2020-02-19T03:30:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.