Modeling Discourse Structure for Document-level Neural Machine
Translation
- URL: http://arxiv.org/abs/2006.04721v1
- Date: Mon, 8 Jun 2020 16:24:03 GMT
- Title: Modeling Discourse Structure for Document-level Neural Machine
Translation
- Authors: Junxuan Chen, Xiang Li, Jiarui Zhang, Chulun Zhou, Jianwei Cui, Bin
Wang, Jinsong Su
- Abstract summary: We propose to improve document-level NMT with the aid of discourse structure information.
Specifically, we first parse the input document to obtain its discourse structure.
Then, we introduce a Transformer-based path encoder to embed the discourse structure information of each word.
Finally, we combine the discourse structure information with the word embedding before it is fed into the encoder.
- Score: 38.085454497395446
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, document-level neural machine translation (NMT) has become a hot
topic in the machine translation community. Despite this success, most
existing studies ignore the discourse structure information of the input
document to be translated, which has been shown to be effective in other tasks. In this
paper, we propose to improve document-level NMT with the aid of discourse
structure information. Our encoder is based on a hierarchical attention network
(HAN). Specifically, we first parse the input document to obtain its discourse
structure. Then, we introduce a Transformer-based path encoder to embed the
discourse structure information of each word. Finally, we combine the discourse
structure information with the word embedding before it is fed into the
encoder. Experimental results on the English-to-German dataset show that our
model significantly outperforms both Transformer and Transformer+HAN.
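A minimal sketch of the idea, assuming an RST-style parse that gives each word a root-to-leaf path of discourse labels: a small Transformer embeds the path, and the result is added to the word embedding before the NMT encoder. Module names, mean pooling, and fusion-by-addition are illustrative assumptions, not the authors' exact implementation.

```python
# Sketch: embed each word's discourse path (root-to-leaf sequence of discourse
# labels) with a small Transformer, then add it to the word embedding before
# the sentence encoder. Hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

class PathEncoder(nn.Module):
    def __init__(self, num_labels, d_model=512, nhead=8, num_layers=2):
        super().__init__()
        self.label_emb = nn.Embedding(num_labels, d_model, padding_idx=0)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)

    def forward(self, paths):
        # paths: (num_words, max_path_len) int tensor of discourse-label ids,
        # one root-to-leaf path per word, padded with 0 (pad mask omitted).
        h = self.encoder(self.label_emb(paths))   # (num_words, len, d_model)
        return h.mean(dim=1)                      # pool each path to one vector

class DiscourseAwareEmbedding(nn.Module):
    def __init__(self, vocab_size, num_labels, d_model=512):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, d_model)
        self.path_enc = PathEncoder(num_labels, d_model)

    def forward(self, tokens, paths):
        # Combine the word embedding with its discourse-structure embedding
        # (here by addition) before feeding the NMT encoder.
        return self.word_emb(tokens) + self.path_enc(paths)

# Usage: 6 words, discourse paths of length <= 4.
emb = DiscourseAwareEmbedding(vocab_size=1000, num_labels=20)
tokens = torch.randint(0, 1000, (6,))
paths = torch.randint(1, 20, (6, 4))
print(emb(tokens, paths).shape)  # torch.Size([6, 512])
```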
Related papers
- On Search Strategies for Document-Level Neural Machine Translation [51.359400776242786]
Document-level neural machine translation (NMT) models produce a more consistent output across a document.
In this work, we aim to answer the question of how best to utilize a context-aware translation model in decoding.
arXiv Detail & Related papers (2023-06-08T11:30:43Z)
- HanoiT: Enhancing Context-aware Translation via Selective Context [95.93730812799798]
Context-aware neural machine translation aims to use the document-level context to improve translation quality.
Irrelevant or trivial words in the context may introduce noise and distract the model from learning the relationship between the current sentence and its auxiliary context.
We propose a novel end-to-end encoder-decoder model with a layer-wise selection mechanism to sift and refine the long document context.
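A hedged sketch of the selection idea, assuming a per-token gate that scores context tokens and drops low-scoring ones before later layers attend to them; HanoiT's actual layer-wise mechanism may differ.

```python
# Illustrative context-selection gate: score context tokens, zero out those
# below a threshold, and keep gradients flowing with a straight-through trick.
import torch
import torch.nn as nn

class ContextSelector(nn.Module):
    def __init__(self, d_model=512, threshold=0.5):
        super().__init__()
        self.scorer = nn.Linear(d_model, 1)
        self.threshold = threshold

    def forward(self, context):
        # context: (batch, ctx_len, d_model)
        keep = torch.sigmoid(self.scorer(context))      # (batch, ctx_len, 1)
        mask = (keep > self.threshold).float()
        # Straight-through: hard 0/1 mask forward, soft gradient backward.
        mask = mask + keep - keep.detach()
        return context * mask                           # sifted context

selector = ContextSelector()
ctx = torch.randn(2, 10, 512)
print(selector(ctx).shape)  # torch.Size([2, 10, 512])
```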
arXiv Detail & Related papers (2023-01-17T12:07:13Z)
- Structural Biases for Improving Transformers on Translation into Morphologically Rich Languages [120.74406230847904]
The first method, TP-Transformer, augments the traditional Transformer architecture with an additional component that represents structure.
The second method imbues structure at the data level by segmenting the data with morphological tokenization.
We find that each of these two approaches allows the network to achieve better performance, but the improvement depends on the size of the dataset.
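A rough sketch of the first, architectural approach, assuming a role-filler binding in the spirit of tensor-product representations (bind each attention output, the "filler", to a learned "role" vector); the real TP-Transformer layer is more involved.

```python
# Illustrative role-filler binding via an elementwise (Hadamard) product, so
# the representation carries structural role information. Not the exact layer.
import torch
import torch.nn as nn

class TPBinding(nn.Module):
    def __init__(self, d_model=512):
        super().__init__()
        self.role = nn.Linear(d_model, d_model)  # predicts a role vector

    def forward(self, filler):
        # filler: (batch, seq_len, d_model), e.g. multi-head attention output
        return filler * torch.sigmoid(self.role(filler))  # role-filler binding

x = torch.randn(2, 7, 512)
print(TPBinding()(x).shape)  # torch.Size([2, 7, 512])
```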
arXiv Detail & Related papers (2022-08-11T22:42:24Z)
- Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer [0.6702423358056857]
We introduce the TILT neural network architecture which simultaneously learns layout information, visual features, and textual semantics.
We trained our network on real-world documents with different layouts, such as tables, figures, and forms.
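A minimal sketch of the fusion idea, assuming token embeddings, 2D bounding-box coordinates, and pooled visual features are simply summed into one input representation; the names and the additive fusion are assumptions, not TILT's exact design.

```python
# Illustrative fusion of the three signals TILT learns jointly: text, layout,
# and vision, projected to a common width and summed per token.
import torch
import torch.nn as nn

class TextImageLayoutEmbedding(nn.Module):
    def __init__(self, vocab_size=1000, d_model=512, visual_dim=2048):
        super().__init__()
        self.text = nn.Embedding(vocab_size, d_model)
        self.layout = nn.Linear(4, d_model)           # (x0, y0, x1, y1) box
        self.visual = nn.Linear(visual_dim, d_model)  # pooled image features

    def forward(self, tokens, boxes, vis_feats):
        # tokens: (seq,), boxes: (seq, 4), vis_feats: (seq, visual_dim)
        return self.text(tokens) + self.layout(boxes) + self.visual(vis_feats)

emb = TextImageLayoutEmbedding()
out = emb(torch.randint(0, 1000, (5,)),
          torch.rand(5, 4),
          torch.randn(5, 2048))
print(out.shape)  # torch.Size([5, 512])
```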
arXiv Detail & Related papers (2021-02-18T18:51:47Z)
- Transition based Graph Decoder for Neural Machine Translation [41.7284715234202]
We propose a general Transformer-based approach for tree and graph decoding based on generating a sequence of transitions.
We show improved performance over the standard Transformer decoder, as well as over ablated versions of the model.
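A hedged sketch of the decoding target, assuming a toy action set of NODE and EDGE transitions that a deterministic interpreter replays into a graph; the paper's actual transition system is not specified here.

```python
# Replay a flat sequence of (op, arg) transitions into (nodes, edges), the way
# a sequence decoder's outputs would be turned back into a graph.
def replay_transitions(actions):
    nodes, edges = [], []
    for op, arg in actions:
        if op == "NODE":                 # add a node with label `arg`
            nodes.append(arg)
        elif op == "EDGE":               # connect two existing nodes by index
            i, j = arg
            assert i < len(nodes) and j < len(nodes)
            edges.append((i, j))
    return nodes, edges

# Usage: a three-node graph decoded as a flat action sequence.
seq = [("NODE", "eat"), ("NODE", "John"), ("EDGE", (0, 1)),
       ("NODE", "apple"), ("EDGE", (0, 2))]
print(replay_transitions(seq))
# (['eat', 'John', 'apple'], [(0, 1), (0, 2)])
```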
arXiv Detail & Related papers (2021-01-29T15:20:45Z)
- Document Graph for Neural Machine Translation [42.13593962963306]
We show that a document can be represented as a graph that connects relevant contexts regardless of their distances.
Experiments on various NMT benchmarks, including IWSLT English-French, Chinese-English, WMT English-German and Opensubtitle English-Russian, demonstrate that using document graphs can significantly improve the translation quality.
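A hedged sketch of graph construction, assuming lexical overlap of content words as the relevance criterion connecting sentences regardless of distance; the paper may define relevance differently.

```python
# Connect sentence i and j whenever they share enough content words, so
# long-range but relevant contexts become graph neighbors.
def build_document_graph(sentences, min_shared=1):
    # Keep only "content" words (crude heuristic: longer than 3 characters).
    bags = [{w for w in s.lower().split() if len(w) > 3} for s in sentences]
    edges = []
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            if len(bags[i] & bags[j]) >= min_shared:
                edges.append((i, j))
    return edges

doc = ["The treaty was signed in 1990 .",
       "It ended the long war .",
       "The treaty still holds today ."]
print(build_document_graph(doc))  # [(0, 2)] -- sentences 0 and 2 share "treaty"
```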
arXiv Detail & Related papers (2020-12-07T06:48:59Z)
- Context-aware Decoder for Neural Machine Translation using a Target-side Document-Level Language Model [12.543106304662059]
We present a method to turn a sentence-level translation model into a context-aware model by incorporating a document-level language model into the decoder.
Our decoder is built using only sentence-level parallel corpora and monolingual corpora.
From a theoretical viewpoint, the core of this work is a novel representation of contextual information using point-wise mutual information between the context and the current sentence.
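A sketch of the scoring idea in equation form, assuming a log-linear combination of the sentence-level translation score with the PMI term, weighted by an interpolation hyperparameter \(\lambda\); the exact formulation may differ from the paper's.

```latex
% PMI between the document context c and a candidate target sentence y,
% estimated with a document-level LM p(y|c) and a sentence-level LM p(y).
\[
\operatorname{PMI}(c, y) = \log \frac{p(y \mid c)}{p(y)},
\qquad
\hat{y} = \operatorname*{arg\,max}_{y}\;
\log p_{\text{sent}}(y \mid x) + \lambda \,\operatorname{PMI}(c, y)
\]
```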
arXiv Detail & Related papers (2020-10-24T08:06:18Z)
- Diving Deep into Context-Aware Neural Machine Translation [36.17847243492193]
This paper analyzes the performance of document-level NMT models on four diverse domains.
We find that there is no single best approach to document-level NMT, but rather that different architectures come out on top on different tasks.
arXiv Detail & Related papers (2020-10-19T13:23:12Z)
- Document-level Neural Machine Translation with Document Embeddings [82.4684444847092]
This work focuses on exploiting detailed document-level context in terms of multiple forms of document embeddings.
The proposed document-aware NMT is implemented to enhance the Transformer baseline by introducing both global and local document-level clues on the source end.
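A hedged sketch of injecting global and local document clues on the source side, assuming pooled sentence vectors and additive fusion as stand-ins for the paper's actual document embeddings.

```python
# Add a "global" clue (whole-document average) and a "local" clue (window of
# neighboring sentences) to one sentence's pooled representation.
import torch

def document_aware_source(sent_reprs, idx, window=1):
    # sent_reprs: (num_sents, d_model), one pooled vector per sentence
    global_clue = sent_reprs.mean(dim=0)                    # whole document
    lo, hi = max(0, idx - window), min(len(sent_reprs), idx + window + 1)
    local_clue = sent_reprs[lo:hi].mean(dim=0)              # nearby sentences
    return sent_reprs[idx] + global_clue + local_clue

reprs = torch.randn(8, 512)
print(document_aware_source(reprs, idx=3).shape)  # torch.Size([512])
```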
arXiv Detail & Related papers (2020-09-16T19:43:29Z)
- Bi-Decoder Augmented Network for Neural Machine Translation [108.3931242633331]
We propose a novel Bi-Decoder Augmented Network (BiDAN) for the neural machine translation task.
Since each decoder transforms the representations of the input text into its corresponding language, jointly training with two target languages encourages the shared encoder to produce a language-independent semantic space.
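A minimal sketch of the shared-encoder, two-decoder layout, using standard Transformer modules and illustrative dimensions rather than BiDAN's actual components.

```python
# One shared encoder feeds two decoders, one per target language, so joint
# training pushes the encoder toward a language-independent representation.
import torch
import torch.nn as nn

class BiDecoderNet(nn.Module):
    def __init__(self, d_model=512, nhead=8):
        super().__init__()
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
        self.dec_a = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead, batch_first=True), 2)
        self.dec_b = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead, batch_first=True), 2)

    def forward(self, src_emb, tgt_a_emb, tgt_b_emb):
        memory = self.encoder(src_emb)          # shared semantic space
        out_a = self.dec_a(tgt_a_emb, memory)   # decode into language A
        out_b = self.dec_b(tgt_b_emb, memory)   # decode into language B
        return out_a, out_b

model = BiDecoderNet()
src, tgt = torch.randn(2, 9, 512), torch.randn(2, 7, 512)
print([t.shape for t in model(src, tgt, src)])
# [torch.Size([2, 7, 512]), torch.Size([2, 9, 512])]
```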
arXiv Detail & Related papers (2020-01-14T02:05:14Z)