Document Flattening: Beyond Concatenating Context for Document-Level
Neural Machine Translation
- URL: http://arxiv.org/abs/2302.08079v1
- Date: Thu, 16 Feb 2023 04:38:34 GMT
- Title: Document Flattening: Beyond Concatenating Context for Document-Level
Neural Machine Translation
- Authors: Minghao Wu, George Foster, Lizhen Qu, Gholamreza Haffari
- Abstract summary: The Document Flattening (DocFlat) technique integrates Flat-Batch Attention (FBA) and Neural Context Gate (NCG) into the Transformer model.
We conduct comprehensive experiments and analyses on three benchmark datasets for English-German translation.
- Score: 45.56189820979461
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing work in document-level neural machine translation commonly
concatenates several consecutive sentences as a pseudo-document, and then
learns inter-sentential dependencies. This strategy limits the model's ability
to leverage information from distant context. We overcome this limitation with
a novel Document Flattening (DocFlat) technique that integrates Flat-Batch
Attention (FBA) and Neural Context Gate (NCG) into the Transformer model to utilize
information beyond the pseudo-document boundaries. FBA allows the model to
attend to all positions in the batch and explicitly learns the relationships between
positions, while NCG identifies useful information from the distant
context. We conduct comprehensive experiments and analyses on three benchmark
datasets for English-German translation, and validate the effectiveness of two
variants of DocFlat. Empirical results show that our approach outperforms
strong baselines with statistical significance on BLEU, COMET and accuracy on
the contrastive test set. The analyses highlight that DocFlat is highly
effective in capturing long-range information.
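The abstract does not give the exact formulation of FBA and NCG, but the PyTorch sketch below illustrates one plausible reading of the description: an attention layer that flattens the batch so every position can attend to every other position in the batch, and a sigmoid gate that controls how much of that batch-wide (distant) context is mixed into the local representation. All class names, dimensions, and the gating formula are illustrative assumptions, not the paper's actual implementation.

import torch
import torch.nn as nn

class FlatBatchAttention(nn.Module):
    """Minimal sketch: attend over every position in the batch by flattening
    the batch and length dimensions into one long sequence."""
    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) -> flatten to (1, batch * seq_len, d_model)
        b, t, d = x.shape
        flat = x.reshape(1, b * t, d)
        out, _ = self.attn(flat, flat, flat)  # every position attends to all others
        return out.reshape(b, t, d)

class NeuralContextGate(nn.Module):
    """Minimal sketch: a sigmoid gate that decides how much distant-context
    information to mix into the local (pseudo-document) representation."""
    def __init__(self, d_model: int):
        super().__init__()
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, local: torch.Tensor, distant: torch.Tensor) -> torch.Tensor:
        g = torch.sigmoid(self.gate(torch.cat([local, distant], dim=-1)))
        return g * distant + (1.0 - g) * local

if __name__ == "__main__":
    x = torch.randn(4, 16, 512)   # 4 sentences of 16 tokens each (hypothetical sizes)
    fba = FlatBatchAttention(512)
    ncg = NeuralContextGate(512)
    distant = fba(x)              # batch-wide (distant) context
    fused = ncg(x, distant)       # gated fusion with the local states
    print(fused.shape)            # torch.Size([4, 16, 512])

In this sketch the gate is applied element-wise per token, so the model can fall back to the purely local representation when the batch-wide context is unhelpful; how DocFlat actually parameterizes the gate may differ.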
Related papers
- Importance-Aware Data Augmentation for Document-Level Neural Machine
Translation [51.74178767827934]
Document-level neural machine translation (DocNMT) aims to generate translations that are both coherent and cohesive.
Due to its longer input length and limited availability of training data, DocNMT often faces the challenge of data sparsity.
We propose a novel Importance-Aware Data Augmentation (IADA) algorithm for DocNMT that augments the training data based on token importance information estimated by the norm of hidden states and training gradients.
arXiv Detail & Related papers (2024-01-27T09:27:47Z) - HanoiT: Enhancing Context-aware Translation via Selective Context [95.93730812799798]
Context-aware neural machine translation aims to use the document-level context to improve translation quality.
Irrelevant or trivial words may introduce noise and distract the model from learning the relationship between the current sentence and the auxiliary context.
We propose a novel end-to-end encoder-decoder model with a layer-wise selection mechanism to sift and refine the long document context.
arXiv Detail & Related papers (2023-01-17T12:07:13Z) - SMDT: Selective Memory-Augmented Neural Document Translation [53.4627288890316]
We propose a Selective Memory-augmented Neural Document Translation model to deal with documents that have a large hypothesis space of context.
We retrieve similar bilingual sentence pairs from the training corpus to augment global context.
We extend the two-stream attention model with a selective mechanism to capture local context and diverse global contexts.
arXiv Detail & Related papers (2022-01-05T14:23:30Z) - Exploiting Global Contextual Information for Document-level Named Entity
Recognition [46.99922251839363]
We propose a model called Global Context enhanced Document-level NER (GCDoc).
At the word level, a document graph is constructed to model a wider range of dependencies between words.
At the sentence level, we employ a cross-sentence module to appropriately model wider context beyond a single sentence.
Our model reaches an F1 score of 92.22 (93.40 with BERT) on the CoNLL 2003 dataset and 88.32 (90.49 with BERT) on the OntoNotes 5.0 dataset.
arXiv Detail & Related papers (2021-06-02T01:52:07Z) - Divide and Rule: Training Context-Aware Multi-Encoder Translation Models
with Little Resources [20.057692375546356]
Multi-encoder models aim to improve translation quality by encoding document-level contextual information alongside the current sentence.
We show that training these parameters takes a large amount of data, since the contextual training signal is sparse.
We propose an efficient alternative, based on splitting sentence pairs, that enriches the training signal of a set of parallel sentences.
arXiv Detail & Related papers (2021-03-31T15:15:32Z) - Document-Level Relation Extraction with Adaptive Thresholding and
Localized Context Pooling [34.93480801598084]
One document commonly contains multiple entity pairs, and one entity pair may occur multiple times in the document, associated with multiple possible relations.
We propose two novel techniques, adaptive thresholding and localized context pooling, to solve the multi-label and multi-entity problems.
arXiv Detail & Related papers (2020-10-21T20:41:23Z) - Document-level Neural Machine Translation with Document Embeddings [82.4684444847092]
This work focuses on exploiting detailed document-level context in terms of multiple forms of document embeddings.
The proposed document-aware NMT is implemented to enhance the Transformer baseline by introducing both global and local document-level clues on the source end.
arXiv Detail & Related papers (2020-09-16T19:43:29Z) - Towards Making the Most of Context in Neural Machine Translation [112.9845226123306]
We argue that previous research did not make clear use of the global context.
We propose a new document-level NMT framework that deliberately models the local context of each sentence.
arXiv Detail & Related papers (2020-02-19T03:30:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.