Document Flattening: Beyond Concatenating Context for Document-Level
Neural Machine Translation
- URL: http://arxiv.org/abs/2302.08079v1
- Date: Thu, 16 Feb 2023 04:38:34 GMT
- Title: Document Flattening: Beyond Concatenating Context for Document-Level
Neural Machine Translation
- Authors: Minghao Wu, George Foster, Lizhen Qu, Gholamreza Haffari
- Abstract summary: The Document Flattening (DocFlat) technique integrates Flat-Batch Attention (FBA) and Neural Context Gate (NCG) into the Transformer model.
We conduct comprehensive experiments and analyses on three benchmark datasets for English-German translation.
- Score: 45.56189820979461
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing work in document-level neural machine translation commonly
concatenates several consecutive sentences as a pseudo-document, and then
learns inter-sentential dependencies. This strategy limits the model's ability
to leverage information from distant context. We overcome this limitation with
a novel Document Flattening (DocFlat) technique that integrates Flat-Batch
Attention (FBA) and Neural Context Gate (NCG) into the Transformer model to utilize
information beyond the pseudo-document boundaries. FBA allows the model to
attend to all positions in the batch and explicitly learns the relationships between
positions, while NCG identifies useful information from the distant
context. We conduct comprehensive experiments and analyses on three benchmark
datasets for English-German translation, and validate the effectiveness of two
variants of DocFlat. Empirical results show that our approach outperforms
strong baselines with statistical significance on BLEU, COMET and accuracy on
the contrastive test set. The analyses highlight that DocFlat is highly
effective in capturing long-range information.
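The abstract does not give the exact formulation of FBA and NCG, but the PyTorch sketch below illustrates one plausible reading of the description: an attention layer that flattens the batch so every position can attend to every other position in the batch, and a sigmoid gate that controls how much of that batch-wide (distant) context is mixed into the local representation. All class names, dimensions, and the gating formula are illustrative assumptions, not the paper's actual implementation.

import torch
import torch.nn as nn

class FlatBatchAttention(nn.Module):
    """Minimal sketch: attend over every position in the batch by flattening
    the batch and length dimensions into one long sequence."""
    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) -> flatten to (1, batch * seq_len, d_model)
        b, t, d = x.shape
        flat = x.reshape(1, b * t, d)
        out, _ = self.attn(flat, flat, flat)  # every position attends to all others
        return out.reshape(b, t, d)

class NeuralContextGate(nn.Module):
    """Minimal sketch: a sigmoid gate that decides how much distant-context
    information to mix into the local (pseudo-document) representation."""
    def __init__(self, d_model: int):
        super().__init__()
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, local: torch.Tensor, distant: torch.Tensor) -> torch.Tensor:
        g = torch.sigmoid(self.gate(torch.cat([local, distant], dim=-1)))
        return g * distant + (1.0 - g) * local

if __name__ == "__main__":
    x = torch.randn(4, 16, 512)   # 4 sentences of 16 tokens each (hypothetical sizes)
    fba = FlatBatchAttention(512)
    ncg = NeuralContextGate(512)
    distant = fba(x)              # batch-wide (distant) context
    fused = ncg(x, distant)       # gated fusion with the local states
    print(fused.shape)            # torch.Size([4, 16, 512])

In this sketch the gate is applied element-wise per token, so the model can fall back to the purely local representation when the batch-wide context is unhelpful; how DocFlat actually parameterizes the gate may differ.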
Related papers
- Importance-Aware Data Augmentation for Document-Level Neural Machine
Translation [51.74178767827934]
Document-level neural machine translation (DocNMT) aims to generate translations that are both coherent and cohesive.
Due to its longer input length and limited availability of training data, DocNMT often faces the challenge of data sparsity.
We propose a novel Importance-Aware Data Augmentation (IADA) algorithm for DocNMT that augments the training data based on token importance information estimated by the norm of hidden states and training gradients.
arXiv Detail & Related papers (2024-01-27T09:27:47Z) - HanoiT: Enhancing Context-aware Translation via Selective Context [95.93730812799798]
Context-aware neural machine translation aims to use the document-level context to improve translation quality.
Irrelevant or trivial words may introduce noise and distract the model from learning the relationship between the current sentence and the auxiliary context.
We propose a novel end-to-end encoder-decoder model with a layer-wise selection mechanism to sift and refine the long document context.
arXiv Detail & Related papers (2023-01-17T12:07:13Z) - SMDT: Selective Memory-Augmented Neural Document Translation [53.4627288890316]
We propose a Selective Memory-augmented Neural Document Translation model to deal with documents that have a large hypothesis space of context.
We retrieve similar bilingual sentence pairs from the training corpus to augment global context.
We extend the two-stream attention model with a selective mechanism to capture local context and diverse global contexts.
arXiv Detail & Related papers (2022-01-05T14:23:30Z) - Exploiting Global Contextual Information for Document-level Named Entity
Recognition [46.99922251839363]
We propose a model called Global Context enhanced Document-level NER (GCDoc).
At the word level, a document graph is constructed to model a wider range of dependencies between words.
At the sentence level, we employ a cross-sentence module to appropriately model wider context beyond a single sentence.
Our model reaches an F1 score of 92.22 (93.40 with BERT) on the CoNLL 2003 dataset and 88.32 (90.49 with BERT) on the OntoNotes 5.0 dataset.
arXiv Detail & Related papers (2021-06-02T01:52:07Z) - Divide and Rule: Training Context-Aware Multi-Encoder Translation Models
with Little Resources [20.057692375546356]
Multi-encoder models aim to improve translation quality by encoding document-level contextual information alongside the current sentence.
We show that training these parameters takes a large amount of data, since the contextual training signal is sparse.
We propose an efficient alternative, based on splitting sentence pairs, that enriches the training signal of a set of parallel sentences.
arXiv Detail & Related papers (2021-03-31T15:15:32Z) - Document-Level Relation Extraction with Adaptive Thresholding and
Localized Context Pooling [34.93480801598084]
One document commonly contains multiple entity pairs, and one entity pair may occur multiple times in the document, associated with multiple possible relations.
We propose two novel techniques, adaptive thresholding and localized context pooling, to solve the multi-label and multi-entity problems.
arXiv Detail & Related papers (2020-10-21T20:41:23Z) - Document-level Neural Machine Translation with Document Embeddings [82.4684444847092]
This work focuses on exploiting detailed document-level context in terms of multiple forms of document embeddings.
The proposed document-aware NMT is implemented to enhance the Transformer baseline by introducing both global and local document-level clues on the source end.
arXiv Detail & Related papers (2020-09-16T19:43:29Z) - Towards Making the Most of Context in Neural Machine Translation [112.9845226123306]
We argue that previous research did not make clear use of the global context.
We propose a new document-level NMT framework that deliberately models the local context of each sentence.
arXiv Detail & Related papers (2020-02-19T03:30:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.