Importance-Aware Data Augmentation for Document-Level Neural Machine
Translation
- URL: http://arxiv.org/abs/2401.15360v1
- Date: Sat, 27 Jan 2024 09:27:47 GMT
- Title: Importance-Aware Data Augmentation for Document-Level Neural Machine
Translation
- Authors: Minghao Wu, Yufei Wang, George Foster, Lizhen Qu, Gholamreza Haffari
- Abstract summary: Document-level neural machine translation (DocNMT) aims to generate translations that are both coherent and cohesive.
Due to its longer input length and limited availability of training data, DocNMT often faces the challenge of data sparsity.
We propose a novel Importance-Aware Data Augmentation (IADA) algorithm for DocNMT that augments the training data based on token importance information estimated by the norm of hidden states and training gradients.
- Score: 51.74178767827934
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Document-level neural machine translation (DocNMT) aims to generate
translations that are both coherent and cohesive, in contrast to its
sentence-level counterpart. However, due to its longer input length and limited
availability of training data, DocNMT often faces the challenge of data
sparsity. To overcome this issue, we propose a novel Importance-Aware Data
Augmentation (IADA) algorithm for DocNMT that augments the training data based
on token importance information estimated by the norm of hidden states and
training gradients. We conduct comprehensive experiments on three widely-used
DocNMT benchmarks. Our empirical results show that our proposed IADA
outperforms strong DocNMT baselines as well as several data augmentation
approaches, with statistical significance on both sentence-level and
document-level BLEU.
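The abstract does not give implementation details, but the core mechanism, scoring each token by the norm of its hidden state and of the training gradient and then perturbing the least important tokens to build augmented documents, can be pictured with a minimal sketch. Everything below (function names, how the two signals are combined, the masking ratio) is an illustrative assumption, not the authors' code.

```python
# Minimal sketch (not the authors' implementation): score tokens by the norm of
# their hidden states and of the loss gradient w.r.t. their embeddings, then
# mask the least important tokens to create an augmented training example.
import torch

def token_importance(embeddings, hidden_states, loss):
    """embeddings: (seq_len, dim) leaf tensor with requires_grad=True;
    hidden_states: (seq_len, dim) encoder outputs; loss: scalar training loss."""
    grads, = torch.autograd.grad(loss, embeddings, retain_graph=True)
    grad_norm = grads.norm(dim=-1)            # gradient-based importance signal
    hidden_norm = hidden_states.norm(dim=-1)  # hidden-state-norm signal
    return grad_norm * hidden_norm            # how to combine them is an assumption

def augment(tokens, importance, mask_id, ratio=0.15):
    """Build an augmented copy by masking the lowest-importance tokens."""
    k = max(1, int(ratio * len(tokens)))
    low = importance.topk(k, largest=False).indices
    out = tokens.clone()
    out[low] = mask_id
    return out

# Toy usage with a stand-in "encoder": a single linear layer.
torch.manual_seed(0)
seq_len, dim, mask_id = 6, 8, 0
emb = torch.randn(seq_len, dim, requires_grad=True)
hidden = torch.nn.Linear(dim, dim)(emb)
loss = hidden.pow(2).mean()                   # stand-in for the NMT training loss
importance = token_importance(emb, hidden, loss)
print(augment(torch.arange(1, seq_len + 1), importance, mask_id))
```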
Related papers
- Towards Inducing Document-Level Abilities in Standard Multilingual Neural Machine Translation Models [4.625277907331917]
This work addresses the challenge of transitioning pre-trained NMT models from absolute sinusoidal positional encodings (PEs) to relative PEs.
We demonstrate that parameter-efficient fine-tuning, using only a small amount of high-quality data, can successfully facilitate this transition.
We find that a small amount of long-context data in a few languages is sufficient for cross-lingual length generalization.
arXiv Detail & Related papers (2024-08-21T07:23:34Z)
- Unified Model Learning for Various Neural Machine Translation [63.320005222549646]
Existing neural machine translation (NMT) studies mainly focus on developing dataset-specific models.
We propose a "versatile" model, i.e., Unified Model Learning for NMT (UMLNMT), which works with data from different tasks.
UMLNMT yields substantial improvements over dataset-specific models with significantly reduced model deployment costs.
arXiv Detail & Related papers (2023-05-04T12:21:52Z)
- Semi-supervised Neural Machine Translation with Consistency Regularization for Low-Resource Languages [3.475371300689165]
This paper presents a simple yet effective method to tackle the data-scarcity problem for low-resource languages by augmenting high-quality sentence pairs and training NMT models in a semi-supervised manner.
Specifically, our approach combines the cross-entropy loss for supervised learning with a KL-divergence term for unsupervised learning over pseudo and augmented target sentences.
Experimental results show that our approach significantly improves NMT baselines, especially on low-resource datasets, with gains of 0.46–2.03 BLEU.
arXiv Detail & Related papers (2023-04-02T15:24:08Z)
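As a rough illustration of the loss combination described in that summary (cross-entropy on labelled pairs plus a KL consistency term over pseudo and augmented targets), here is a minimal sketch; the interpolation weight alpha and the tensor shapes are assumptions, not values from the paper.

```python
# Sketch of a combined semi-supervised loss: cross-entropy on labelled pairs plus
# a KL consistency term between predictions for two views of the target.
import torch
import torch.nn.functional as F

def semi_supervised_loss(logits_sup, target, logits_pseudo, logits_aug, alpha=1.0):
    ce = F.cross_entropy(logits_sup, target)            # supervised term
    log_p_aug = F.log_softmax(logits_aug, dim=-1)       # prediction on augmented view
    p_pseudo = F.softmax(logits_pseudo, dim=-1)         # prediction on pseudo-target view
    kl = F.kl_div(log_p_aug, p_pseudo, reduction="batchmean")  # consistency term
    return ce + alpha * kl                               # alpha is an assumed weighting

# Toy usage: 4 target positions over a 10-token vocabulary.
torch.manual_seed(0)
logits_sup, logits_pseudo, logits_aug = torch.randn(3, 4, 10).unbind(0)
target = torch.randint(0, 10, (4,))
print(semi_supervised_loss(logits_sup, target, logits_pseudo, logits_aug))
```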
- Document Flattening: Beyond Concatenating Context for Document-Level Neural Machine Translation [45.56189820979461]
The Document Flattening (DocFlat) technique integrates Flat-Batch Attention (FB) and a Neural Context Gate (NCG) into the Transformer model.
We conduct comprehensive experiments and analyses on three benchmark datasets for English-German translation.
arXiv Detail & Related papers (2023-02-16T04:38:34Z)
- Learning to Generalize to More: Continuous Semantic Augmentation for Neural Machine Translation [50.54059385277964]
We present a novel data augmentation paradigm termed Continuous Semantic Augmentation (CsaNMT).
CsaNMT augments each training instance with an adjacency region that covers adequate variants of literal expression under the same meaning.
arXiv Detail & Related papers (2022-04-14T08:16:28Z)
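The summary above only names the idea of an adjacency region; one way to picture it is sampling small perturbations around a sentence-level semantic vector. The Gaussian noise and the radius below are assumptions for illustration, not the paper's actual formulation.

```python
# Illustrative sketch: augment a training instance by sampling representations
# from a small neighbourhood ("adjacency region") around its semantic vector.
import torch

def sample_adjacent(semantic_vec, num_samples=4, radius=0.1):
    """Sample points in a small neighbourhood around a semantic vector."""
    noise = torch.randn(num_samples, semantic_vec.size(-1)) * radius
    return semantic_vec.unsqueeze(0) + noise  # augmented "variants" of the instance

torch.manual_seed(0)
sent_repr = torch.randn(512)                  # assumed sentence-level semantic vector
print(sample_adjacent(sent_repr).shape)       # torch.Size([4, 512])
```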
- Alternated Training with Synthetic and Authentic Data for Neural Machine Translation [49.35605028467887]
We propose alternated training with synthetic and authentic data for neural machine translation (NMT).
Compared with previous work, we introduce authentic data as guidance to prevent the training of NMT models from being disturbed by noisy synthetic data.
Experiments on Chinese-English and German-English translation tasks show that our approach improves the performance over several strong baselines.
arXiv Detail & Related papers (2021-06-16T07:13:16Z)
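The alternation idea from that summary can be sketched as a simple data-feeding schedule; the 1:1 switching frequency below is an assumption, not necessarily the schedule used in the paper.

```python
# Sketch of alternated training: interleave updates on synthetic (e.g. back-translated)
# and authentic parallel data so that authentic batches keep correcting drift
# introduced by noisy synthetic batches.
from itertools import cycle

def alternated_batches(synthetic_batches, authentic_batches, steps):
    """Yield training batches, alternating between synthetic and authentic data."""
    syn, auth = cycle(synthetic_batches), cycle(authentic_batches)
    for step in range(steps):
        yield next(auth) if step % 2 else next(syn)

# Toy usage with placeholder batch labels.
syn = [f"syn-{i}" for i in range(3)]
auth = [f"auth-{i}" for i in range(2)]
print(list(alternated_batches(syn, auth, steps=6)))
# ['syn-0', 'auth-0', 'syn-1', 'auth-1', 'syn-2', 'auth-0']
```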
- Document-level Neural Machine Translation with Document Embeddings [82.4684444847092]
This work focuses on exploiting detailed document-level context in terms of multiple forms of document embeddings.
The proposed document-aware NMT is implemented to enhance the Transformer baseline by introducing both global and local document-level clues on the source end.
arXiv Detail & Related papers (2020-09-16T19:43:29Z)
- Understanding Learning Dynamics for Neural Machine Translation [53.23463279153577]
We propose to understand the learning dynamics of NMT by using Loss Change Allocation (LCA) (Lan et al., 2019).
As LCA requires calculating the gradient on an entire dataset for each update, we instead present an approximation to make it practical in the NMT scenario.
Our simulated experiments show that this approximate calculation is efficient and empirically delivers consistent results.
arXiv Detail & Related papers (2020-04-05T13:32:58Z)
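For context, Loss Change Allocation decomposes the loss change of each update into per-parameter contributions via a first-order term (gradient times parameter update). The approximation used for NMT is not specified in the summary above, so the sketch below shows only the basic first-order allocation on a toy model.

```python
# Sketch of first-order Loss Change Allocation (LCA): for an SGD step, the loss
# change is allocated per parameter as grad[i] * delta[i] = -lr * grad[i]**2.
import torch

torch.manual_seed(0)
w = torch.randn(5, requires_grad=True)    # toy parameter vector
x, y = torch.randn(5), torch.tensor(1.0)

def loss_fn(w):
    return (w @ x - y) ** 2               # toy squared-error loss

loss = loss_fn(w)
loss.backward()
lr = 0.01
delta = -lr * w.grad                      # SGD parameter update
allocation = w.grad * delta               # per-parameter first-order loss change
with torch.no_grad():
    w += delta
print("allocated change:", allocation.sum().item())
print("actual change:   ", (loss_fn(w) - loss).item())
```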
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.