Non-Autoregressive Document-Level Machine Translation
- URL: http://arxiv.org/abs/2305.12878v3
- Date: Sat, 9 Dec 2023 11:31:32 GMT
- Title: Non-Autoregressive Document-Level Machine Translation
- Authors: Guangsheng Bao, Zhiyang Teng, Hao Zhou, Jianhao Yan, Yue Zhang
- Abstract summary: Non-autoregressive translation (NAT) models achieve performance comparable to auto-regressive translation (AT) models at superior speed.
However, their abilities remain unexplored in document-level machine translation (MT).
We propose a simple but effective design of sentence alignment between source and target.
- Score: 35.48195990457836
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Non-autoregressive translation (NAT) models achieve performance
comparable to auto-regressive translation (AT) models at superior speed in the
context of sentence-level machine translation (MT). However, their abilities
remain unexplored in document-level MT, which hinders their use in real-world scenarios.
In this paper, we conduct a comprehensive examination of typical NAT models in
the context of document-level MT and further propose a simple but effective
design of sentence alignment between source and target. Experiments show that
NAT models achieve high acceleration on documents, and sentence alignment
significantly enhances their performance.
However, current NAT models still have a significant performance gap compared
to their AT counterparts. Further investigation reveals that NAT models suffer
more from the multi-modality and misalignment issues in the context of
document-level MT, and current NAT models struggle with exploiting document
context and handling discourse phenomena. We delve into these challenges and
provide our code at https://github.com/baoguangsheng/nat-on-doc.
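The sentence-alignment design can be pictured as a block-diagonal cross-attention mask: each target sentence is only allowed to attend to the source sentence it is aligned with. The sketch below is a minimal illustration of that idea; the function name and the one-to-one, in-order alignment are our assumptions, not taken from the released code.

```python
# Minimal sketch of the sentence-alignment idea from the abstract: restrict
# cross-attention so each target sentence only attends to its aligned source
# sentence (a block-diagonal mask). Function name and the one-to-one, in-order
# alignment assumption are ours, not the authors' released implementation.
import numpy as np

def sentence_alignment_mask(src_sent_lens, tgt_sent_lens):
    """Return a boolean (tgt_len, src_len) mask whose entry (i, j) is True iff
    target token i and source token j belong to sentences with the same index."""
    assert len(src_sent_lens) == len(tgt_sent_lens), "one-to-one sentence alignment assumed"
    src_ids = np.repeat(np.arange(len(src_sent_lens)), src_sent_lens)  # sentence id of each source token
    tgt_ids = np.repeat(np.arange(len(tgt_sent_lens)), tgt_sent_lens)  # sentence id of each target token
    return tgt_ids[:, None] == src_ids[None, :]

# Toy document with two sentences: the mask is block-diagonal, so tokens of
# target sentence 0 can only attend to tokens of source sentence 0, etc.
print(sentence_alignment_mask([3, 4], [2, 5]).astype(int))
```

In a Transformer, a mask of this kind would typically be applied to the encoder-decoder attention scores, confining each target sentence to its source counterpart while the rest of the model still sees the whole document.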
Related papers
- Revisiting Non-Autoregressive Translation at Scale [76.93869248715664]
We systematically study the impact of scaling on non-autoregressive translation (NAT) behaviors.
We show that scaling can alleviate the commonly-cited weaknesses of NAT models, resulting in better translation performance.
We establish a new benchmark by validating scaled NAT models on a scaled dataset.
arXiv Detail & Related papers (2023-05-25T15:22:47Z)
- Optimizing Non-Autoregressive Transformers with Contrastive Learning [74.46714706658517]
Non-autoregressive Transformers (NATs) reduce the inference latency of Autoregressive Transformers (ATs) by predicting words all at once rather than in sequential order.
In this paper, we propose to ease the difficulty of modality learning via sampling from the model distribution instead of the data distribution.
arXiv Detail & Related papers (2023-05-23T04:20:13Z)
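For the contrastive-learning entry above, a heavily simplified reading of "sampling from the model distribution instead of the data distribution" is sketched below: candidate targets are drawn from the NAT decoder's own per-position output distribution rather than taken only from the gold reference. The function and tensor shapes are our assumptions, not the paper's code.

```python
# Hedged sketch of sampling from the model distribution (our reading of the
# summary above, not the paper's code): draw candidate targets from the NAT
# decoder's per-position output distribution, e.g. to serve as positives and
# negatives for a contrastive objective.
import torch

def sample_from_model(logits, num_samples=4, temperature=1.0):
    """logits: (tgt_len, vocab_size) scores from a NAT decoder.
    Returns (num_samples, tgt_len) token ids, sampled independently per position,
    mirroring the conditional-independence assumption NAT models make."""
    probs = torch.softmax(logits / temperature, dim=-1)
    return torch.distributions.Categorical(probs=probs).sample((num_samples,))

# Toy usage with random logits standing in for a real decoder's output.
print(sample_from_model(torch.randn(6, 100), num_samples=2))
```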
- Candidate Soups: Fusing Candidate Results Improves Translation Quality for Non-Autoregressive Translation [15.332496335303189]
Non-autoregressive translation (NAT) models achieve much faster inference than autoregressive translation (AT) models.
Existing NAT methods focus only on improving the NAT model's own performance and do not fully exploit the candidate translations it can already produce.
We propose a simple but effective method called "Candidate Soups," which can obtain high-quality translations.
arXiv Detail & Related papers (2023-01-27T02:39:42Z)
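A deliberately simplified stand-in for the candidate-fusion idea behind Candidate Soups: given several NAT outputs of equal length, keep the most frequent token at each position. The real procedure is more involved; this only illustrates why combining candidates can beat any single one.

```python
# Simplified stand-in for fusing candidate NAT outputs (illustration only; the
# actual Candidate Soups procedure is more involved): position-wise majority
# vote over equal-length candidates.
from collections import Counter

def fuse_candidates(candidates):
    """candidates: list of equal-length token lists; returns the fused tokens."""
    assert len({len(c) for c in candidates}) == 1, "equal-length candidates assumed"
    return [Counter(tokens).most_common(1)[0][0] for tokens in zip(*candidates)]

print(fuse_candidates([
    ["we", "we", "thank", "you"],      # duplicated token, a typical NAT artifact
    ["we", "do", "thank", "you"],
    ["we", "do", "thank", "thank"],    # another multimodality artifact
]))  # -> ['we', 'do', 'thank', 'you']
```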
- When Does Translation Require Context? A Data-driven, Multilingual Exploration [71.43817945875433]
Proper handling of discourse significantly contributes to the quality of machine translation (MT).
Recent works in context-aware MT attempt to target a small set of discourse phenomena during evaluation.
We develop the Multilingual Discourse-Aware benchmark, a series of taggers that identify discourse phenomena and are used to evaluate model performance on them.
arXiv Detail & Related papers (2021-09-15T17:29:30Z)
- Modeling Coverage for Non-Autoregressive Neural Machine Translation [9.173385214565451]
We propose a novel Coverage-NAT to model the coverage information directly by a token-level coverage iterative refinement mechanism and a sentence-level coverage agreement.
Experimental results on WMT14 En-De and WMT16 En-Ro translation tasks show that our method can alleviate such coverage errors and achieve strong improvements over the baseline system.
arXiv Detail & Related papers (2021-04-24T07:33:23Z)
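Coverage-NAT's token-level refinement and sentence-level agreement terms are its own constructions; the snippet below only illustrates the underlying coverage notion such penalties build on, namely how much attention mass each source token receives.

```python
# Illustration of the coverage notion the entry above builds on (not
# Coverage-NAT's exact formulation): accumulate the cross-attention mass each
# source token receives and penalize tokens that end up under- or over-covered.
import torch

def coverage_penalty(cross_attn):
    """cross_attn: (tgt_len, src_len) attention weights with rows summing to 1.
    The penalty grows when a source token's total received attention drifts from 1."""
    coverage = cross_attn.sum(dim=0)          # total attention mass per source token
    return ((coverage - 1.0) ** 2).mean()

attn = torch.softmax(torch.randn(5, 4), dim=-1)   # toy 5-target x 4-source attention
print(float(coverage_penalty(attn)))
```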
- Multi-Task Learning with Shared Encoder for Non-Autoregressive Machine Translation [32.77372312124259]
Non-autoregressive machine translation (NAT) models have demonstrated significant inference speedups but suffer from inferior translation accuracy.
We propose to adopt multi-task learning to transfer autoregressive translation (AT) knowledge to NAT models through encoder sharing.
Experimental results on WMT14 English-German and WMT16 English-Romanian datasets show that the proposed Multi-Task NAT achieves significant improvements over the baseline NAT models.
arXiv Detail & Related papers (2020-10-24T11:00:58Z)
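As we read the Multi-Task NAT entry above, a single encoder is shared between an AT task and a NAT task and their losses are combined. The sketch below shrinks everything to linear layers so it runs end to end; the module names and the weighting factor `lambda_at` are our placeholders, not values from the paper.

```python
# Minimal runnable sketch of the shared-encoder multi-task idea (our reading of
# the summary above): both losses backpropagate into one shared encoder. Linear
# layers stand in for the real Transformer encoder/decoders, and `lambda_at`
# is a placeholder weighting, not a value from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Linear(16, 32)     # shared encoder
at_head = nn.Linear(32, 100)    # stands in for the autoregressive decoder
nat_head = nn.Linear(32, 100)   # stands in for the non-autoregressive decoder

def multitask_loss(src_feats, tgt_ids, lambda_at=0.5):
    enc = encoder(src_feats)                          # shared states used by both tasks
    nat_loss = F.cross_entropy(nat_head(enc), tgt_ids)
    at_loss = F.cross_entropy(at_head(enc), tgt_ids)
    return nat_loss + lambda_at * at_loss

loss = multitask_loss(torch.randn(8, 16), torch.randint(0, 100, (8,)))
loss.backward()   # gradients from both tasks reach the shared encoder
print(float(loss))
```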
- Task-Level Curriculum Learning for Non-Autoregressive Neural Machine Translation [188.3605563567253]
Non-autoregressive translation (NAT) achieves faster inference speed but at the cost of worse accuracy compared with autoregressive translation (AT).
We introduce semi-autoregressive translation (SAT) as intermediate tasks; SAT covers AT and NAT as its special cases.
We design curriculum schedules that gradually shift k (the number of tokens generated per decoding step) from 1 to N (the target length), with different pacing functions and numbers of tasks trained at the same time.
Experiments on IWSLT14 De-En, IWSLT16 En-De, WMT14 En-De and De-En datasets show that TCL-NAT achieves significant accuracy improvements over previous NAT baselines.
arXiv Detail & Related papers (2020-07-17T06:06:54Z)
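The curriculum in the TCL-NAT entry above shifts the per-step parallelism k from 1 (fully autoregressive) to N (fully non-autoregressive) according to a pacing function. The linear and exponential schedules below are generic examples of such pacing, not the paper's exact schedules or hyper-parameters.

```python
# Generic example of a pacing schedule for the AT -> SAT -> NAT curriculum
# (illustrative, not the paper's exact schedules): k is the number of tokens
# emitted per decoding step, so k=1 is plain AT and k=N is fully parallel NAT.
import math

def paced_k(progress, n, pacing="linear"):
    """progress: fraction of training completed, in [0, 1]; n: target length N."""
    progress = min(max(progress, 0.0), 1.0)
    if pacing == "linear":
        frac = progress
    elif pacing == "exp":
        frac = (math.exp(progress) - 1.0) / (math.e - 1.0)  # slow start, fast finish
    else:
        raise ValueError(f"unknown pacing function: {pacing}")
    return max(1, round(1 + frac * (n - 1)))

for p in (0.0, 0.25, 0.5, 1.0):
    print(p, paced_k(p, n=20, pacing="exp"))
```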
- LAVA NAT: A Non-Autoregressive Translation Model with Look-Around Decoding and Vocabulary Attention [54.18121922040521]
Non-autoregressive translation (NAT) models generate multiple tokens in one forward pass.
These NAT models often suffer from the multimodality problem, generating duplicated tokens or missing tokens.
We propose two novel methods to address this issue, the Look-Around (LA) strategy and the Vocabulary Attention (VA) mechanism.
arXiv Detail & Related papers (2020-02-08T04:11:03Z)