A Survey on Non-Autoregressive Generation for Neural Machine Translation
and Beyond
- URL: http://arxiv.org/abs/2204.09269v2
- Date: Thu, 6 Jul 2023 07:29:23 GMT
- Title: A Survey on Non-Autoregressive Generation for Neural Machine Translation
and Beyond
- Authors: Yisheng Xiao, Lijun Wu, Junliang Guo, Juntao Li, Min Zhang, Tao Qin,
Tie-yan Liu
- Abstract summary: Non-autoregressive (NAR) generation was first proposed in neural machine translation (NMT) to speed up inference.
While NAR generation can significantly accelerate machine translation inference, the speedup comes at the cost of sacrificed translation accuracy compared to autoregressive (AR) generation.
Many new models and algorithms have been designed/proposed to bridge the accuracy gap between NAR generation and AR generation.
- Score: 145.43029264191543
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Non-autoregressive (NAR) generation, which was first proposed in neural
machine translation (NMT) to speed up inference, has attracted much attention
in both machine learning and natural language processing communities. While NAR
generation can significantly accelerate inference speed for machine
translation, the speedup comes at the cost of sacrificed translation accuracy
compared to its counterpart, autoregressive (AR) generation. In recent years,
many new models and algorithms have been designed/proposed to bridge the
accuracy gap between NAR generation and AR generation. In this paper, we
conduct a systematic survey with comparisons and discussions of various
non-autoregressive translation (NAT) models from different aspects.
Specifically, we categorize the efforts of NAT into several groups, including
data manipulation, modeling methods, training criterion, decoding algorithms,
and the benefit from pre-trained models. Furthermore, we briefly review other
applications of NAR models beyond machine translation, such as grammatical
error correction, text summarization, text style transfer, dialogue, semantic
parsing, automatic speech recognition, and so on. In addition, we also discuss
potential directions for future exploration, including relaxing the dependency on
knowledge distillation (KD), reasonable training objectives, pre-training for NAR, and wider
applications, etc. We hope this survey can help researchers capture the latest
progress in NAR generation, inspire the design of advanced NAR models and
algorithms, and enable industry practitioners to choose appropriate solutions
for their applications. The web page of this survey is at
\url{https://github.com/LitterBrother-Xiao/Overview-of-Non-autoregressive-Applications}.
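As a brief formal sketch of the contrast described in the abstract (the notation below is ours and is not quoted from the survey): an AR model factorizes the target sequence left to right, so decoding must emit tokens one at a time, whereas a NAR model assumes the target tokens are conditionally independent given the source, which permits a single parallel decoding pass but discards explicit target-side dependencies and thus costs accuracy.

    % Autoregressive factorization: each token conditions on the generated prefix
    P_{AR}(y \mid x) = \prod_{t=1}^{T} P(y_t \mid y_{<t}, x)

    % Non-autoregressive factorization: conditional independence given the source x
    % (the target length T is typically predicted by a separate module)
    P_{NAR}(y \mid x) = \prod_{t=1}^{T} P(y_t \mid x)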
Related papers
- Transformers meet Neural Algorithmic Reasoners [16.5785372289558]
We propose a novel approach that combines the Transformer's language understanding with the robustness of graph neural network (GNN)-based neural algorithmic reasoners (NARs).
We evaluate our resulting TransNAR model on CLRS-Text, the text-based version of the CLRS-30 benchmark, and demonstrate significant gains over Transformer-only models for algorithmic reasoning.
arXiv Detail & Related papers (2024-06-13T16:42:06Z) - Directed Acyclic Transformer Pre-training for High-quality
Non-autoregressive Text Generation [98.37871690400766]
Non-AutoRegressive (NAR) text generation models have drawn much attention because of their significantly faster decoding speed and good generation quality in machine translation.
Existing NAR models lack proper pre-training, making them still far behind the pre-trained autoregressive models.
We propose Pre-trained Directed Acyclic Transformer to promote prediction consistency in NAR generation.
arXiv Detail & Related papers (2023-04-24T02:30:33Z) - Helping the Weak Makes You Strong: Simple Multi-Task Learning Improves
Non-Autoregressive Translators [35.939982651768666]
The probabilistic framework of NAR models requires a conditional independence assumption on target sequences.
We propose a simple and model-agnostic multi-task learning framework to provide more informative learning signals.
Our approach can consistently improve the accuracy of multiple NAR baselines without adding any decoding overhead.
arXiv Detail & Related papers (2022-11-11T09:10:14Z) - A Comparative Study on Non-Autoregressive Modelings for Speech-to-Text
Generation [59.64193903397301]
Non-autoregressive (NAR) models simultaneously generate multiple outputs in a sequence, which significantly reduces inference time at the cost of an accuracy drop compared to autoregressive baselines (a toy decoding sketch contrasting the two regimes appears after this list).
We conduct a comparative study of various NAR modeling methods for end-to-end automatic speech recognition (ASR).
The results on various tasks provide interesting findings for developing an understanding of NAR ASR, such as the accuracy-speed trade-off and robustness against long-form utterances.
arXiv Detail & Related papers (2021-10-11T13:05:06Z) - Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z) - A Study of Non-autoregressive Model for Sequence Generation [147.89525760170923]
Non-autoregressive (NAR) models generate all the tokens of a sequence in parallel.
We propose knowledge distillation and source-target alignment to bridge the gap between AR and NAR models.
arXiv Detail & Related papers (2020-04-22T09:16:09Z) - Logical Natural Language Generation from Open-Domain Tables [107.04385677577862]
We propose a new task where a model is tasked with generating natural language statements that can be logically entailed by the facts.
To facilitate the study of the proposed logical NLG problem, we use the existing TabFact dataset (Chen et al., 2019), which features a wide range of logical/symbolic inferences.
The new task poses challenges to the existing monotonic generation frameworks due to the mismatch between sequence order and logical order.
arXiv Detail & Related papers (2020-04-22T06:03:10Z) - ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework
for Natural Language Generation [44.21363470798758]
ERNIE-GEN is an enhanced multi-flow sequence to sequence pre-training and fine-tuning framework.
It bridges the discrepancy between training and inference with an infilling generation mechanism and a noise-aware generation method.
It trains the model to predict semantically-complete spans consecutively rather than predicting word by word.
arXiv Detail & Related papers (2020-01-26T02:54:49Z)
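Below is the toy decoding sketch referenced in the comparative-study entry above. It only illustrates the structural difference between the two decoding regimes (many sequential model calls versus a single parallel call); the model is a random-logits placeholder, and the names toy_logits, decode_ar, decode_nar, VOCAB_SIZE, and MAX_LEN are ours for illustration, not taken from any of the papers listed.

    import numpy as np

    VOCAB_SIZE, MAX_LEN = 100, 8
    rng = np.random.default_rng(0)

    def toy_logits(src_tokens, num_positions):
        # Stand-in for a trained translation model: returns a
        # (num_positions, VOCAB_SIZE) score matrix. A real AR or NAR
        # Transformer forward pass would go here.
        return rng.standard_normal((num_positions, VOCAB_SIZE))

    def decode_ar(src_tokens):
        # Autoregressive greedy decoding: one model call per generated token.
        # A real AR model would condition each call on the prefix `out`;
        # this placeholder ignores it, but the sequential loop is the point.
        out = []
        for _ in range(MAX_LEN):
            logits = toy_logits(src_tokens, len(out) + 1)
            out.append(int(logits[-1].argmax()))
        return out

    def decode_nar(src_tokens):
        # Non-autoregressive decoding: pick a target length, then fill every
        # position in one parallel pass, treating target tokens as
        # conditionally independent given the source.
        predicted_len = MAX_LEN  # a real NAT model predicts this from the source
        logits = toy_logits(src_tokens, predicted_len)
        return logits.argmax(axis=-1).tolist()

    src = [3, 14, 15, 9]
    print("AR :", decode_ar(src))   # MAX_LEN sequential model calls
    print("NAR:", decode_nar(src))  # a single model call

The contrast is exactly the speed/accuracy trade-off the survey analyzes: AR decoding requires one forward pass per output token, while NAR decoding requires a single pass but gives up explicit modeling of target-side dependencies.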
This list is automatically generated from the titles and abstracts of the papers on this site.