A Survey on Non-Autoregressive Generation for Neural Machine Translation
and Beyond
- URL: http://arxiv.org/abs/2204.09269v2
- Date: Thu, 6 Jul 2023 07:29:23 GMT
- Title: A Survey on Non-Autoregressive Generation for Neural Machine Translation
and Beyond
- Authors: Yisheng Xiao, Lijun Wu, Junliang Guo, Juntao Li, Min Zhang, Tao Qin,
Tie-yan Liu
- Abstract summary: Non-autoregressive (NAR) generation was first proposed in neural machine translation (NMT) to speed up inference.
While NAR generation can significantly accelerate machine translation inference, the speedup comes at the cost of sacrificed translation accuracy compared to autoregressive (AR) generation.
Many new models and algorithms have been designed/proposed to bridge the accuracy gap between NAR generation and AR generation.
- Score: 145.43029264191543
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Non-autoregressive (NAR) generation, which was first proposed in neural
machine translation (NMT) to speed up inference, has attracted much attention
in both machine learning and natural language processing communities. While NAR
generation can significantly accelerate inference speed for machine
translation, the speedup comes at the cost of sacrificed translation accuracy
compared to its counterpart, autoregressive (AR) generation. In recent years,
many new models and algorithms have been designed/proposed to bridge the
accuracy gap between NAR generation and AR generation. In this paper, we
conduct a systematic survey with comparisons and discussions of various
non-autoregressive translation (NAT) models from different aspects.
Specifically, we categorize the efforts of NAT into several groups, including
data manipulation, modeling methods, training criterion, decoding algorithms,
and the benefit from pre-trained models. Furthermore, we briefly review other
applications of NAR models beyond machine translation, such as grammatical
error correction, text summarization, text style transfer, dialogue, semantic
parsing, automatic speech recognition, and so on. In addition, we also discuss
potential directions for future exploration, including relaxing the dependency on
knowledge distillation (KD), reasonable training objectives, pre-training for NAR, and wider
applications, etc. We hope this survey can help researchers capture the latest
progress in NAR generation, inspire the design of advanced NAR models and
algorithms, and enable industry practitioners to choose appropriate solutions
for their applications. The web page of this survey is at
\url{https://github.com/LitterBrother-Xiao/Overview-of-Non-autoregressive-Applications}.
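As a brief formal sketch of the contrast described in the abstract (the notation below is ours and is not quoted from the survey): an AR model factorizes the target sequence left to right, so decoding must emit tokens one at a time, whereas a NAR model assumes the target tokens are conditionally independent given the source, which permits a single parallel decoding pass but discards explicit target-side dependencies and thus costs accuracy.

    % Autoregressive factorization: each token conditions on the generated prefix
    P_{AR}(y \mid x) = \prod_{t=1}^{T} P(y_t \mid y_{<t}, x)

    % Non-autoregressive factorization: conditional independence given the source x
    % (the target length T is typically predicted by a separate module)
    P_{NAR}(y \mid x) = \prod_{t=1}^{T} P(y_t \mid x)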
Related papers
- Transformers meet Neural Algorithmic Reasoners [16.5785372289558]
We propose a novel approach that combines the Transformer's language understanding with the robustness of graph neural network (GNN)-based neural algorithmic reasoners (NARs).
We evaluate our resulting TransNAR model on CLRS-Text, the text-based version of the CLRS-30 benchmark, and demonstrate significant gains over Transformer-only models for algorithmic reasoning.
arXiv Detail & Related papers (2024-06-13T16:42:06Z) - Directed Acyclic Transformer Pre-training for High-quality
Non-autoregressive Text Generation [98.37871690400766]
Non-AutoRegressive (NAR) text generation models have drawn much attention because of their significantly faster decoding speed and good generation quality in machine translation.
Existing NAR models lack proper pre-training, making them still far behind the pre-trained autoregressive models.
We propose Pre-trained Directed Acyclic Transformer to promote prediction consistency in NAR generation.
arXiv Detail & Related papers (2023-04-24T02:30:33Z) - Helping the Weak Makes You Strong: Simple Multi-Task Learning Improves
Non-Autoregressive Translators [35.939982651768666]
The probabilistic framework of NAR models requires a conditional independence assumption on target sequences.
We propose a simple and model-agnostic multi-task learning framework to provide more informative learning signals.
Our approach can consistently improve the accuracy of multiple NAR baselines without adding any decoding overhead.
arXiv Detail & Related papers (2022-11-11T09:10:14Z) - A Comparative Study on Non-Autoregressive Modelings for Speech-to-Text
Generation [59.64193903397301]
Non-autoregressive (NAR) models simultaneously generate multiple outputs in a sequence, which significantly reduces inference time at the cost of an accuracy drop compared to autoregressive baselines (a toy decoding sketch contrasting the two regimes appears after this list).
We conduct a comparative study of various NAR modeling methods for end-to-end automatic speech recognition (ASR).
The results on various tasks provide interesting findings for developing an understanding of NAR ASR, such as the accuracy-speed trade-off and robustness against long-form utterances.
arXiv Detail & Related papers (2021-10-11T13:05:06Z) - Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z) - A Study of Non-autoregressive Model for Sequence Generation [147.89525760170923]
Non-autoregressive (NAR) models generate all the tokens of a sequence in parallel.
We propose knowledge distillation and source-target alignment to bridge the gap between AR and NAR models.
arXiv Detail & Related papers (2020-04-22T09:16:09Z) - Logical Natural Language Generation from Open-Domain Tables [107.04385677577862]
We propose a new task where a model is tasked with generating natural language statements that can be logically entailed by the facts.
To facilitate the study of the proposed logical NLG problem, we use the existing TabFact dataset (Chen et al., 2019), which features a wide range of logical/symbolic inferences.
The new task poses challenges to the existing monotonic generation frameworks due to the mismatch between sequence order and logical order.
arXiv Detail & Related papers (2020-04-22T06:03:10Z) - ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework
for Natural Language Generation [44.21363470798758]
ERNIE-GEN is an enhanced multi-flow sequence to sequence pre-training and fine-tuning framework.
It bridges the discrepancy between training and inference with an infilling generation mechanism and a noise-aware generation method.
It trains the model to predict semantically-complete spans consecutively rather than predicting word by word.
arXiv Detail & Related papers (2020-01-26T02:54:49Z)
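Below is the toy decoding sketch referenced in the comparative-study entry above. It only illustrates the structural difference between the two decoding regimes (many sequential model calls versus a single parallel call); the model is a random-logits placeholder, and the names toy_logits, decode_ar, decode_nar, VOCAB_SIZE, and MAX_LEN are ours for illustration, not taken from any of the papers listed.

    import numpy as np

    VOCAB_SIZE, MAX_LEN = 100, 8
    rng = np.random.default_rng(0)

    def toy_logits(src_tokens, num_positions):
        # Stand-in for a trained translation model: returns a
        # (num_positions, VOCAB_SIZE) score matrix. A real AR or NAR
        # Transformer forward pass would go here.
        return rng.standard_normal((num_positions, VOCAB_SIZE))

    def decode_ar(src_tokens):
        # Autoregressive greedy decoding: one model call per generated token.
        # A real AR model would condition each call on the prefix `out`;
        # this placeholder ignores it, but the sequential loop is the point.
        out = []
        for _ in range(MAX_LEN):
            logits = toy_logits(src_tokens, len(out) + 1)
            out.append(int(logits[-1].argmax()))
        return out

    def decode_nar(src_tokens):
        # Non-autoregressive decoding: pick a target length, then fill every
        # position in one parallel pass, treating target tokens as
        # conditionally independent given the source.
        predicted_len = MAX_LEN  # a real NAT model predicts this from the source
        logits = toy_logits(src_tokens, predicted_len)
        return logits.argmax(axis=-1).tolist()

    src = [3, 14, 15, 9]
    print("AR :", decode_ar(src))   # MAX_LEN sequential model calls
    print("NAR:", decode_nar(src))  # a single model call

The contrast is exactly the speed/accuracy trade-off the survey analyzes: AR decoding requires one forward pass per output token, while NAR decoding requires a single pass but gives up explicit modeling of target-side dependencies.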
This list is automatically generated from the titles and abstracts of the papers on this site.