ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework
for Natural Language Generation
- URL: http://arxiv.org/abs/2001.11314v3
- Date: Mon, 8 Jun 2020 08:09:33 GMT
- Title: ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework
for Natural Language Generation
- Authors: Dongling Xiao, Han Zhang, Yukun Li, Yu Sun, Hao Tian, Hua Wu and
Haifeng Wang
- Abstract summary: ERNIE-GEN is an enhanced multi-flow sequence to sequence pre-training and fine-tuning framework.
It bridges the discrepancy between training and inference with an infilling generation mechanism and a noise-aware generation method.
It trains the model to predict semantically-complete spans consecutively rather than predicting word by word.
- Score: 44.21363470798758
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current pre-training works in natural language generation pay little
attention to the problem of exposure bias on downstream tasks. To address this
issue, we propose an enhanced multi-flow sequence to sequence pre-training and
fine-tuning framework named ERNIE-GEN, which bridges the discrepancy between
training and inference with an infilling generation mechanism and a noise-aware
generation method. To make generation closer to human writing patterns, this
framework introduces a span-by-span generation flow that trains the model to
predict semantically-complete spans consecutively rather than predicting word
by word. Unlike existing pre-training methods, ERNIE-GEN incorporates
multi-granularity target sampling to construct pre-training data, which
enhances the correlation between encoder and decoder. Experimental results
demonstrate that ERNIE-GEN achieves state-of-the-art results with a much
smaller amount of pre-training data and parameters on a range of language
generation tasks, including abstractive summarization (Gigaword and
CNN/DailyMail), question generation (SQuAD), dialogue generation (Persona-Chat)
and generative question answering (CoQA).
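To make the abstract's mechanisms more concrete, the sketch below illustrates, under purely illustrative assumptions, how span-by-span targets and noise-aware (corrupted) target contexts could be constructed for training. It is a minimal toy in Python, not the authors' implementation; the span lengths, vocabulary, and noising rate are hypothetical.
```python
import random

random.seed(0)

# Minimal, illustrative sketch (NOT the official ERNIE-GEN code) of two ideas
# from the abstract:
#   1. span-by-span targets: the decoder is trained to predict short,
#      semantically richer multi-token spans rather than single words;
#   2. noise-aware generation: a few target-side context tokens are randomly
#      corrupted during training, so the model learns not to over-rely on its
#      own (possibly wrong) previous predictions at inference time.
# All hyperparameters below are hypothetical choices for demonstration only.

def sample_spans(tokens, max_span_len=3):
    """Split a target sequence into consecutive spans of 1..max_span_len tokens."""
    spans, i = [], 0
    while i < len(tokens):
        length = random.randint(1, max_span_len)
        spans.append(tokens[i:i + length])
        i += length
    return spans

def corrupt_targets(tokens, vocab, noise_rate=0.1):
    """Randomly replace a fraction of target tokens with tokens drawn from the vocab."""
    return [random.choice(vocab) if random.random() < noise_rate else tok
            for tok in tokens]

vocab = ["the", "a", "cat", "dog", "sat", "ran", "on", "under", "mat", "rug"]
target = ["the", "cat", "sat", "on", "the", "mat"]
print(sample_spans(target))            # e.g. [['the'], ['cat', 'sat', 'on'], ['the', 'mat']]
print(corrupt_targets(target, vocab))  # e.g. ['the', 'cat', 'sat', 'on', 'the', 'rug']
```
The intuition is that predicting whole spans nudges generation toward semantically complete units, while the injected target-side noise narrows the train/inference discrepancy (exposure bias) because the model cannot assume its previous target tokens are correct.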
Related papers
- Generative Pre-training for Speech with Flow Matching [81.59952572752248]
We pre-trained a generative model, named SpeechFlow, on 60k hours of untranscribed speech with Flow Matching and masked conditions.
Experiment results show the pre-trained generative model can be fine-tuned with task-specific data to match or surpass existing expert models on speech enhancement, separation, and synthesis.
arXiv Detail & Related papers (2023-10-25T03:40:50Z)
- A Survey on Non-Autoregressive Generation for Neural Machine Translation and Beyond [145.43029264191543]
Non-autoregressive (NAR) generation was first proposed in neural machine translation (NMT) to speed up inference.
While NAR generation can significantly accelerate machine translation, it sacrifices translation accuracy compared to autoregressive (AR) generation.
Many new models and algorithms have been proposed to bridge the accuracy gap between NAR generation and AR generation.
arXiv Detail & Related papers (2022-04-20T07:25:22Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training [93.79766670391618]
We present POINTER, a novel insertion-based approach for hard-constrained text generation.
The proposed method operates by progressively inserting new tokens between existing tokens in a parallel manner.
The resulting coarse-to-fine hierarchy makes the generation process intuitive and interpretable.
arXiv Detail & Related papers (2020-05-01T18:11:54Z)
- QURIOUS: Question Generation Pretraining for Text Generation [13.595014409069584]
We propose question generation as a pretraining method, which better aligns with the text generation objectives.
Our text generation models pretrained with this method are better at understanding the essence of the input and are better language models for the target task.
arXiv Detail & Related papers (2020-04-23T08:41:52Z)
- PALM: Pre-training an Autoencoding&Autoregressive Language Model for Context-conditioned Generation [92.7366819044397]
Self-supervised pre-training has emerged as a powerful technique for natural language understanding and generation.
This work presents PALM with a novel scheme that jointly pre-trains an autoencoding and autoregressive language model on a large unlabeled corpus.
An extensive set of experiments show that PALM achieves new state-of-the-art results on a variety of language generation benchmarks.
arXiv Detail & Related papers (2020-04-14T06:25:36Z)
- An Empirical Investigation of Pre-Trained Transformer Language Models for Open-Domain Dialogue Generation [23.343006562849126]
We present an empirical investigation of pre-trained Transformer-based auto-regressive language models for the task of open-domain dialogue generation.
The pre-training and fine-tuning paradigm is employed for learning.
Experiments are conducted on typical single-turn and multi-turn dialogue corpora such as Weibo, Douban, Reddit, DailyDialog, and Persona-Chat.
arXiv Detail & Related papers (2020-03-09T15:20:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.