GENIE: Large Scale Pre-training for Text Generation with Diffusion Model
- URL: http://arxiv.org/abs/2212.11685v1
- Date: Thu, 22 Dec 2022 13:17:11 GMT
- Title: GENIE: Large Scale Pre-training for Text Generation with Diffusion Model
- Authors: Zhenghao Lin, Yeyun Gong, Yelong Shen, Tong Wu, Zhihao Fan, Chen Lin,
Weizhu Chen, Nan Duan
- Abstract summary: GENIE is a sequence-to-sequence text generation model which combines Transformer and diffusion.
We propose a novel pre-training method named continuous paragraph denoise based on the characteristics of the diffusion model.
- Score: 86.2022500090247
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a large-scale language pre-training for text
GENeration using dIffusion modEl, which is named GENIE. GENIE is a pre-trained
sequence-to-sequence text generation model that combines a Transformer with a
diffusion model. The diffusion model accepts latent information from the encoder,
which guides the denoising at the current time step. After multiple such denoising
iterations, the diffusion model restores Gaussian noise to diverse output text
controlled by the input text. Moreover, this architecture design also allows us to
apply large-scale pre-training to GENIE. We propose a novel pre-training method
named continuous paragraph denoise based on the characteristics of the diffusion
model. Extensive experiments on the XSum, CNN/DailyMail, and Gigaword benchmarks
show that GENIE achieves performance comparable to various strong baselines; in
particular, after pre-training, the generation quality of GENIE is greatly
improved. We have also conducted extensive experiments on the generation diversity
and parameter impact of GENIE. The code for GENIE will be made publicly available.
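To make the decoding process described above concrete, the following is a minimal PyTorch sketch of an encoder-guided reverse diffusion loop: a Transformer encoder produces latents from the source, and a denoiser iteratively turns Gaussian noise into output embeddings conditioned on those latents. All names and hyperparameters (SimpleDenoiser, the linear noise schedule, T, dim) are illustrative assumptions, not the released GENIE implementation, and time-step embeddings and the final token rounding step are omitted for brevity.

    # Sketch of an encoder-guided reverse diffusion loop (assumed design, not GENIE's code).
    import torch
    import torch.nn as nn

    T, dim, seq_len = 1000, 128, 32            # diffusion steps, embedding size, output length
    betas = torch.linspace(1e-4, 0.02, T)      # assumed linear noise schedule
    alphas_cum = torch.cumprod(1.0 - betas, dim=0)

    encoder = nn.TransformerEncoder(
        nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True), num_layers=2)

    class SimpleDenoiser(nn.Module):
        """Predicts the noise in x_t, guided by encoder latents via cross-attention."""
        def __init__(self):
            super().__init__()
            self.cross = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
            self.ff = nn.Sequential(nn.Linear(dim, dim * 4), nn.GELU(), nn.Linear(dim * 4, dim))
        def forward(self, x_t, latent, t):
            # time-step embedding omitted for brevity; latents guide this denoising step
            h, _ = self.cross(x_t, latent, latent)
            return self.ff(h + x_t)

    denoiser = SimpleDenoiser()

    @torch.no_grad()
    def generate(src_emb):
        """Reverse diffusion: Gaussian noise -> output text embeddings, guided by the source."""
        latent = encoder(src_emb)                          # latent information from the encoder
        x_t = torch.randn(src_emb.size(0), seq_len, dim)   # start from pure Gaussian noise
        for t in reversed(range(T)):
            eps = denoiser(x_t, latent, t)                 # predicted noise at this time step
            a_t, ac_t = 1.0 - betas[t], alphas_cum[t]
            mean = (x_t - betas[t] / torch.sqrt(1.0 - ac_t) * eps) / torch.sqrt(a_t)
            x_t = mean + torch.sqrt(betas[t]) * torch.randn_like(x_t) if t > 0 else mean
        return x_t  # continuous embeddings; a projection step would map them to tokens

    out = generate(torch.randn(2, 16, dim))   # toy source embeddings for two inputs
    print(out.shape)                          # torch.Size([2, 32, 128])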
Related papers
- DIAGen: Diverse Image Augmentation with Generative Models [9.79392997282545]
We propose DIAGen to enhance semantic diversity in computer vision models.
We exploit the general knowledge of a text-to-text generative model to guide the image generation of the diffusion model.
Results show that DIAGen not only enhances semantic diversity but also improves the performance of subsequent classifiers.
arXiv Detail & Related papers (2024-08-26T19:09:13Z) - Enforcing Paraphrase Generation via Controllable Latent Diffusion [60.82512050963046]
We propose Latent Diffusion Paraphraser (LDP), a novel paraphrase generation method that models a controllable diffusion process.
Experiments show that LDP achieves improved and diverse paraphrase generation compared to baselines.
arXiv Detail & Related papers (2024-04-13T09:24:32Z) - RegaVAE: A Retrieval-Augmented Gaussian Mixture Variational Auto-Encoder
for Language Modeling [79.56442336234221]
We introduce RegaVAE, a retrieval-augmented language model built upon the variational auto-encoder (VAE).
It encodes the text corpus into a latent space, capturing current and future information from both source and target text.
Experimental results on various datasets demonstrate significant improvements in text generation quality and hallucination removal.
arXiv Detail & Related papers (2023-10-16T16:42:01Z) - SeqDiffuSeq: Text Diffusion with Encoder-Decoder Transformers [50.90457644954857]
In this work, we apply diffusion models to approach sequence-to-sequence text generation.
We propose SeqDiffuSeq, a text diffusion model for sequence-to-sequence generation.
Experimental results show good performance on sequence-to-sequence generation in terms of text quality and inference time.
arXiv Detail & Related papers (2022-12-20T15:16:24Z) - Learning to Generalize to More: Continuous Semantic Augmentation for
Neural Machine Translation [50.54059385277964]
We present a novel data augmentation paradigm termed Continuous Semantic Augmentation (CsaNMT).
CsaNMT augments each training instance with an adjacency region that could cover adequate variants of literal expression under the same meaning.
arXiv Detail & Related papers (2022-04-14T08:16:28Z) - Text Generation with Deep Variational GAN [16.3190206770276]
We propose a GAN-based generic framework to address the problem of mode collapse in a principled way.
We show that our model can generate realistic text with high diversity.
arXiv Detail & Related papers (2021-04-27T21:42:13Z) - Topical Language Generation using Transformers [4.795530213347874]
This paper presents a novel approach for Topical Language Generation (TLG) by combining a pre-trained LM with topic modeling information.
We extend our model by introducing new parameters and functions to influence the quantity of the topical features presented in the generated text.
Our experimental results demonstrate that our model outperforms the state-of-the-art results on coherency, diversity, and fluency while being faster in decoding.
arXiv Detail & Related papers (2021-03-11T03:45:24Z) - POINTER: Constrained Progressive Text Generation via Insertion-based
Generative Pre-training [93.79766670391618]
We present POINTER, a novel insertion-based approach for hard-constrained text generation.
The proposed method operates by progressively inserting new tokens between existing tokens in a parallel manner.
The resulting coarse-to-fine hierarchy makes the generation process intuitive and interpretable.
arXiv Detail & Related papers (2020-05-01T18:11:54Z) - ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework
for Natural Language Generation [44.21363470798758]
ERNIE-GEN is an enhanced multi-flow sequence to sequence pre-training and fine-tuning framework.
It bridges the discrepancy between training and inference with an infilling generation mechanism and a noise-aware generation method.
It trains the model to predict semantically-complete spans consecutively rather than predicting word by word.
arXiv Detail & Related papers (2020-01-26T02:54:49Z)