Diffusion-NAT: Self-Prompting Discrete Diffusion for Non-Autoregressive Text Generation
- URL: http://arxiv.org/abs/2305.04044v1
- Date: Sat, 6 May 2023 13:20:31 GMT
- Title: Diffusion-NAT: Self-Prompting Discrete Diffusion for Non-Autoregressive Text Generation
- Authors: Kun Zhou, Yifan Li, Wayne Xin Zhao and Ji-Rong Wen
- Abstract summary: Diffusion-NAT introduces discrete diffusion models into NAR text-to-text generation and integrates BART to improve performance.
Experimental results on 7 datasets show that our approach can outperform competitive NAR methods, and even surpass autoregressive methods.
- Score: 94.4634088113513
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, continuous diffusion models (CDM) have been introduced into
non-autoregressive (NAR) text-to-text generation. However, the discrete nature
of text makes it difficult for CDM to generate coherent and fluent texts, and
also causes an incompatibility between CDM and advanced NLP techniques,
especially popular pre-trained language models (PLMs). To address these
problems, we propose Diffusion-NAT, which introduces discrete diffusion
models (DDM) into NAR text-to-text generation and integrates BART to improve
performance. By revising the decoding process of BART and the typical settings
of DDM, we unify the inference process of BART and the denoising process of DDM
into the same NAR masked-token recovery task. In this way, DDM can rely on BART
to perform denoising, benefiting from both the rich pre-learned knowledge of
BART and the iterative refinement paradigm of DDM. We also propose an iterative
self-prompting strategy to further improve generation quality. Experimental
results on 7 datasets show that our approach outperforms competitive NAR
methods and even surpasses autoregressive methods. Our code and data will be
publicly released.
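The unified process described in the abstract, where BART decoding and DDM denoising are cast as one NAR masked-token recovery task refined over several steps, with each step's output fed back as a self-prompt, can be illustrated with the following minimal sketch. This is not the authors' released code: toy_denoiser, VOCAB, the confidence-based re-masking schedule, and all sizes are hypothetical stand-ins for the pre-trained BART model and the exact settings described in the paper.

# Minimal sketch (assumptions labeled) of NAR denoising as iterative masked-token
# recovery with self-prompting; toy_denoiser is a stand-in for a BART-style model.
import numpy as np

VOCAB = ["[MASK]", "the", "cat", "sat", "on", "mat", "."]  # toy vocabulary
MASK_ID = 0

def toy_denoiser(src_tokens, tgt_tokens, prompt_tokens):
    """Stand-in for a BART-style NAR decoder: returns a probability distribution
    over VOCAB for every target position, conditioned on the source text, the
    partially denoised target, and the self-prompt from the previous step."""
    seed = abs(hash((tuple(src_tokens), tuple(tgt_tokens), tuple(prompt_tokens)))) % 2**32
    rng = np.random.default_rng(seed)
    logits = rng.normal(size=(len(tgt_tokens), len(VOCAB)))
    logits[:, MASK_ID] = -1e9  # the model never predicts the [MASK] token itself
    probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
    return probs

def nar_denoise(src_tokens, tgt_len, num_steps=4):
    """Reverse discrete diffusion: start from a fully masked target (x_T) and
    iteratively recover tokens, re-masking low-confidence positions each step."""
    tgt = [MASK_ID] * tgt_len   # x_T: every target position is [MASK]
    prompt = []                 # self-prompt starts empty
    for step in range(num_steps, 0, -1):
        probs = toy_denoiser(src_tokens, tgt, prompt)
        pred = probs.argmax(axis=-1)   # recover all positions at once (non-autoregressive)
        conf = probs.max(axis=-1)
        # Re-mask the least confident positions; the masked fraction shrinks as the
        # denoising step goes from T down to 1 (no positions are masked at the last step).
        num_remask = int(tgt_len * (step - 1) / num_steps)
        remask = np.argsort(conf)[:num_remask]
        new_tgt = pred.copy()
        new_tgt[remask] = MASK_ID
        tgt = new_tgt.tolist()
        prompt = pred.tolist()  # iterative self-prompting: next step is conditioned
                                # on this step's full prediction
    return [VOCAB[t] for t in tgt]

if __name__ == "__main__":
    print(nar_denoise(src_tokens=[1, 2, 3], tgt_len=6))

In the actual method, the stand-in denoiser would be the pre-trained BART encoder-decoder operating on the source text, and the re-masking schedule and prompt format follow the paper rather than the linear heuristic and filtered token list used here.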
Related papers
- MMAR: Towards Lossless Multi-Modal Auto-Regressive Probabilistic Modeling [64.09238330331195]
We propose a novel Multi-Modal Auto-Regressive (MMAR) probabilistic modeling framework.
Unlike discretization-based methods, MMAR takes in continuous-valued image tokens to avoid information loss.
We show that MMAR demonstrates superior performance compared to other joint multi-modal models.
arXiv Detail & Related papers (2024-10-14T17:57:18Z)
- DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture [69.58440626023541]
Diffusion models (DMs) have demonstrated exceptional generative capabilities across various areas.
The most common way to accelerate DMs involves reducing the number of denoising steps during generation.
We propose a novel method that transfers the capability of large pretrained DMs to faster architectures.
arXiv Detail & Related papers (2024-09-05T14:12:22Z)
- UDPM: Upsampling Diffusion Probabilistic Models [33.51145642279836]
Denoising Diffusion Probabilistic Models (DDPM) have recently gained significant attention.
DDPMs generate high-quality samples from complex data distributions by defining an inverse process.
Unlike that of generative adversarial networks (GANs), the latent space of diffusion models is less interpretable.
In this work, we propose to generalize the denoising diffusion process into an Upsampling Diffusion Probabilistic Model (UDPM)
arXiv Detail & Related papers (2023-05-25T17:25:14Z)
- A Cheaper and Better Diffusion Language Model with Soft-Masked Noise [62.719656543880596]
Masked-Diffuse LM is a novel diffusion model for language modeling, inspired by linguistic features in languages.
Specifically, we design a linguistically informed forward process that adds corruption to the text through strategic soft-masking to better noise the textual data.
We demonstrate that our Masked-Diffuse LM can achieve better generation quality than the state-of-the-art diffusion models with better efficiency.
arXiv Detail & Related papers (2023-04-10T17:58:42Z)
- Restoration based Generative Models [0.886014926770622]
Denoising diffusion models (DDMs) have attracted increasing attention by showing impressive synthesis quality.
In this paper, we establish an interpretation of DDMs in terms of image restoration (IR).
We propose multi-scale training, which improves performance compared to the diffusion process by taking advantage of the flexibility of the forward process.
We believe that our framework paves the way for designing a new type of flexible general generative model.
arXiv Detail & Related papers (2023-02-20T00:53:33Z)
- Boundary Guided Learning-Free Semantic Control with Diffusion Models [44.37803942479853]
We present our BoundaryDiffusion method for efficient, effective and light-weight semantic control with frozen pre-trained DDMs.
We conduct extensive experiments on DPM architectures (DDPM, iDDPM) and datasets (CelebA, CelebA-HQ, LSUN-church, LSUN-bedroom, AFHQ-dog) at different resolutions (64, 256).
arXiv Detail & Related papers (2023-02-16T15:21:46Z)
- DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models [81.84866217721361]
DiffusionBERT is a new generative masked language model based on discrete diffusion models.
We propose a new noise schedule for the forward diffusion process that controls the degree of noise added at each step.
Experiments on unconditional text generation demonstrate that DiffusionBERT achieves significant improvement over existing diffusion models for text.
arXiv Detail & Related papers (2022-11-28T03:25:49Z)
- A Self-Paced Mixed Distillation Method for Non-Autoregressive Generation [135.84684279852098]
Non-Autoregressive (NAR) models significantly underperform Autoregressive (AR) models on various language generation tasks.
Among the NAR models, BANG is the first large-scale pre-training model on an unlabeled English raw text corpus.
We propose a novel self-paced mixed distillation method to further improve the generation quality of BANG.
arXiv Detail & Related papers (2022-05-23T09:54:53Z)