Fast Sampling via Discrete Non-Markov Diffusion Models
- URL: http://arxiv.org/abs/2312.09193v2
- Date: Thu, 27 Jun 2024 04:01:05 GMT
- Title: Fast Sampling via Discrete Non-Markov Diffusion Models
- Authors: Zixiang Chen, Huizhuo Yuan, Yongqian Li, Yiwen Kou, Junkai Zhang, Quanquan Gu
- Abstract summary: We propose a discrete non-Markov diffusion model, which admits an accelerated reverse sampling for discrete data generation.
Our method significantly reduces the number of function evaluations (i.e., calls to the neural network), making the sampling process much faster.
- Score: 49.598085130313514
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Discrete diffusion models have emerged as powerful tools for high-quality data generation. Despite their success in discrete spaces, such as text generation tasks, the acceleration of discrete diffusion models remains underexplored. In this paper, we propose a discrete non-Markov diffusion model, which admits an accelerated reverse sampling for discrete data generation. Our method significantly reduces the number of function evaluations (i.e., calls to the neural network), making the sampling process much faster. Furthermore, we study the transition from finite to infinite step sampling, offering new insights into bridging the gap between discrete and continuous-time processes for discrete diffusion models. Extensive experiments on natural language generation and machine translation tasks demonstrate the superior performance of our method in terms of both generation speed and sample quality compared to existing methods for discrete diffusion models.
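The abstract emphasizes cutting the number of function evaluations during reverse sampling. As a rough illustration of that idea only (a DDIM-style coarse timestep schedule, not the authors' exact non-Markov sampler), the sketch below calls a hypothetical denoising network `model(x, t)` once per coarse step; the re-noising rule is a placeholder for the model-specific forward kernel.

```python
import torch

@torch.no_grad()
def fast_reverse_sample(model, x_T, total_T, num_steps):
    """Illustrative accelerated reverse sampling for a discrete diffusion model.

    Rather than denoising through all `total_T` timesteps, the sampler visits a
    coarse subsequence of `num_steps` timesteps, so the (hypothetical) network
    `model(x, t)` is evaluated only `num_steps` times.
    """
    timesteps = torch.linspace(total_T, 0, num_steps + 1).long()
    x = x_T  # fully noised token sequence, shape (batch, seq_len)
    for t_cur, t_next in zip(timesteps[:-1], timesteps[1:]):
        logits = model(x, t_cur)  # one network call per coarse step
        # Sample a prediction of the clean tokens from the per-token categoricals.
        x0_hat = torch.distributions.Categorical(logits=logits).sample()
        if t_next > 0:
            # Placeholder corruption: keep roughly a (1 - t_next/total_T) fraction
            # of predicted tokens, standing in for the model-specific forward
            # kernel q(x_{t_next} | x_0).
            keep = torch.rand_like(x, dtype=torch.float) < 1.0 - t_next.float() / total_T
            x = torch.where(keep, x0_hat, x)
        else:
            x = x0_hat
    return x
```

The speedup in this sketch comes entirely from the length of `timesteps`: the network cost scales with `num_steps`, not with `total_T`.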
Related papers
- Energy-Based Diffusion Language Models for Text Generation [126.23425882687195]
Energy-based Diffusion Language Model (EDLM) is an energy-based model operating at the full sequence level for each diffusion step.
Our framework offers a 1.3$\times$ sampling speedup over existing diffusion models.
arXiv Detail & Related papers (2024-10-28T17:25:56Z) - Provable Statistical Rates for Consistency Diffusion Models [87.28777947976573]
Despite the state-of-the-art performance, diffusion models are known for their slow sample generation due to the extensive number of steps involved.
This paper contributes towards the first statistical theory for consistency models, formulating their training as a distribution discrepancy minimization problem.
arXiv Detail & Related papers (2024-06-23T20:34:18Z) - New algorithms for sampling and diffusion models [0.0]
We introduce a novel sampling method for known distributions and a new algorithm for diffusion generative models with unknown distributions.
Our approach is inspired by the concept of the reverse diffusion process, widely adopted in diffusion generative models.
arXiv Detail & Related papers (2024-06-14T02:30:04Z) - Convergence Analysis of Discrete Diffusion Model: Exact Implementation through Uniformization [17.535229185525353]
We introduce an algorithm leveraging the uniformization of continuous Markov chains, implementing transitions on random time points.
Our results align with state-of-the-art achievements for diffusion models in $\mathbb{R}^d$ and further underscore the advantages of discrete diffusion models in comparison to the $\mathbb{R}^d$ setting (a minimal uniformization sketch appears after this list).
arXiv Detail & Related papers (2024-02-12T22:26:52Z) - DiffuSeq-v2: Bridging Discrete and Continuous Text Spaces for Accelerated Seq2Seq Diffusion Models [58.450152413700586]
We introduce a soft absorbing state that facilitates the diffusion model in learning to reconstruct discrete mutations based on the underlying Gaussian space.
We employ state-of-the-art ODE solvers within the continuous space to expedite the sampling process.
Our proposed method effectively accelerates the training convergence by 4x and generates samples of similar quality 800x faster.
arXiv Detail & Related papers (2023-10-09T15:29:10Z) - Semi-Implicit Denoising Diffusion Models (SIDDMs) [50.30163684539586]
Existing models such as Denoising Diffusion Probabilistic Models (DDPM) deliver high-quality, diverse samples but are slowed by an inherently high number of iterative steps.
We introduce a novel approach that tackles the problem by matching implicit and explicit factors.
We demonstrate that our proposed method obtains comparable generative performance to diffusion-based models and vastly superior results to models with a small number of sampling steps.
arXiv Detail & Related papers (2023-06-21T18:49:22Z) - A Reparameterized Discrete Diffusion Model for Text Generation [39.0145272152805]
This work studies discrete diffusion probabilistic models with applications to natural language generation.
We derive an alternative yet equivalent formulation of the sampling from discrete diffusion processes.
We conduct extensive experiments to evaluate the text generation capability of our model, demonstrating significant improvements over existing diffusion models.
arXiv Detail & Related papers (2023-02-11T16:26:57Z) - Fast Inference in Denoising Diffusion Models via MMD Finetuning [23.779985842891705]
We present MMD-DDM, a novel method for fast sampling of diffusion models.
Our approach is based on the idea of using the Maximum Mean Discrepancy (MMD) to finetune the learned distribution with a given budget of timesteps.
Our findings show that the proposed method is able to produce high-quality samples in a fraction of the time required by widely-used diffusion models.
arXiv Detail & Related papers (2023-01-19T09:48:07Z) - Diffusion Glancing Transformer for Parallel Sequence to Sequence Learning [52.72369034247396]
We propose the diffusion glancing transformer, which employs a modality diffusion process and residual glancing sampling.
DIFFGLAT achieves better generation accuracy while maintaining fast decoding speed compared with both autoregressive and non-autoregressive models.
arXiv Detail & Related papers (2022-12-20T13:36:25Z)
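As a concrete illustration of the uniformization idea mentioned in the "Convergence Analysis of Discrete Diffusion Model: Exact Implementation through Uniformization" entry above, here is a minimal sketch of exact continuous-time Markov chain simulation by uniformization. The function name, the fixed generator `Q`, and the usage lines are illustrative assumptions; that paper's sampler would instead use the reverse-time generator driven by the learned denoiser.

```python
import numpy as np

def uniformization_sample(Q, x0, T, rng=None):
    """Exact CTMC simulation via uniformization.

    Q  : (S, S) generator matrix (rows sum to 0, off-diagonal rates >= 0).
    x0 : initial state index.
    T  : time horizon.

    Candidate transition times form a Poisson process with a dominating rate
    `lam`; at each such random time the chain moves according to the
    uniformized kernel P = I + Q / lam (which may leave the state unchanged).
    """
    rng = np.random.default_rng() if rng is None else rng
    S = Q.shape[0]
    lam = float(np.max(-np.diag(Q)))   # dominating jump rate (assumed > 0)
    P = np.eye(S) + Q / lam            # uniformized transition matrix
    n_jumps = rng.poisson(lam * T)     # number of candidate transitions on [0, T]
    x = x0
    for _ in range(n_jumps):
        x = rng.choice(S, p=P[x])      # apply one uniformized transition
    return x

# Example: two-state chain; prints the state occupied at time T = 5.0.
Q = np.array([[-1.0, 1.0],
              [ 2.0, -2.0]])
print(uniformization_sample(Q, x0=0, T=5.0))
```

For a time-homogeneous generator, the end state depends only on the number of candidate jumps, not on their exact spacing, which is why the jump times never need to be stored and the simulation remains exact rather than time-discretized.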
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.