A Reparameterized Discrete Diffusion Model for Text Generation
- URL: http://arxiv.org/abs/2302.05737v3
- Date: Fri, 2 Aug 2024 16:09:14 GMT
- Title: A Reparameterized Discrete Diffusion Model for Text Generation
- Authors: Lin Zheng, Jianbo Yuan, Lei Yu, Lingpeng Kong,
- Abstract summary: This work studies discrete diffusion probabilistic models with applications to natural language generation.
We derive an alternative yet equivalent formulation of the sampling from discrete diffusion processes.
We conduct extensive experiments to evaluate the text generation capability of our model, demonstrating significant improvements over existing diffusion models.
- Score: 39.0145272152805
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work studies discrete diffusion probabilistic models with applications to natural language generation. We derive an alternative yet equivalent formulation of the sampling from discrete diffusion processes and leverage this insight to develop a family of reparameterized discrete diffusion models. The derived generic framework is highly flexible, offers a fresh perspective of the generation process in discrete diffusion models, and features more effective training and decoding techniques. We conduct extensive experiments to evaluate the text generation capability of our model, demonstrating significant improvements over existing diffusion models.
Related papers
- Continuous Diffusion Model for Language Modeling [57.396578974401734]
Existing continuous diffusion models for discrete data have limited performance compared to discrete approaches.
We propose a continuous diffusion model for language modeling that incorporates the geometry of the underlying categorical distribution.
arXiv Detail & Related papers (2025-02-17T08:54:29Z) - Accelerated Diffusion Models via Speculative Sampling [89.43940130493233]
Speculative sampling is a popular technique for accelerating inference in Large Language Models.
We extend speculative sampling to diffusion models, which generate samples via continuous, vector-valued Markov chains.
We propose various drafting strategies, including a simple and effective approach that does not require training a draft model.
arXiv Detail & Related papers (2025-01-09T16:50:16Z) - An overview of diffusion models for generative artificial intelligence [3.6185342807265415]
This article provides a mathematically rigorous introduction to denoising diffusion probabilistic models (DDPMs)
We provide a detailed basic mathematical framework for DDPMs and explain the main ideas behind training and generation procedures.
arXiv Detail & Related papers (2024-12-02T10:55:38Z) - Energy-Based Diffusion Language Models for Text Generation [126.23425882687195]
Energy-based Diffusion Language Model (EDLM) is an energy-based model operating at the full sequence level for each diffusion step.
Our framework offers a 1.3$times$ sampling speedup over existing diffusion models.
arXiv Detail & Related papers (2024-10-28T17:25:56Z) - Neural Flow Diffusion Models: Learnable Forward Process for Improved Diffusion Modelling [2.1779479916071067]
We introduce a novel framework that enhances diffusion models by supporting a broader range of forward processes.
We also propose a novel parameterization technique for learning the forward process.
Results underscore NFDM's versatility and its potential for a wide range of applications.
arXiv Detail & Related papers (2024-04-19T15:10:54Z) - An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization [59.63880337156392]
Diffusion models have achieved tremendous success in computer vision, audio, reinforcement learning, and computational biology.
Despite the significant empirical success, theory of diffusion models is very limited.
This paper provides a well-rounded theoretical exposure for stimulating forward-looking theories and methods of diffusion models.
arXiv Detail & Related papers (2024-04-11T14:07:25Z) - Fast Sampling via Discrete Non-Markov Diffusion Models with Predetermined Transition Time [49.598085130313514]
We propose discrete non-Markov diffusion models (DNDM), which naturally induce the predetermined transition time set.
This enables a training-free sampling algorithm that significantly reduces the number of function evaluations.
We study the transition from finite to infinite step sampling, offering new insights into bridging the gap between discrete and continuous-time processes.
arXiv Detail & Related papers (2023-12-14T18:14:11Z) - A Survey of Diffusion Models in Natural Language Processing [11.233768932957771]
Diffusion models capture the diffusion of information or signals across a network or manifold.
This paper discusses the different formulations of diffusion models used in NLP, their strengths and limitations, and their applications.
arXiv Detail & Related papers (2023-05-24T03:25:32Z) - A Survey on Generative Diffusion Model [75.93774014861978]
Diffusion models are an emerging class of deep generative models.
They have certain limitations, including a time-consuming iterative generation process and confinement to high-dimensional Euclidean space.
This survey presents a plethora of advanced techniques aimed at enhancing diffusion models.
arXiv Detail & Related papers (2022-09-06T16:56:21Z) - Diffusion Models: A Comprehensive Survey of Methods and Applications [10.557289965753437]
Diffusion models are a class of deep generative models that have shown impressive results on various tasks with dense theoretical founding.
Recent studies have shown great enthusiasm on improving the performance of diffusion model.
arXiv Detail & Related papers (2022-09-02T02:59:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.