Related papers: Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC

Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC

URL: http://arxiv.org/abs/2302.11552v6
Date: Sat, 14 Sep 2024 18:55:50 GMT
Title: Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC
Authors: Yilun Du, Conor Durkan, Robin Strudel, Joshua B. Tenenbaum, Sander Dieleman, Rob Fergus, Jascha Sohl-Dickstein, Arnaud Doucet, Will Grathwohl,
Abstract summary: diffusion models have quickly become the prevailing approach to generative modeling in many domains. We propose an energy-based parameterization of diffusion models which enables the use of new compositional operators. We find these samplers lead to notable improvements in compositional generation across a wide set of problems.
Score: 102.64648158034568
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Since their introduction, diffusion models have quickly become the prevailing approach to generative modeling in many domains. They can be interpreted as learning the gradients of a time-varying sequence of log-probability density functions. This interpretation has motivated classifier-based and classifier-free guidance as methods for post-hoc control of diffusion models. In this work, we build upon these ideas using the score-based interpretation of diffusion models, and explore alternative ways to condition, modify, and reuse diffusion models for tasks involving compositional generation and guidance. In particular, we investigate why certain types of composition fail using current techniques and present a number of solutions. We conclude that the sampler (not the model) is responsible for this failure and propose new samplers, inspired by MCMC, which enable successful compositional generation. Further, we propose an energy-based parameterization of diffusion models which enables the use of new compositional operators and more sophisticated, Metropolis-corrected samplers. Intriguingly we find these samplers lead to notable improvements in compositional generation across a wide set of problems such as classifier-guided ImageNet modeling and compositional text-to-image generation.

Related papers

Accelerated Diffusion Models via Speculative Sampling [89.43940130493233]
Speculative sampling is a popular technique for accelerating inference in Large Language Models. We extend speculative sampling to diffusion models, which generate samples via continuous, vector-valued Markov chains. We propose various drafting strategies, including a simple and effective approach that does not require training a draft model.
arXiv Detail & Related papers (2025-01-09T16:50:16Z)
Causal Diffusion Transformers for Generative Modeling [19.919979972882466]
We introduce Causal Diffusion as the autoregressive (AR) counterpart of Diffusion models. CaulFusion is a decoder-only transformer that dual-factorizes data across sequential tokens and diffusion noise levels.
arXiv Detail & Related papers (2024-12-16T18:59:29Z)
Generative Modeling with Diffusion [0.0]
We introduce the diffusion model as a method to generate new samples. We will define the noising and denoising processes, then introduce algorithms to train and generate with a diffusion model.
arXiv Detail & Related papers (2024-12-14T20:04:46Z)
Energy-Based Diffusion Language Models for Text Generation [126.23425882687195]
Energy-based Diffusion Language Model (EDLM) is an energy-based model operating at the full sequence level for each diffusion step. Our framework offers a 1.3$times$ sampling speedup over existing diffusion models.
arXiv Detail & Related papers (2024-10-28T17:25:56Z)
Aggregation of Multi Diffusion Models for Enhancing Learned Representations [4.126721111013567]
This paper introduces a novel algorithm, Aggregation of Multi Diffusion Models (AMDM) AMDM synthesizes features from multiple diffusion models into a specified model, enhancing its learned representations to activate specific features for fine-grained control. Experimental results demonstrate that AMDM significantly improves fine-grained control without additional training or inference time.
arXiv Detail & Related papers (2024-10-02T06:16:06Z)
Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional Image Synthesis [62.07413805483241]
Steered Diffusion is a framework for zero-shot conditional image generation using a diffusion model trained for unconditional generation. We present experiments using steered diffusion on several tasks including inpainting, colorization, text-guided semantic editing, and image super-resolution.
arXiv Detail & Related papers (2023-09-30T02:03:22Z)
Steerable Conditional Diffusion for Out-of-Distribution Adaptation in Medical Image Reconstruction [75.91471250967703]
We introduce a novel sampling framework called Steerable Conditional Diffusion. This framework adapts the diffusion model, concurrently with image reconstruction, based solely on the information provided by the available measurement. We achieve substantial enhancements in out-of-distribution performance across diverse imaging modalities.
arXiv Detail & Related papers (2023-08-28T08:47:06Z)
A Reparameterized Discrete Diffusion Model for Text Generation [39.0145272152805]
This work studies discrete diffusion probabilistic models with applications to natural language generation. We derive an alternative yet equivalent formulation of the sampling from discrete diffusion processes. We conduct extensive experiments to evaluate the text generation capability of our model, demonstrating significant improvements over existing diffusion models.
arXiv Detail & Related papers (2023-02-11T16:26:57Z)
A Survey on Generative Diffusion Model [75.93774014861978]
Diffusion models are an emerging class of deep generative models. They have certain limitations, including a time-consuming iterative generation process and confinement to high-dimensional Euclidean space. This survey presents a plethora of advanced techniques aimed at enhancing diffusion models.
arXiv Detail & Related papers (2022-09-06T16:56:21Z)
Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the ( aggregate) posterior to encourage statistical independence of the latent factors. We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method. Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.