f-DM: A Multi-stage Diffusion Model via Progressive Signal
Transformation
- URL: http://arxiv.org/abs/2210.04955v1
- Date: Mon, 10 Oct 2022 18:49:25 GMT
- Title: f-DM: A Multi-stage Diffusion Model via Progressive Signal
Transformation
- Authors: Jiatao Gu, Shuangfei Zhai, Yizhe Zhang, Miguel Angel Bautista, Josh
Susskind
- Abstract summary: Diffusion models (DMs) have recently emerged as SoTA tools for generative modeling in various domains.
We propose f-DM, a generalized family of DMs which allows progressive signal transformation.
We apply f-DM in image generation tasks with a range of functions, including down-sampling, blurring, and learned transformations.
- Score: 56.04628143914542
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Diffusion models (DMs) have recently emerged as SoTA tools for generative
modeling in various domains. Standard DMs can be viewed as an instantiation of
hierarchical variational autoencoders (VAEs) where the latent variables are
inferred from input-centered Gaussian distributions with fixed scales and
variances. Unlike VAEs, this formulation prevents DMs from changing the latent
spaces and learning abstract representations. In this work, we propose f-DM, a
generalized family of DMs which allows progressive signal transformation. More
precisely, we extend DMs to incorporate a set of (hand-designed or learned)
transformations, where the transformed input is the mean of each diffusion
step. We propose a generalized formulation and derive the corresponding
denoising objective with a modified sampling algorithm. As a demonstration, we
apply f-DM in image generation tasks with a range of functions, including
down-sampling, blurring, and learned transformations based on the encoder of
pretrained VAEs. In addition, we identify the importance of adjusting the noise
levels whenever the signal is sub-sampled and propose a simple rescaling
recipe. f-DM can produce high-quality samples on standard image generation
benchmarks like FFHQ, AFHQ, LSUN, and ImageNet with better efficiency and
semantic interpretation.
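As a rough illustration only (not the authors' exact formulation or notation), the generalized forward process described above — where a hand-designed or learned transformation of the clean input serves as the mean of each diffusion step, with a noise-rescaling knob for sub-sampled signals — might be sketched as follows. All function and parameter names here are illustrative assumptions.

```python
import numpy as np

def forward_step(x0, f_t, alpha_t, sigma_t, rescale=1.0):
    """One generalized forward-diffusion step: the transformed input
    f_t(x0) serves as the mean, with Gaussian noise added on top.

    f_t is any (hand-designed or learned) signal transformation,
    e.g. down-sampling or blurring; `rescale` stands in for the
    paper's noise-rescaling recipe applied when the signal is
    sub-sampled. Names and scaling are illustrative, not the
    paper's notation.
    """
    mean = alpha_t * f_t(x0)              # transformed signal as the mean
    noise = np.random.randn(*mean.shape)  # standard Gaussian noise
    return mean + (sigma_t * rescale) * noise

# Example transformation: 2x2 average pooling as a simple down-sampler
def downsample(x):
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

x0 = np.random.rand(8, 8)
xt = forward_step(x0, downsample, alpha_t=0.9, sigma_t=0.3, rescale=0.5)
print(xt.shape)  # (4, 4): the latent lives in the transformed space
```

In a standard DM, f_t would be the identity at every step; letting it progressively coarsen the signal is what changes the latent space across stages.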
Related papers
- Self-Consistent Recursive Diffusion Bridge for Medical Image Translation [6.850683267295248]
Denoising diffusion models (DDMs) have gained recent traction in medical image translation given improved training stability over adversarial models.
We propose a novel self-consistent iterative diffusion bridge (SelfRDB) for improved performance in medical image translation.
Comprehensive analyses in multi-contrast MRI and MRI-CT translation indicate that SelfRDB offers superior performance against competing methods.
arXiv Detail & Related papers (2024-05-10T19:39:55Z)
- AdjointDPM: Adjoint Sensitivity Method for Gradient Backpropagation of Diffusion Probabilistic Models [103.41269503488546]
Existing customization methods require access to multiple reference examples to align pre-trained diffusion probabilistic models with user-provided concepts.
This paper aims to address the challenge of DPM customization when the only available supervision is a differentiable metric defined on the generated contents.
We propose a novel method AdjointDPM, which first generates new samples from diffusion models by solving the corresponding probability-flow ODEs.
It then uses the adjoint sensitivity method to backpropagate the gradients of the loss to the models' parameters.
arXiv Detail & Related papers (2023-07-20T09:06:21Z)
- Hierarchical Integration Diffusion Model for Realistic Image Deblurring [71.76410266003917]
Diffusion models (DMs) have been introduced in image deblurring and exhibited promising performance.
We propose the Hierarchical Integration Diffusion Model (HI-Diff), for realistic image deblurring.
Experiments on synthetic and real-world blur datasets demonstrate that our HI-Diff outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-05-22T12:18:20Z)
- A Variational Perspective on Solving Inverse Problems with Diffusion Models [101.831766524264]
Inverse tasks can be formulated as inferring a posterior distribution over data.
This is however challenging in diffusion models since the nonlinear and iterative nature of the diffusion process renders the posterior intractable.
We propose a variational approach that by design seeks to approximate the true posterior distribution.
arXiv Detail & Related papers (2023-05-07T23:00:47Z)
- Reflected Diffusion Models [93.26107023470979]
We present Reflected Diffusion Models, which reverse a reflected differential equation evolving on the support of the data.
Our approach learns the score function through a generalized score matching loss and extends key components of standard diffusion models.
arXiv Detail & Related papers (2023-04-10T17:54:38Z)
- One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale [36.590918776922905]
This paper proposes a unified diffusion framework (dubbed UniDiffuser) to fit all distributions relevant to a set of multi-modal data in one model.
Inspired by the unified view, UniDiffuser learns all distributions simultaneously with a minimal modification to the original diffusion model.
arXiv Detail & Related papers (2023-03-12T03:38:39Z)
- Modiff: Action-Conditioned 3D Motion Generation with Denoising Diffusion Probabilistic Models [58.357180353368896]
We propose a conditional paradigm that benefits from the denoising diffusion probabilistic model (DDPM) to tackle the problem of realistic and diverse action-conditioned 3D skeleton-based motion generation.
This is a pioneering attempt to use a DDPM to synthesize a variable number of motion sequences conditioned on a categorical action.
arXiv Detail & Related papers (2023-01-10T13:15:42Z)
- Representation Learning with Diffusion Models [0.0]
Diffusion models (DMs) have achieved state-of-the-art results for image synthesis tasks as well as density estimation.
We introduce a framework for learning such representations with diffusion models (LRDM).
In particular, the DM and the representation encoder are trained jointly in order to learn rich representations specific to the generative denoising process.
arXiv Detail & Related papers (2022-10-20T07:26:47Z)
- Few-Shot Diffusion Models [15.828257653106537]
We present Few-Shot Diffusion Models (FSDM), a framework for few-shot generation leveraging conditional DDPMs.
FSDM is trained to adapt the generative process conditioned on a small set of images from a given class by aggregating image patch information.
We empirically show that FSDM can perform few-shot generation and transfer to new datasets.
arXiv Detail & Related papers (2022-05-30T23:20:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.