Going beyond Compositions, DDPMs Can Produce Zero-Shot Interpolations
- URL: http://arxiv.org/abs/2405.19201v2
- Date: Wed, 10 Jul 2024 14:42:18 GMT
- Title: Going beyond Compositions, DDPMs Can Produce Zero-Shot Interpolations
- Authors: Justin Deschenaux, Igor Krawczuk, Grigorios Chrysos, Volkan Cevher
- Abstract summary: Denoising Diffusion Probabilistic Models (DDPMs) exhibit remarkable capabilities in image generation.
We study DDPMs trained on strictly separate subsets of the data distribution with large gaps on the support of latent factors.
We show that such a model can effectively generate images in the unexplored, intermediate regions of the distribution.
- Score: 54.95457207525101
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Denoising Diffusion Probabilistic Models (DDPMs) exhibit remarkable capabilities in image generation, with studies suggesting that they can generalize by composing latent factors learned from the training data. In this work, we go further and study DDPMs trained on strictly separate subsets of the data distribution with large gaps on the support of the latent factors. We show that such a model can effectively generate images in the unexplored, intermediate regions of the distribution. For instance, when trained on clearly smiling and non-smiling faces, we demonstrate a sampling procedure which can generate slightly smiling faces without reference images (zero-shot interpolation). We replicate these findings for other attributes as well as other datasets. Our code is available at https://github.com/jdeschena/ddpm-zero-shot-interpolation.
Related papers
- Score Neural Operator: A Generative Model for Learning and Generalizing Across Multiple Probability Distributions [7.851040662069365]
We introduce the Score Neural Operator, which learns the mapping from multiple probability distributions to their score functions within a unified framework.
Our approach offers significant potential for few-shot learning applications, where a single image from a new distribution can be leveraged to generate multiple distinct images from that distribution.
arXiv Detail & Related papers (2024-10-11T06:00:34Z) - FreeSeg-Diff: Training-Free Open-Vocabulary Segmentation with Diffusion Models [56.71672127740099]
We focus on the task of image segmentation, which is traditionally solved by training models on closed-vocabulary datasets.
We leverage different and relatively small-sized, open-source foundation models for zero-shot open-vocabulary segmentation.
Our approach (dubbed FreeSeg-Diff), which does not rely on any training, outperforms many training-based approaches on both Pascal VOC and COCO datasets.
arXiv Detail & Related papers (2024-03-29T10:38:25Z) - The Journey, Not the Destination: How Data Guides Diffusion Models [75.19694584942623]
Diffusion models trained on large datasets can synthesize photo-realistic images of remarkable quality and diversity.
We propose a framework that: (i) provides a formal notion of data attribution in the context of diffusion models, and (ii) allows us to counterfactually validate such attributions.
arXiv Detail & Related papers (2023-12-11T08:39:43Z) - Improving Denoising Diffusion Probabilistic Models via Exploiting Shared Representations [5.517338199249029]
SR-DDPM is a class of generative models that produce high-quality images by reversing a noisy diffusion process.
By exploiting the similarity between diverse data distributions, our method can scale to multiple tasks without compromising the image quality.
We evaluate our method on standard image datasets and show that it outperforms both unconditional and conditional DDPM in terms of FID and SSIM metrics.
arXiv Detail & Related papers (2023-11-27T22:30:26Z) - Denoising Diffusion Bridge Models [54.87947768074036]
Diffusion models are powerful generative models that map noise to data using stochastic processes.
For many applications such as image editing, the model input comes from a distribution that is not random noise.
In our work, we propose Denoising Diffusion Bridge Models (DDBMs).
arXiv Detail & Related papers (2023-09-29T03:24:24Z) - UDPM: Upsampling Diffusion Probabilistic Models [33.51145642279836]
Denoising Diffusion Probabilistic Models (DDPM) have recently gained significant attention.
DDPMs generate high-quality samples from complex data distributions by defining an inverse process.
Unlike generative adversarial networks (GANs), the latent space of diffusion models is less interpretable.
In this work, we propose to generalize the denoising diffusion process into an Upsampling Diffusion Probabilistic Model (UDPM).
arXiv Detail & Related papers (2023-05-25T17:25:14Z) - Star-Shaped Denoising Diffusion Probabilistic Models [5.167803438665587]
We introduce Star-Shaped DDPM (SSDDPM).
Our implementation is available at https://github.com/andreyokhotin/star-shaped.
arXiv Detail & Related papers (2023-02-10T14:16:21Z) - Unsupervised Representation Learning from Pre-trained Diffusion Probabilistic Models [83.75414370493289]
Diffusion Probabilistic Models (DPMs) have shown a powerful capacity of generating high-quality image samples.
Diff-AE has been proposed to explore DPMs for representation learning via autoencoding.
We propose Pre-trained DPM AutoEncoding (PDAE) to adapt existing pre-trained DPMs to the decoders for image reconstruction.
arXiv Detail & Related papers (2022-12-26T02:37:38Z) - Denoising Diffusion Implicit Models [117.03720513930335]
We present denoising diffusion implicit models (DDIMs), a class of iterative implicit probabilistic models with the same training procedure as DDPMs.
DDIMs can produce high quality samples $10\times$ to $50\times$ faster in terms of wall-clock time compared to DDPMs.
arXiv Detail & Related papers (2020-10-06T06:15:51Z)