Visual Chain-of-Thought Diffusion Models
- URL: http://arxiv.org/abs/2303.16187v2
- Date: Wed, 21 Jun 2023 01:27:37 GMT
- Title: Visual Chain-of-Thought Diffusion Models
- Authors: William Harvey and Frank Wood
- Abstract summary: We propose to close the gap between conditional and unconditional models using a two-stage sampling procedure.
Doing so lets us leverage the power of conditional diffusion models on the unconditional generation task, which we show improves FID by 25-50% compared to standard unconditional generation.
- Score: 15.547439887203613
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent progress with conditional image diffusion models has been stunning,
and this holds true whether we are speaking about models conditioned on a text
description, a scene layout, or a sketch. Unconditional image diffusion models
are also improving but lag behind, as do diffusion models which are conditioned
on lower-dimensional features like class labels. We propose to close the gap
between conditional and unconditional models using a two-stage sampling
procedure. In the first stage we sample an embedding describing the semantic
content of the image. In the second stage we sample the image conditioned on
this embedding and then discard the embedding. Doing so lets us leverage the
power of conditional diffusion models on the unconditional generation task,
which we show improves FID by 25-50% compared to standard unconditional
generation.
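As a concrete illustration of the two-stage procedure, a minimal PyTorch sketch follows. Both stages are written as plain epsilon-prediction DDPM samplers; the toy networks, embedding dimension, image resolution, and noise schedule are illustrative placeholders, not the paper's actual architecture or hyperparameters.

```python
import torch

# Toy stand-ins for the two learned networks. Stage 1 models a semantic
# embedding of the image; stage 2 is an image diffusion model conditioned
# on that embedding. Shapes and architectures here are placeholders.
embedding_dim = 512
image_shape = (3, 64, 64)

embedding_denoiser = torch.nn.Sequential(  # predicts noise in the embedding
    torch.nn.Linear(embedding_dim + 1, 256), torch.nn.SiLU(),
    torch.nn.Linear(256, embedding_dim),
)
image_denoiser = torch.nn.Sequential(      # predicts noise in the flattened image
    torch.nn.Linear(3 * 64 * 64 + embedding_dim + 1, 1024), torch.nn.SiLU(),
    torch.nn.Linear(1024, 3 * 64 * 64),
)

# A simple linear beta schedule (illustrative, not the paper's).
T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

@torch.no_grad()
def ddpm_sample(denoiser, shape, cond=None):
    """Ancestral DDPM sampling with an epsilon-prediction network."""
    x = torch.randn(1, *shape)
    for t in reversed(range(T)):
        t_in = torch.full((1, 1), t / T)
        inp = x.flatten(1)
        if cond is not None:
            inp = torch.cat([inp, cond], dim=1)  # condition by concatenation
        eps = denoiser(torch.cat([inp, t_in], dim=1)).view(1, *shape)
        mean = (x - betas[t] / torch.sqrt(1.0 - alpha_bars[t]) * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise
    return x

# Stage 1: unconditionally sample a semantic embedding.
z = ddpm_sample(embedding_denoiser, (embedding_dim,))
# Stage 2: sample the image conditioned on the embedding, then discard z.
image = ddpm_sample(image_denoiser, image_shape, cond=z)
```

The point of the sketch is the control flow: unconditional generation is reduced to unconditionally sampling a low-dimensional semantic embedding and then running an ordinary conditional image diffusion model on it, with the embedding thrown away afterwards.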
Related papers
- Don't drop your samples! Coherence-aware training benefits Conditional diffusion [17.349357521783062]
Coherence-Aware Diffusion (CAD) is a novel method that integrates the coherence of conditional information into diffusion models.
We show that CAD is theoretically sound and empirically effective on various conditional generation tasks.
arXiv Detail & Related papers (2024-05-30T17:57:26Z)
- Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional Image Synthesis [62.07413805483241]
Steered Diffusion is a framework for zero-shot conditional image generation using a diffusion model trained for unconditional generation.
We present experiments using steered diffusion on several tasks including inpainting, colorization, text-guided semantic editing, and image super-resolution.
arXiv Detail & Related papers (2023-09-30T02:03:22Z)
- Conditional Generation from Unconditional Diffusion Models using Denoiser Representations [94.04631421741986]
We propose adapting pre-trained unconditional diffusion models to new conditions using the learned internal representations of the denoiser network.
We show that augmenting the Tiny ImageNet training set with synthetic images generated by our approach improves the classification accuracy of ResNet baselines by up to 8%.
arXiv Detail & Related papers (2023-06-02T20:09:57Z)
- ShiftDDPMs: Exploring Conditional Diffusion Models by Shifting Diffusion Trajectories [144.03939123870416]
We propose a novel conditional diffusion model by introducing conditions into the forward process.
We use extra latent space to allocate an exclusive diffusion trajectory for each condition based on some shifting rules.
We formulate our method, which we call ShiftDDPMs, and provide a unified point of view on existing related methods.
arXiv Detail & Related papers (2023-02-05T12:48:21Z)
- Bi-Noising Diffusion: Towards Conditional Diffusion Models with Generative Restoration Priors [64.24948495708337]
We introduce a new method that brings predicted samples to the training data manifold using a pretrained unconditional diffusion model.
We perform comprehensive experiments to demonstrate the effectiveness of our approach on super-resolution, colorization, turbulence removal, and image-deraining tasks.
arXiv Detail & Related papers (2022-12-14T17:26:35Z)
- ADIR: Adaptive Diffusion for Image Reconstruction [46.838084286784195]
We propose a conditional sampling scheme that exploits the prior learned by diffusion models.
We then combine it with a novel approach for adapting pretrained diffusion denoising networks to their input.
We show that our proposed 'adaptive diffusion for image reconstruction' approach achieves a significant improvement on super-resolution, deblurring, and text-based editing tasks.
arXiv Detail & Related papers (2022-12-06T18:39:58Z)
- On Distillation of Guided Diffusion Models [94.95228078141626]
We propose an approach to distilling classifier-free guided diffusion models into models that are fast to sample from.
For standard diffusion models trained in pixel space, our approach generates images visually comparable to those of the original model.
For diffusion models trained in the latent space (e.g., Stable Diffusion), our approach generates high-fidelity images using as few as 1 to 4 denoising steps (a brief guidance sketch follows this list).
arXiv Detail & Related papers (2022-10-06T18:03:56Z)
- D2C: Diffusion-Decoding Models for Few-shot Conditional Generation [109.68228014811443]
This paper describes Diffusion-Decoding models with Contrastive representations (D2C).
D2C uses a learned diffusion-based prior over latent representations to improve generation and contrastive self-supervised learning to improve representation quality.
On conditional image manipulation, D2C generations are two orders of magnitude faster to produce than StyleGAN2 ones and are preferred by 50-60% of human evaluators in a double-blind study.
arXiv Detail & Related papers (2021-06-12T16:32:30Z)
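For context on the distillation entry above: classifier-free guidance evaluates the denoiser twice per step, once with and once without the condition, and blends the two noise predictions with a guidance weight w; distilling such a model amounts to training a student whose single forward pass matches the guided teacher. A minimal sketch, assuming a generic epsilon-prediction interface denoiser(x_t, t, cond) and a simple mean-squared matching loss (both assumptions, not the cited paper's exact formulation):

```python
import torch

def guided_eps(denoiser, x_t, t, cond, w):
    """Classifier-free guided prediction:
    eps_tilde = (1 + w) * eps(x_t, t, cond) - w * eps(x_t, t, None)."""
    eps_cond = denoiser(x_t, t, cond)
    eps_uncond = denoiser(x_t, t, None)
    return (1.0 + w) * eps_cond - w * eps_uncond

def distillation_loss(student, teacher, x_t, t, cond, w):
    """Regress the student's single forward pass onto the guided teacher output."""
    with torch.no_grad():
        target = guided_eps(teacher, x_t, t, cond, w)
    return torch.mean((student(x_t, t, cond, w) - target) ** 2)
```

The guided combination is standard classifier-free guidance; how the distilled student consumes the guidance weight and how the number of sampling steps is reduced are design choices of the cited paper and are not reproduced here.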