On Conditioning the Input Noise for Controlled Image Generation with
Diffusion Models
- URL: http://arxiv.org/abs/2205.03859v1
- Date: Sun, 8 May 2022 13:18:14 GMT
- Title: On Conditioning the Input Noise for Controlled Image Generation with
Diffusion Models
- Authors: Vedant Singh, Surgan Jandial, Ayush Chopra, Siddharth Ramesh, Balaji
Krishnamurthy, Vineeth N. Balasubramanian
- Abstract summary: Conditional image generation has paved the way for several breakthroughs in image editing, generating stock photos and 3-D object generation.
In this work, we explore techniques to condition diffusion models with carefully crafted input noise artifacts.
- Score: 27.472482893004862
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Conditional image generation has paved the way for several breakthroughs in
image editing, generating stock photos and 3-D object generation. This
continues to be a significant area of interest with the rise of new
state-of-the-art methods that are based on diffusion models. However, diffusion
models provide very little control over the generated image, which led to
subsequent works exploring techniques like classifier guidance, which provides a
way to trade off diversity with fidelity. In this work, we explore techniques
to condition diffusion models with carefully crafted input noise artifacts.
This allows generation of images conditioned on semantic attributes. This is
different from existing approaches that input Gaussian noise and further
introduce conditioning at the diffusion model's inference step. Our experiments
over several examples and conditional settings show the potential of our
approach.
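As a rough illustration of the distinction drawn in the abstract, the sketch below contrasts the usual starting point for sampling (pure Gaussian noise) with a crafted starting point. The abstract does not say how the noise artifacts are constructed, so `crafted_start` is a hypothetical stand-in: it applies the standard DDPM forward-noising formula to a reference image. The function names, the linear beta schedule, and the choice of reference are all assumptions made for illustration, not the authors' method.

```python
import math
import random


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bar, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def gaussian_start(num_pixels, rng):
    """Conventional sampler input: i.i.d. standard Gaussian noise."""
    return [rng.gauss(0.0, 1.0) for _ in range(num_pixels)]


def crafted_start(reference, alpha_bar_t, rng):
    """Hypothetical crafted input: a reference image noised with the DDPM
    forward formula x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps.
    This is an illustrative assumption, not the construction used in the paper."""
    signal = math.sqrt(alpha_bar_t)
    noise = math.sqrt(1.0 - alpha_bar_t)
    return [signal * x + noise * rng.gauss(0.0, 1.0) for x in reference]


rng = random.Random(0)
alpha_bar = make_alpha_bar()
reference = [1.0] * 64  # hypothetical flattened 8x8 reference image

x_plain = gaussian_start(len(reference), rng)                      # carries no condition
x_crafted = crafted_start(reference, alpha_bar[499], rng)          # retains some reference signal
```

In this sketch the condition lives entirely in the sampler's input: the same unconditional reverse process would be run from either `x_plain` or `x_crafted`, in contrast to guidance methods that modify each inference step.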
Related papers
- Oscillation Inversion: Understand the structure of Large Flow Model through the Lens of Inversion Method [60.88467353578118]
We show that a fixed-point-inspired iterative approach to invert real-world images does not achieve convergence, instead oscillating between distinct clusters.
We introduce a simple and fast distribution transfer technique that facilitates image enhancement, stroke-based recoloring, as well as visual prompt-guided image editing.
arXiv Detail & Related papers (2024-11-17T17:45:37Z)
- Merging and Splitting Diffusion Paths for Semantically Coherent Panoramas [33.334956022229846]
We propose the Merge-Attend-Diffuse operator, which can be plugged into different types of pretrained diffusion models used in a joint diffusion setting.
Specifically, we merge the diffusion paths, reprogramming self- and cross-attention to operate on the aggregated latent space.
Our method maintains compatibility with the input prompt and visual quality of the generated images while increasing their semantic coherence.
arXiv Detail & Related papers (2024-08-28T09:22:32Z)
- Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional
Image Synthesis [62.07413805483241]
Steered Diffusion is a framework for zero-shot conditional image generation using a diffusion model trained for unconditional generation.
We present experiments using steered diffusion on several tasks including inpainting, colorization, text-guided semantic editing, and image super-resolution.
arXiv Detail & Related papers (2023-09-30T02:03:22Z)
- DDRF: Denoising Diffusion Model for Remote Sensing Image Fusion [7.06521373423708]
The denoising diffusion model, as a generative model, has received a lot of attention in the field of image generation.
We introduce the diffusion model to the image fusion field, treating the image fusion task as image-to-image translation.
Our method can inspire other work and offer insight into how to better apply the diffusion model to image fusion tasks.
arXiv Detail & Related papers (2023-04-10T12:28:27Z)
- ShiftDDPMs: Exploring Conditional Diffusion Models by Shifting Diffusion
Trajectories [144.03939123870416]
We propose a novel conditional diffusion model by introducing conditions into the forward process.
We use extra latent space to allocate an exclusive diffusion trajectory for each condition based on some shifting rules.
We formulate our method, which we call ShiftDDPMs, and provide a unified point of view on existing related methods.
arXiv Detail & Related papers (2023-02-05T12:48:21Z)
- Image Embedding for Denoising Generative Models [0.0]
We focus on Denoising Diffusion Implicit Models due to the deterministic nature of their reverse diffusion process.
As a side result of our investigation, we gain a deeper insight into the structure of the latent space of diffusion models.
arXiv Detail & Related papers (2022-12-30T17:56:07Z)
- DAG: Depth-Aware Guidance with Denoising Diffusion Probabilistic Models [23.70476220346754]
We propose a novel guidance approach for diffusion models that uses estimated depth information derived from the rich intermediate representations of diffusion models.
Experiments and extensive ablation studies demonstrate the effectiveness of our method in guiding the diffusion models toward geometrically plausible image generation.
arXiv Detail & Related papers (2022-12-17T12:47:19Z)
- SinDiffusion: Learning a Diffusion Model from a Single Natural Image [159.4285444680301]
We present SinDiffusion, which leverages denoising diffusion models to capture the internal distribution of patches from a single natural image.
It is based on two core designs. First, SinDiffusion is trained with a single model at a single scale instead of multiple models with progressive growing of scales.
Second, we identify that a patch-level receptive field of the diffusion network is crucial and effective for capturing the image's patch statistics.
arXiv Detail & Related papers (2022-11-22T18:00:03Z)
- Diffusion Models in Vision: A Survey [80.82832715884597]
A diffusion model is a deep generative model that is based on two stages, a forward diffusion stage and a reverse diffusion stage.
Diffusion models are widely appreciated for the quality and diversity of the generated samples, despite their known computational burdens.
arXiv Detail & Related papers (2022-09-10T22:00:30Z)
- A Survey on Generative Diffusion Model [75.93774014861978]
Diffusion models are an emerging class of deep generative models.
They have certain limitations, including a time-consuming iterative generation process and confinement to high-dimensional Euclidean space.
This survey presents a plethora of advanced techniques aimed at enhancing diffusion models.
arXiv Detail & Related papers (2022-09-06T16:56:21Z)