Related papers: Improving Denoising Diffusion Models via Simultaneous Estimation of Image and Noise

URL: http://arxiv.org/abs/2310.17167v1
Date: Thu, 26 Oct 2023 05:43:07 GMT
Title: Improving Denoising Diffusion Models via Simultaneous Estimation of Image and Noise
Authors: Zhenkai Zhang, Krista A. Ehinger and Tom Drummond
Abstract summary: This paper introduces two key contributions aimed at improving the speed and quality of images generated through inverse diffusion processes. The first contribution involves reparameterizing the diffusion process in terms of the angle on a quarter-circular arc between the image and noise. The second contribution is to directly estimate both the image ($\mathbf{x}_0$) and noise ($\mathbf{\epsilon}$) using our network.
Score: 15.702941058218196
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper introduces two key contributions aimed at improving the speed and quality of images generated through inverse diffusion processes. The first contribution involves reparameterizing the diffusion process in terms of the angle on a quarter-circular arc between the image and noise, specifically setting the conventional $\displaystyle \sqrt{\bar{\alpha}}=\cos(\eta)$. This reparameterization eliminates two singularities and allows for the expression of diffusion evolution as a well-behaved ordinary differential equation (ODE). In turn, this allows higher order ODE solvers such as Runge-Kutta methods to be used effectively. The second contribution is to directly estimate both the image ($\mathbf{x}_0$) and noise ($\mathbf{\epsilon}$) using our network, which enables more stable calculations of the update step in the inverse diffusion steps, as accurate estimation of both the image and noise are crucial at different stages of the process. Together with these changes, our model achieves faster generation, with the ability to converge on high-quality images more quickly, and higher quality of the generated images, as measured by metrics such as Frechet Inception Distance (FID), spatial Frechet Inception Distance (sFID), precision, and recall.
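
The angle reparameterization in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes the standard forward process $\mathbf{x}_t = \sqrt{\bar{\alpha}}\,\mathbf{x}_0 + \sqrt{1-\bar{\alpha}}\,\mathbf{\epsilon}$, so that setting $\sqrt{\bar{\alpha}}=\cos(\eta)$ gives $\mathbf{x}_\eta = \cos(\eta)\,\mathbf{x}_0 + \sin(\eta)\,\mathbf{\epsilon}$ and $d\mathbf{x}_\eta/d\eta = -\sin(\eta)\,\mathbf{x}_0 + \cos(\eta)\,\mathbf{\epsilon}$. The names `forward_sample`, `velocity`, and `rk4_sample` are hypothetical, and `model` stands in for a network that returns estimates of both the image and the noise.

```python
import math

def forward_sample(x0, eps, eta):
    """Noisy sample at angle eta: x_eta = cos(eta) * x0 + sin(eta) * eps."""
    c, s = math.cos(eta), math.sin(eta)
    return [c * a + s * b for a, b in zip(x0, eps)]

def velocity(x0_hat, eps_hat, eta):
    """ODE right-hand side: d x_eta / d eta = -sin(eta) * x0 + cos(eta) * eps."""
    c, s = math.cos(eta), math.sin(eta)
    return [-s * a + c * b for a, b in zip(x0_hat, eps_hat)]

def rk4_sample(model, x, n_steps):
    """Integrate from eta = pi/2 (pure noise) to eta = 0 (image) with classical RK4.

    `model(x, eta)` is a hypothetical callable returning (x0_hat, eps_hat).
    """
    h = -(math.pi / 2) / n_steps  # negative step: eta decreases toward 0
    eta = math.pi / 2

    def f(e, y):
        x0_hat, eps_hat = model(y, e)
        return velocity(x0_hat, eps_hat, e)

    for _ in range(n_steps):
        k1 = f(eta, x)
        k2 = f(eta + h / 2, [xi + h / 2 * k for xi, k in zip(x, k1)])
        k3 = f(eta + h / 2, [xi + h / 2 * k for xi, k in zip(x, k2)])
        k4 = f(eta + h, [xi + h * k for xi, k in zip(x, k3)])
        x = [xi + h / 6 * (a + 2 * b + 2 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
        eta += h
    return x

if __name__ == "__main__":
    # Toy check with an "oracle" model that already knows x0 and eps.
    x0 = [0.5, -1.0, 2.0]
    eps = [1.5, 0.3, -0.7]
    oracle = lambda y, e: (x0, eps)
    start = forward_sample(x0, eps, math.pi / 2)  # essentially pure noise
    result = rk4_sample(oracle, start, n_steps=10)
    print(max(abs(a - b) for a, b in zip(result, x0)))
```

In this form the right-hand side contains no division by $\cos(\eta)$ or $\sin(\eta)$, so it stays finite at both ends of the arc. With a trained network in place of the oracle, the number of steps needed would depend on how accurate its two estimates are.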

Related papers

Diffusion Models for Solving Inverse Problems via Posterior Sampling with Piecewise Guidance [52.705112811734566]
A novel diffusion-based framework is introduced for solving inverse problems using a piecewise guidance scheme. The proposed method is problem-agnostic and readily adaptable to a variety of inverse problems. The framework achieves a reduction in inference time of 25% for inpainting with both random and center masks, and 23% and 24% for $4\times$ and $8\times$ super-resolution tasks.
arXiv Detail & Related papers (2025-07-22T19:35:14Z)
VIPaint: Image Inpainting with Pre-Trained Diffusion Models via Variational Inference [5.852077003870417]
We show that our VIPaint method significantly outperforms previous approaches in both the plausibility and diversity of imputations.
arXiv Detail & Related papers (2024-11-28T05:35:36Z)
There and Back Again: On the relation between Noise and Image Inversions in Diffusion Models [3.5707423185282665]
Inversion-based methods map each image back to its approximated starting noise. We show that latents exhibit structural patterns in the form of less diverse noise predicted for smooth image regions. We propose to replace the first DDIM Inversion steps with a forward diffusion process, which successfully decorrelates latent encodings.
arXiv Detail & Related papers (2024-10-31T00:30:35Z)
RDEIC: Accelerating Diffusion-Based Extreme Image Compression with Relay Residual Diffusion [29.277211609920155]
We present Relay Residual Diffusion Extreme Image Compression (RDEIC). We first use the compressed latent features of the image with added noise, instead of pure noise, as the starting point to eliminate the unnecessary initial stages of the denoising process. RDEIC achieves state-of-the-art visual quality and outperforms existing diffusion-based extreme image compression methods in both fidelity and efficiency.
arXiv Detail & Related papers (2024-10-03T16:24:20Z)
Immiscible Diffusion: Accelerating Diffusion Training with Noise Assignment [56.609042046176555]
Suboptimal noise-data mapping leads to slow training of diffusion models. Drawing inspiration from the immiscibility phenomenon in physics, we propose Immiscible Diffusion. Our approach is remarkably simple, requiring only one line of code to restrict the diffuse-able area for each image.
arXiv Detail & Related papers (2024-06-18T06:20:42Z)
ReNoise: Real Image Inversion Through Iterative Noising [62.96073631599749]
We introduce an inversion method with a high quality-to-operation ratio, enhancing reconstruction accuracy without increasing the number of operations. We evaluate the performance of our ReNoise technique using various sampling algorithms and models, including recent accelerated diffusion models.
arXiv Detail & Related papers (2024-03-21T17:52:08Z)
Arbitrary-Scale Image Generation and Upsampling using Latent Diffusion Model and Implicit Neural Decoder [29.924160271522354]
Super-resolution (SR) and image generation are important tasks in computer vision and are widely adopted in real-world applications. Most existing methods, however, generate images only at fixed-scale magnification and suffer from over-smoothing and artifacts. Most relevant work applied Implicit Neural Representation (INR) to the denoising diffusion model to obtain continuous-resolution yet diverse and high-quality SR results. We propose a novel pipeline that can super-resolve an input image or generate from a random noise a novel image at arbitrary scales.
arXiv Detail & Related papers (2024-03-15T12:45:40Z)
Clockwork Diffusion: Efficient Generation With Model-Step Distillation [42.01130983628078]
Clockwork Diffusion is a method that periodically reuses computation from preceding denoising steps to approximate low-res feature maps at one or more subsequent steps. For both text-to-image generation and image editing, we demonstrate that Clockwork leads to comparable or improved perceptual scores with drastically reduced computational complexity.
arXiv Detail & Related papers (2023-12-13T13:30:27Z)
Prompt-tuning latent diffusion models for inverse problems [72.13952857287794]
We propose a new method for solving imaging inverse problems using text-to-image latent diffusion models as general priors. Our method, called P2L, outperforms both image- and latent-diffusion model-based inverse problem solvers on a variety of tasks, such as super-resolution, deblurring, and inpainting.
arXiv Detail & Related papers (2023-10-02T11:31:48Z)
Simultaneous Image-to-Zero and Zero-to-Noise: Diffusion Models with Analytical Image Attenuation [53.04220377034574]
We propose incorporating an analytical image attenuation process into the forward diffusion process for high-quality (un)conditioned image generation. Our method represents the forward image-to-noise mapping as simultaneous image-to-zero mapping and zero-to-noise mapping. We have conducted experiments on unconditioned image generation, e.g., CIFAR-10 and CelebA-HQ-256, and image-conditioned downstream tasks such as super-resolution, saliency detection, edge detection, and image inpainting.
arXiv Detail & Related papers (2023-06-23T18:08:00Z)
Real-World Denoising via Diffusion Model [14.722529440511446]
Real-world image denoising aims to recover clean images from noisy images captured in natural environments. Diffusion models have achieved very promising results in the field of image generation, outperforming previous generation models. This paper proposes a novel general denoising diffusion model that can be used for real-world image denoising.
arXiv Detail & Related papers (2023-05-08T04:48:03Z)
A Variational Perspective on Solving Inverse Problems with Diffusion Models [101.831766524264]
Inverse tasks can be formulated as inferring a posterior distribution over data. This is however challenging in diffusion models since the nonlinear and iterative nature of the diffusion process renders the posterior intractable. We propose a variational approach that by design seeks to approximate the true posterior distribution.
arXiv Detail & Related papers (2023-05-07T23:00:47Z)
Representing Noisy Image Without Denoising [91.73819173191076]
Fractional-order Moments in Radon space (FMR) is designed to derive robust representation directly from noisy images. Unlike earlier integer-order methods, our work is a more generic design taking such classical methods as special cases.
arXiv Detail & Related papers (2023-01-18T10:13:29Z)
Progressive Deblurring of Diffusion Models for Coarse-to-Fine Image Synthesis [39.671396431940224]
Diffusion models have shown remarkable results in image synthesis by gradually removing noise and amplifying signals. We propose a novel generative process that synthesizes images in a coarse-to-fine manner. Experiments show that the proposed model outperforms the previous method in FID on LSUN bedroom and church datasets.
arXiv Detail & Related papers (2022-07-16T15:00:21Z)
Dynamic Dual-Output Diffusion Models [100.32273175423146]
Iterative denoising-based generation has been shown to be comparable in quality to other classes of generative models. A major drawback of this method is that it requires hundreds of iterations to produce a competitive result. Recent works have proposed solutions that allow for faster generation with fewer iterations, but the image quality gradually deteriorates.
arXiv Detail & Related papers (2022-03-08T11:20:40Z)

This list is automatically generated from the titles and abstracts of the papers on this site.