Correcting Diffusion Generation through Resampling
- URL: http://arxiv.org/abs/2312.06038v1
- Date: Sun, 10 Dec 2023 23:35:13 GMT
- Title: Correcting Diffusion Generation through Resampling
- Authors: Yujian Liu, Yang Zhang, Tommi Jaakkola, Shiyu Chang
- Abstract summary: We propose a particle filtering framework that can reduce the distributional discrepancies between generated and ground-truth images.
Our method can effectively correct missing object errors and improve image quality in various image generation tasks.
- Score: 35.983033188407845
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite diffusion models' superior capabilities in modeling complex
distributions, there are still non-trivial distributional discrepancies between
generated and ground-truth images, which has resulted in several notable
problems in image generation, including missing object errors in text-to-image
generation and low image quality. Existing methods that attempt to address
these problems mostly fail to tackle their fundamental cause, namely the
distributional discrepancies, and hence achieve sub-optimal results. In this
paper, we propose a particle filtering framework
that can effectively address both problems by explicitly reducing the
distributional discrepancies. Specifically, our method relies on a set of
external guidance, including a small set of real images and a pre-trained
object detector, to gauge the distribution gap, and then designs the resampling
weights accordingly to correct the gap. Experiments show that our method can
effectively correct missing object errors and improve image quality in various
image generation tasks. Notably, our method outperforms the existing strongest
baseline by 5% in object occurrence and 1.0 in FID on MS-COCO. Our code is
publicly available at
https://github.com/UCSB-NLP-Chang/diffusion_resampling.git.
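The resampling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it only shows generic importance resampling, where each generated sample (particle) carries a weight, here standing in for the paper's detector- and real-image-based guidance, and the population is resampled in proportion to those weights so that low-weight samples are discarded and high-weight ones are duplicated. The particle values and weights below are toy placeholders.

```python
import numpy as np

def resample_particles(particles, weights, rng=None):
    """Multinomial resampling: draw a new population of the same size,
    picking each particle with probability proportional to its weight."""
    rng = np.random.default_rng() if rng is None else rng
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize weights into a probability distribution
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return [particles[i] for i in idx]

# Toy example: scalar "samples" with weights favoring the later ones,
# mimicking guidance that scores some generations as closer to the
# target distribution than others.
particles = [0.1, 0.5, 0.9, 1.3]
weights = [0.05, 0.15, 0.3, 0.5]
resampled = resample_particles(particles, weights,
                               rng=np.random.default_rng(0))
```

In the paper's setting the weights would be derived from external guidance (e.g., an object detector's scores and a small real-image set) rather than fixed constants as in this toy example.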
Related papers
- Rethinking Score Distillation as a Bridge Between Image Distributions [97.27476302077545]
We show that our method seeks to transport corrupted images (source) to the natural image distribution (target).
Our method can be easily applied across many domains, matching or beating the performance of specialized methods.
We demonstrate its utility in text-to-2D, text-based NeRF optimization, translating paintings to real images, optical illusion generation, and 3D sketch-to-real.
arXiv Detail & Related papers (2024-06-13T17:59:58Z)
- One-Step Effective Diffusion Network for Real-World Image Super-Resolution [11.326598938246558]
We propose a one-step effective diffusion network, namely OSEDiff, for the Real-ISR problem.
We apply variational score distillation in the latent space to conduct KL-divergence regularization.
Our experiments demonstrate that OSEDiff achieves comparable or even better Real-ISR results, in terms of both objective metrics and subjective evaluations.
arXiv Detail & Related papers (2024-06-12T13:10:31Z)
- Arbitrary-Scale Image Generation and Upsampling using Latent Diffusion Model and Implicit Neural Decoder [29.924160271522354]
Super-resolution (SR) and image generation are important tasks in computer vision and are widely adopted in real-world applications.
Most existing methods, however, generate images only at fixed-scale magnification and suffer from over-smoothing and artifacts.
The most relevant work applied Implicit Neural Representation (INR) to the denoising diffusion model to obtain continuous-resolution yet diverse and high-quality SR results.
We propose a novel pipeline that can super-resolve an input image or generate from a random noise a novel image at arbitrary scales.
arXiv Detail & Related papers (2024-03-15T12:45:40Z)
- Mitigating Data Consistency Induced Discrepancy in Cascaded Diffusion Models for Sparse-view CT Reconstruction [4.227116189483428]
This study introduces a novel Cascaded Diffusion with Discrepancy Mitigation framework.
It performs low-quality image generation in latent space and high-quality image generation in pixel space.
It minimizes computational costs by moving some inference steps from pixel space to latent space.
arXiv Detail & Related papers (2024-03-14T12:58:28Z)
- Denoising Diffusion Bridge Models [54.87947768074036]
Diffusion models are powerful generative models that map noise to data using stochastic processes.
For many applications such as image editing, the model input comes from a distribution that is not random noise.
In our work, we propose Denoising Diffusion Bridge Models (DDBMs)
arXiv Detail & Related papers (2023-09-29T03:24:24Z)
- Real-World Image Variation by Aligning Diffusion Inversion Chain [53.772004619296794]
A domain gap exists between generated images and real-world images, which poses a challenge in generating high-quality variations of real-world images.
We propose a novel inference pipeline called Real-world Image Variation by ALignment (RIVAL)
Our pipeline enhances the generation quality of image variations by aligning the image generation process to the source image's inversion chain.
arXiv Detail & Related papers (2023-05-30T04:09:47Z)
- SDM: Spatial Diffusion Model for Large Hole Image Inpainting [106.90795513361498]
We present a novel spatial diffusion model (SDM) that uses a few iterations to gradually deliver informative pixels to the entire image.
Also, thanks to the proposed decoupled probabilistic modeling and spatial diffusion scheme, our method achieves high-quality large-hole completion.
arXiv Detail & Related papers (2022-12-06T13:30:18Z)
- Deep Variational Network Toward Blind Image Restoration [60.45350399661175]
Blind image restoration is a common yet challenging problem in computer vision.
We propose a novel blind image restoration method, aiming to integrate the advantages of both.
Experiments on two typical blind IR tasks, namely image denoising and super-resolution, demonstrate that the proposed method achieves superior performance over the current state of the art.
arXiv Detail & Related papers (2020-08-25T03:30:53Z)
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently-used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.