High-Fidelity Diffusion-based Image Editing
- URL: http://arxiv.org/abs/2312.15707v3
- Date: Thu, 4 Jan 2024 07:42:19 GMT
- Title: High-Fidelity Diffusion-based Image Editing
- Authors: Chen Hou, Guoqiang Wei, Zhibo Chen
- Abstract summary: The editing performance of diffusion models tends not to improve even as the number of denoising steps increases.
We propose an innovative framework where a rectifier module is incorporated to modulate diffusion model weights with residual features.
We introduce a novel learning paradigm aimed at minimizing error propagation during the editing process, which trains the editing procedure in a manner similar to denoising score-matching.
- Score: 19.85446433564999
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models have attained remarkable success in the domains of image
generation and editing. It is widely recognized that employing more inversion
and denoising steps in diffusion models leads to improved image reconstruction
quality. However, the editing performance of diffusion models tends not to
improve even as the number of denoising steps increases. The deficiency in
editing could be attributed to the conditional Markovian property of the
editing process, where errors accumulate throughout denoising steps. To tackle
this challenge, we first propose an innovative framework where a rectifier
module is incorporated to modulate diffusion model weights with residual
features, thereby providing compensatory information to bridge the fidelity
gap. Furthermore, we introduce a novel learning paradigm aimed at minimizing
error propagation during the editing process, which trains the editing
procedure in a manner similar to denoising score-matching. Extensive
experiments demonstrate that our proposed framework and training strategy
achieve high-fidelity reconstruction and editing results across various levels
of denoising steps, while exhibiting exceptional performance in terms of both
quantitative metrics and qualitative assessments. Moreover, we explore our
model's generalization through several applications like image-to-image
translation and out-of-domain image editing.
Related papers
- Schedule Your Edit: A Simple yet Effective Diffusion Noise Schedule for Image Editing [42.45138713525929]
Effective editing requires inverting the source image into a latent space, a process often hindered by prediction errors inherent in DDIM inversion.
We introduce the Logistic Schedule, a novel noise schedule designed to eliminate singularities, improve inversion stability, and provide a better noise space for image editing.
Our approach requires no additional retraining and is compatible with various existing editing methods.
arXiv Detail & Related papers (2024-10-24T14:07:02Z)
- Guide-and-Rescale: Self-Guidance Mechanism for Effective Tuning-Free Real Image Editing [42.73883397041092]
We propose a novel approach that is built upon a modified diffusion sampling process via the guidance mechanism.
In this work, we explore the self-guidance technique to preserve the overall structure of the input image.
We show through human evaluation and quantitative analysis that the proposed method produces the desired edits.
arXiv Detail & Related papers (2024-09-02T15:21:46Z)
- TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models [53.757752110493215]
We focus on a popular line of text-based editing frameworks - the "edit-friendly" DDPM-noise inversion approach.
We analyze its application to fast sampling methods and categorize its failures into two classes: the appearance of visual artifacts, and insufficient editing strength.
We propose a pseudo-guidance approach that efficiently increases the magnitude of edits without introducing new artifacts.
arXiv Detail & Related papers (2024-08-01T17:27:28Z)
- Zero-Shot Video Editing through Adaptive Sliding Score Distillation [51.57440923362033]
This study proposes a novel paradigm of video-based score distillation, facilitating direct manipulation of original video content.
We propose an Adaptive Sliding Score Distillation strategy, which incorporates both global and local video guidance to reduce the impact of editing errors.
arXiv Detail & Related papers (2024-06-07T12:33:59Z)
- ReNoise: Real Image Inversion Through Iterative Noising [62.96073631599749]
We introduce an inversion method with a high quality-to-operation ratio, enhancing reconstruction accuracy without increasing the number of operations.
We evaluate the performance of our ReNoise technique using various sampling algorithms and models, including recent accelerated diffusion models.
arXiv Detail & Related papers (2024-03-21T17:52:08Z)
- Diffusion Model-Based Image Editing: A Survey [46.244266782108234]
Denoising diffusion models have emerged as a powerful tool for various image generation and editing tasks.
We provide an exhaustive overview of existing methods using diffusion models for image editing.
To further evaluate the performance of text-guided image editing algorithms, we propose a systematic benchmark, EditEval.
arXiv Detail & Related papers (2024-02-27T14:07:09Z)
- Reconstruct-and-Generate Diffusion Model for Detail-Preserving Image Denoising [16.43285056788183]
We propose a novel approach called the Reconstruct-and-Generate Diffusion Model (RnG).
Our method leverages a reconstructive denoising network to recover the majority of the underlying clean signal.
It employs a diffusion algorithm to generate residual high-frequency details, thereby enhancing visual quality.
arXiv Detail & Related papers (2023-09-19T16:01:20Z)
- Gradpaint: Gradient-Guided Inpainting with Diffusion Models [71.47496445507862]
Denoising Diffusion Probabilistic Models (DDPMs) have recently achieved remarkable results in conditional and unconditional image generation.
We present GradPaint, which steers the generation towards a globally coherent image.
GradPaint generalizes well to diffusion models trained on various datasets, improving upon current state-of-the-art supervised and unsupervised methods.
arXiv Detail & Related papers (2023-09-18T09:36:24Z)
- Steerable Conditional Diffusion for Out-of-Distribution Adaptation in Medical Image Reconstruction [75.91471250967703]
We introduce a novel sampling framework called Steerable Conditional Diffusion.
This framework adapts the diffusion model, concurrently with image reconstruction, based solely on the information provided by the available measurement.
We achieve substantial enhancements in out-of-distribution performance across diverse imaging modalities.
arXiv Detail & Related papers (2023-08-28T08:47:06Z)
- Stimulating the Diffusion Model for Image Denoising via Adaptive Embedding and Ensembling [56.506240377714754]
We present a novel strategy called the Diffusion Model for Image Denoising (DMID).
Our strategy includes an adaptive embedding method that embeds the noisy image into a pre-trained unconditional diffusion model.
Our DMID strategy achieves state-of-the-art performance on both distortion-based and perception-based metrics.
arXiv Detail & Related papers (2023-07-08T14:59:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.