Related papers: High-Fidelity Diffusion-based Image Editing

URL: http://arxiv.org/abs/2312.15707v3
Date: Thu, 4 Jan 2024 07:42:19 GMT
Title: High-Fidelity Diffusion-based Image Editing
Authors: Chen Hou, Guoqiang Wei, Zhibo Chen
Abstract summary: The editing performance of diffusion models tends not to improve even as the number of denoising steps increases. We propose an innovative framework where a rectifier module is incorporated to modulate diffusion model weights with residual features. We introduce a novel learning paradigm aimed at minimizing error propagation during the editing process, which trains the editing procedure in a manner similar to denoising score-matching.
Score: 19.85446433564999
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Diffusion models have attained remarkable success in the domains of image generation and editing. It is widely recognized that employing more inversion and denoising steps in diffusion models leads to improved image reconstruction quality. However, the editing performance of diffusion models tends not to improve even as the number of denoising steps increases. The deficiency in editing could be attributed to the conditional Markovian property of the editing process, where errors accumulate throughout denoising steps. To tackle this challenge, we first propose an innovative framework where a rectifier module is incorporated to modulate diffusion model weights with residual features, thereby providing compensatory information to bridge the fidelity gap. Furthermore, we introduce a novel learning paradigm aimed at minimizing error propagation during the editing process, which trains the editing procedure in a manner similar to denoising score-matching. Extensive experiments demonstrate that our proposed framework and training strategy achieve high-fidelity reconstruction and editing results across various numbers of denoising steps, while exhibiting exceptional performance in terms of both quantitative metrics and qualitative assessments. Moreover, we explore our model's generalization through several applications such as image-to-image translation and out-of-domain image editing.

Related papers

Uniform Attention Maps: Boosting Image Fidelity in Reconstruction and Editing [66.48853049746123]
We analyze reconstruction from a structural perspective and propose a novel approach that replaces traditional cross-attention with uniform attention maps. Our method effectively minimizes distortions caused by varying text conditions during noise prediction. Experimental results demonstrate that our approach not only excels in achieving high-fidelity image reconstruction but also performs robustly in real image composition and editing scenarios.
arXiv Detail & Related papers (2024-11-29T12:11:28Z)
Stable Flow: Vital Layers for Training-Free Image Editing [74.52248787189302]
Diffusion models have revolutionized the field of content synthesis and editing. Recent models have replaced the traditional UNet architecture with the Diffusion Transformer (DiT). We propose an automatic method to identify "vital layers" within DiT, crucial for image formation. Next, to enable real-image editing, we introduce an improved image inversion method for flow models.
arXiv Detail & Related papers (2024-11-21T18:59:51Z)
Schedule Your Edit: A Simple yet Effective Diffusion Noise Schedule for Image Editing [42.45138713525929]
Effective editing requires inverting the source image into a latent space, a process often hindered by prediction errors inherent in DDIM inversion. We introduce the Logistic Schedule, a novel noise schedule designed to eliminate singularities, improve inversion stability, and provide a better noise space for image editing. Our approach requires no additional retraining and is compatible with various existing editing methods.
arXiv Detail & Related papers (2024-10-24T14:07:02Z)
Guide-and-Rescale: Self-Guidance Mechanism for Effective Tuning-Free Real Image Editing [42.73883397041092]
We propose a novel approach built upon a modified diffusion sampling process via the guidance mechanism. In this work, we explore the self-guidance technique to preserve the overall structure of the input image. We show through human evaluation and quantitative analysis that the proposed method produces the desired edits.
arXiv Detail & Related papers (2024-09-02T15:21:46Z)
TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models [53.757752110493215]
We focus on a popular line of text-based editing frameworks: the "edit-friendly" DDPM-noise inversion approach. We analyze its application to fast sampling methods and categorize its failures into two classes: the appearance of visual artifacts, and insufficient editing strength. We propose a pseudo-guidance approach that efficiently increases the magnitude of edits without introducing new artifacts.
arXiv Detail & Related papers (2024-08-01T17:27:28Z)
Zero-Shot Video Editing through Adaptive Sliding Score Distillation [51.57440923362033]
This study proposes a novel paradigm of video-based score distillation, facilitating direct manipulation of original video content. We propose an Adaptive Sliding Score Distillation strategy, which incorporates both global and local video guidance to reduce the impact of editing errors.
arXiv Detail & Related papers (2024-06-07T12:33:59Z)
ReNoise: Real Image Inversion Through Iterative Noising [62.96073631599749]
We introduce an inversion method with a high quality-to-operation ratio, enhancing reconstruction accuracy without increasing the number of operations. We evaluate the performance of our ReNoise technique using various sampling algorithms and models, including recent accelerated diffusion models.
arXiv Detail & Related papers (2024-03-21T17:52:08Z)
Diffusion Model-Based Image Editing: A Survey [46.244266782108234]
Denoising diffusion models have emerged as a powerful tool for various image generation and editing tasks. We provide an exhaustive overview of existing methods using diffusion models for image editing. To further evaluate the performance of text-guided image editing algorithms, we propose a systematic benchmark, EditEval.
arXiv Detail & Related papers (2024-02-27T14:07:09Z)
Reconstruct-and-Generate Diffusion Model for Detail-Preserving Image Denoising [16.43285056788183]
We propose a novel approach called the Reconstruct-and-Generate Diffusion Model (RnG). Our method leverages a reconstructive denoising network to recover the majority of the underlying clean signal. It employs a diffusion algorithm to generate residual high-frequency details, thereby enhancing visual quality.
arXiv Detail & Related papers (2023-09-19T16:01:20Z)
Steerable Conditional Diffusion for Out-of-Distribution Adaptation in Medical Image Reconstruction [75.91471250967703]
We introduce a novel sampling framework called Steerable Conditional Diffusion. This framework adapts the diffusion model, concurrently with image reconstruction, based solely on the information provided by the available measurement. We achieve substantial enhancements in out-of-distribution performance across diverse imaging modalities.
arXiv Detail & Related papers (2023-08-28T08:47:06Z)
Stimulating Diffusion Model for Image Denoising via Adaptive Embedding and Ensembling [56.506240377714754]
We present a novel strategy called the Diffusion Model for Image Denoising (DMID). Our strategy includes an adaptive embedding method that embeds the noisy image into a pre-trained unconditional diffusion model. Our DMID strategy achieves state-of-the-art performance on both distortion-based and perception-based metrics.
arXiv Detail & Related papers (2023-07-08T14:59:41Z)

This list is automatically generated from the titles and abstracts of the papers on this site.