Related papers: Effective Real Image Editing with Accelerated Iterative Diffusion Inversion

URL: http://arxiv.org/abs/2309.04907v1
Date: Sun, 10 Sep 2023 01:23:05 GMT
Title: Effective Real Image Editing with Accelerated Iterative Diffusion Inversion
Authors: Zhihong Pan, Riccardo Gherardi, Xiufeng Xie, Stephen Huang
Abstract summary: It is still challenging to edit and manipulate natural images with modern generative models. Existing approaches that have tackled the problem of inversion stability often incur significant trade-offs in computational efficiency. We propose an Accelerated Iterative Diffusion Inversion method, dubbed AIDI, that significantly improves reconstruction accuracy with minimal additional overhead in space and time complexity.
Score: 6.335245465042035
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Despite all recent progress, it is still challenging to edit and manipulate natural images with modern generative models. When using a Generative Adversarial Network (GAN), one major hurdle is the inversion process that maps a real image to its corresponding noise vector in the latent space, since it is necessary to be able to reconstruct an image in order to edit its contents. Likewise, for Denoising Diffusion Implicit Models (DDIM), the linearization assumption in each inversion step makes the whole deterministic inversion process unreliable. Existing approaches that have tackled the problem of inversion stability often incur significant trade-offs in computational efficiency. In this work we propose an Accelerated Iterative Diffusion Inversion method, dubbed AIDI, that significantly improves reconstruction accuracy with minimal additional overhead in space and time complexity. By using a novel blended guidance technique, we show that effective results can be obtained on a large range of image editing tasks without large classifier-free guidance in inversion. Furthermore, when compared with other diffusion inversion based works, our proposed process is shown to be more robust for fast image editing in the 10- and 20-step diffusion regimes.
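Illustration (not from the paper): the sketch below shows, on a toy problem, why the approximation made in each DDIM inversion step hurts reconstruction and how iterating each step helps. It assumes a hypothetical noise predictor `toy_eps` (a simple `tanh`, not a trained model), a made-up linear schedule `abar`, and plain fixed-point iteration. It is not the authors' AIDI implementation and omits both the acceleration and the blended guidance described in the abstract.

```python
import numpy as np

def toy_eps(x, t):
    """Hypothetical stand-in for a trained noise predictor (assumption, not a real model)."""
    return np.tanh(x + 0.05 * t)

def ddim_step(x_t, t, abar):
    """Deterministic DDIM sampling step: x_t -> x_{t-1}."""
    eps = toy_eps(x_t, t)
    x0_pred = (x_t - np.sqrt(1.0 - abar[t]) * eps) / np.sqrt(abar[t])
    return np.sqrt(abar[t - 1]) * x0_pred + np.sqrt(1.0 - abar[t - 1]) * eps

def invert_step(x_prev, t, abar, n_iter):
    """Solve x_prev = ddim_step(x_t) for x_t by fixed-point iteration.

    With n_iter=1 this is the usual one-shot approximation
    eps(x_t, t) ~= eps(x_{t-1}, t).
    """
    ratio = np.sqrt(abar[t] / abar[t - 1])
    coeff = np.sqrt(1.0 - abar[t]) - ratio * np.sqrt(1.0 - abar[t - 1])
    x_t = x_prev  # initial guess
    for _ in range(n_iter):
        x_t = ratio * x_prev + coeff * toy_eps(x_t, t)
    return x_t

def reconstruction_error(x0, abar, n_iter):
    """Invert x0 to the last timestep, sample back, and return the max abs error."""
    T = len(abar) - 1
    x = x0
    for t in range(1, T + 1):      # inversion: x_0 -> x_T
        x = invert_step(x, t, abar, n_iter)
    for t in range(T, 0, -1):      # sampling: x_T -> x_0
        x = ddim_step(x, t, abar)
    return float(np.max(np.abs(x - x0)))

T = 10                                   # few-step regime
abar = np.linspace(0.999, 0.05, T + 1)   # made-up cumulative alpha schedule
x0 = np.random.default_rng(0).standard_normal(8)

err_naive = reconstruction_error(x0, abar, n_iter=1)
err_iter = reconstruction_error(x0, abar, n_iter=30)
print(f"one-shot inversion error:  {err_naive:.3e}")
print(f"iterated inversion error:  {err_iter:.3e}")
```

In this toy setting the iteration converges because the coefficient multiplying the noise prediction is small relative to the predictor's sensitivity to its input. A real network gives no such guarantee, and each extra iteration costs one more network evaluation.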

Related papers

DCI: Dual-Conditional Inversion for Boosting Diffusion-Based Image Editing [73.12011187146481]
Inversion within Diffusion models aims to recover the latent noise representation for a real or generated image. Most inversion approaches suffer from an intrinsic trade-off between reconstruction accuracy and editing flexibility. We introduce Dual-Conditional Inversion (DCI), a novel framework that jointly conditions on the source prompt and reference image.
arXiv Detail & Related papers (2025-06-03T07:46:44Z)
MambaStyle: Efficient StyleGAN Inversion for Real Image Editing with State-Space Models [60.110274007388135]
MambaStyle is an efficient single-stage encoder-based approach for GAN inversion and editing. We show that MambaStyle achieves a superior balance among inversion accuracy, editing quality, and computational efficiency.
arXiv Detail & Related papers (2025-05-06T20:03:47Z)
Tight Inversion: Image-Conditioned Inversion for Real Image Editing [47.445919355293896]
We introduce Tight Inversion, an inversion method that utilizes the most possible condition -- the input image itself. This tight condition narrows the distribution of the model's output and enhances both reconstruction and editability.
arXiv Detail & Related papers (2025-02-27T18:51:16Z)
Taming Rectified Flow for Inversion and Editing [57.3742655030493]
Rectified-flow-based diffusion transformers, such as FLUX and OpenSora, have demonstrated exceptional performance in the field of image and video generation. Despite their robust generative capabilities, these models often suffer from inaccurate inversion, which could limit their effectiveness in downstream tasks such as image and video editing. We propose RF-Solver, a novel training-free sampler that enhances inversion precision by reducing errors in the process of solving rectified flow ODEs.
arXiv Detail & Related papers (2024-11-07T14:29:02Z)
ERDDCI: Exact Reversible Diffusion via Dual-Chain Inversion for High-Quality Image Editing [20.46262679357339]
Diffusion models (DMs) have been successfully applied to real image editing. Recent popular DMs often rely on the assumption of local linearization. ERDDCI uses the new Dual-Chain Inversion (DCI) for joint inference to derive an exact reversible diffusion process.
arXiv Detail & Related papers (2024-10-18T07:52:03Z)
Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations [41.87051958934507]
This paper addresses two key tasks: (i) inversion and (ii) editing of a real image using rectified flow models (such as Flux). Our inversion method allows for state-of-the-art performance in zero-shot inversion and editing, outperforming prior works in stroke-to-image synthesis and semantic image editing.
arXiv Detail & Related papers (2024-10-14T17:56:24Z)
Effective Diffusion Transformer Architecture for Image Super-Resolution [63.254644431016345]
We design an effective diffusion transformer for image super-resolution (DiT-SR). In practice, DiT-SR leverages an overall U-shaped architecture, and adopts a uniform isotropic design for all the transformer blocks. We analyze the limitation of the widely used AdaLN, and present a frequency-adaptive time-step conditioning module.
arXiv Detail & Related papers (2024-09-29T07:14:16Z)
Invertible Consistency Distillation for Text-Guided Image Editing in Around 7 Steps [24.372192691537897]
This work aims to enrich distilled text-to-image diffusion models with the ability to effectively encode real images into their latent space. We introduce invertible Consistency Distillation (iCD), a generalized consistency distillation framework that facilitates both high-quality image synthesis and accurate image encoding in only 3-4 inference steps. We demonstrate that iCD equipped with dynamic guidance may serve as a highly effective tool for zero-shot text-guided image editing, competing with more expensive state-of-the-art alternatives.
arXiv Detail & Related papers (2024-06-20T17:49:11Z)
ReNoise: Real Image Inversion Through Iterative Noising [62.96073631599749]
We introduce an inversion method with a high quality-to-operation ratio, enhancing reconstruction accuracy without increasing the number of operations. We evaluate the performance of our ReNoise technique using various sampling algorithms and models, including recent accelerated diffusion models.
arXiv Detail & Related papers (2024-03-21T17:52:08Z)
MirrorDiffusion: Stabilizing Diffusion Process in Zero-shot Image Translation by Prompts Redescription and Beyond [57.14128305383768]
We propose a prompt redescription strategy to realize a mirror effect between the source and reconstructed image in the diffusion model (MirrorDiffusion). MirrorDiffusion achieves superior performance over the state-of-the-art methods on zero-shot image translation benchmarks.
arXiv Detail & Related papers (2024-01-06T14:12:16Z)
Iterative Token Evaluation and Refinement for Real-World Super-Resolution [77.74289677520508]
Real-world image super-resolution (RWSR) is a long-standing problem as low-quality (LQ) images often have complex and unidentified degradations. We propose an Iterative Token Evaluation and Refinement (ITER) framework for RWSR. We show that ITER is easier to train than Generative Adversarial Networks (GANs) and more efficient than continuous diffusion models.
arXiv Detail & Related papers (2023-12-09T17:07:32Z)
EDICT: Exact Diffusion Inversion via Coupled Transformations [13.996171129586731]
Finding an initial noise vector that produces an input image when fed into the diffusion process (known as inversion) is an important problem. We propose Exact Diffusion Inversion via Coupled Transformations (EDICT), an inversion method that draws inspiration from affine coupling layers. EDICT enables mathematically exact inversion of real and model-generated images by maintaining two coupled noise vectors.
arXiv Detail & Related papers (2022-11-22T18:02:49Z)
Denoising Diffusion Restoration Models [110.1244240726802]
Denoising Diffusion Restoration Models (DDRM) is an efficient, unsupervised posterior sampling method. We demonstrate DDRM's versatility on several image datasets for super-resolution, deblurring, inpainting, and colorization.
arXiv Detail & Related papers (2022-01-27T20:19:07Z)

This list is automatically generated from the titles and abstracts of the papers on this site.