Effective Real Image Editing with Accelerated Iterative Diffusion
Inversion
- URL: http://arxiv.org/abs/2309.04907v1
- Date: Sun, 10 Sep 2023 01:23:05 GMT
- Title: Effective Real Image Editing with Accelerated Iterative Diffusion
Inversion
- Authors: Zhihong Pan, Riccardo Gherardi, Xiufeng Xie, Stephen Huang
- Abstract summary: It is still challenging to edit and manipulate natural images with modern generative models.
Existing approaches that have tackled the problem of inversion stability often incur in significant trade-offs in computational efficiency.
We propose an Accelerated Iterative Diffusion Inversion method, dubbed AIDI, that significantly improves reconstruction accuracy with minimal additional overhead in space and time complexity.
- Score: 6.335245465042035
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Despite all recent progress, it is still challenging to edit and manipulate
natural images with modern generative models. When using Generative Adversarial
Network (GAN), one major hurdle is in the inversion process mapping a real
image to its corresponding noise vector in the latent space, since its
necessary to be able to reconstruct an image to edit its contents. Likewise for
Denoising Diffusion Implicit Models (DDIM), the linearization assumption in
each inversion step makes the whole deterministic inversion process unreliable.
Existing approaches that have tackled the problem of inversion stability often
incur in significant trade-offs in computational efficiency. In this work we
propose an Accelerated Iterative Diffusion Inversion method, dubbed AIDI, that
significantly improves reconstruction accuracy with minimal additional overhead
in space and time complexity. By using a novel blended guidance technique, we
show that effective results can be obtained on a large range of image editing
tasks without large classifier-free guidance in inversion. Furthermore, when
compared with other diffusion inversion based works, our proposed process is
shown to be more robust for fast image editing in the 10 and 20 diffusion
steps' regimes.
Related papers
- Taming Rectified Flow for Inversion and Editing [57.3742655030493]
Rectified-flow-based diffusion transformers, such as FLUX and OpenSora, have demonstrated exceptional performance in the field of image and video generation.
Despite their robust generative capabilities, these models often suffer from inaccurate inversion, which could limit their effectiveness in downstream tasks such as image and video editing.
We propose RF-r, a novel training-free sampler that enhances inversion precision by reducing errors in the process of solving rectified flow ODEs.
arXiv Detail & Related papers (2024-11-07T14:29:02Z) - ERDDCI: Exact Reversible Diffusion via Dual-Chain Inversion for High-Quality Image Editing [20.46262679357339]
Diffusion models (DMs) have been successfully applied to real image editing.
Recent popular DMs often rely on the assumption of local linearization.
ERDDCI uses the new Dual-Chain Inversion (DCI) for joint inference to derive an exact reversible diffusion process.
arXiv Detail & Related papers (2024-10-18T07:52:03Z) - Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations [41.87051958934507]
This paper addresses two key tasks: (i) inversion and (ii) editing of a real image using rectified flow models (such as Flux)
Our inversion method allows for state-of-the-art performance in zero-shot inversion and editing, outperforming prior works in stroke-to-image synthesis and semantic image editing.
arXiv Detail & Related papers (2024-10-14T17:56:24Z) - Effective Diffusion Transformer Architecture for Image Super-Resolution [63.254644431016345]
We design an effective diffusion transformer for image super-resolution (DiT-SR)
In practice, DiT-SR leverages an overall U-shaped architecture, and adopts a uniform isotropic design for all the transformer blocks.
We analyze the limitation of the widely used AdaLN, and present a frequency-adaptive time-step conditioning module.
arXiv Detail & Related papers (2024-09-29T07:14:16Z) - Invertible Consistency Distillation for Text-Guided Image Editing in Around 7 Steps [24.372192691537897]
This work aims to enrich distilled text-to-image diffusion models with the ability to effectively encode real images into their latent space.
We introduce invertible Consistency Distillation (iCD), a generalized consistency distillation framework that facilitates both high-quality image synthesis and accurate image encoding in only 3-4 inference steps.
We demonstrate that iCD equipped with dynamic guidance may serve as a highly effective tool for zero-shot text-guided image editing, competing with more expensive state-of-the-art alternatives.
arXiv Detail & Related papers (2024-06-20T17:49:11Z) - ReNoise: Real Image Inversion Through Iterative Noising [62.96073631599749]
We introduce an inversion method with a high quality-to-operation ratio, enhancing reconstruction accuracy without increasing the number of operations.
We evaluate the performance of our ReNoise technique using various sampling algorithms and models, including recent accelerated diffusion models.
arXiv Detail & Related papers (2024-03-21T17:52:08Z) - MirrorDiffusion: Stabilizing Diffusion Process in Zero-shot Image
Translation by Prompts Redescription and Beyond [57.14128305383768]
We propose a prompt redescription strategy to realize a mirror effect between the source and reconstructed image in the diffusion model (MirrorDiffusion)
MirrorDiffusion achieves superior performance over the state-of-the-art methods on zero-shot image translation benchmarks.
arXiv Detail & Related papers (2024-01-06T14:12:16Z) - Iterative Token Evaluation and Refinement for Real-World
Super-Resolution [77.74289677520508]
Real-world image super-resolution (RWSR) is a long-standing problem as low-quality (LQ) images often have complex and unidentified degradations.
We propose an Iterative Token Evaluation and Refinement framework for RWSR.
We show that ITER is easier to train than Generative Adversarial Networks (GANs) and more efficient than continuous diffusion models.
arXiv Detail & Related papers (2023-12-09T17:07:32Z) - EDICT: Exact Diffusion Inversion via Coupled Transformations [13.996171129586731]
Finding an initial noise vector that produces an input image when fed into the diffusion process (known as inversion) is an important problem.
We propose Exact Diffusion Inversion via Coupled Transformations (EDICT), an inversion method that draws inspiration from affine coupling layers.
EDICT enables mathematically exact inversion of real and model-generated images by maintaining two coupled noise vectors.
arXiv Detail & Related papers (2022-11-22T18:02:49Z) - Denoising Diffusion Restoration Models [110.1244240726802]
Denoising Diffusion Restoration Models (DDRM) is an efficient, unsupervised posterior sampling method.
We demonstrate DDRM's versatility on several image datasets for super-resolution, deblurring, inpainting, and colorization.
arXiv Detail & Related papers (2022-01-27T20:19:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.