Related papers: Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations

Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations

URL: http://arxiv.org/abs/2410.10792v1
Date: Mon, 14 Oct 2024 17:56:24 GMT
Title: Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations
Authors: Litu Rout, Yujia Chen, Nataniel Ruiz, Constantine Caramanis, Sanjay Shakkottai, Wen-Sheng Chu,
Abstract summary: This paper addresses two key tasks: (i) inversion and (ii) editing of a real image using rectified flow models (such as Flux) Our inversion method allows for state-of-the-art performance in zero-shot inversion and editing, outperforming prior works in stroke-to-image synthesis and semantic image editing.
Score: 41.87051958934507
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Generative models transform random noise into images; their inversion aims to transform images back to structured noise for recovery and editing. This paper addresses two key tasks: (i) inversion and (ii) editing of a real image using stochastic equivalents of rectified flow models (such as Flux). Although Diffusion Models (DMs) have recently dominated the field of generative modeling for images, their inversion presents faithfulness and editability challenges due to nonlinearities in drift and diffusion. Existing state-of-the-art DM inversion approaches rely on training of additional parameters or test-time optimization of latent variables; both are expensive in practice. Rectified Flows (RFs) offer a promising alternative to diffusion models, yet their inversion has been underexplored. We propose RF inversion using dynamic optimal control derived via a linear quadratic regulator. We prove that the resulting vector field is equivalent to a rectified stochastic differential equation. Additionally, we extend our framework to design a stochastic sampler for Flux. Our inversion method allows for state-of-the-art performance in zero-shot inversion and editing, outperforming prior works in stroke-to-image synthesis and semantic image editing, with large-scale human evaluations confirming user preference.

Related papers

Accurate Latent Inversion for Generative Image Steganography via Rectified Flow [5.404219831398271]
Steganography based on diffusion models has attracted increasing attention due to its ability to generate high-quality images and exhibit strong robustness.<n>We propose textbfRF-Stego, a novel generative image steganography method that enables accurate latent inversion and significantly improves extraction accuracy.<n> Experimental results show RF-Stego outperforms state-of-the-art methods in terms of extraction accuracy, image quality, robustness, security and generation efficiency.
arXiv Detail & Related papers (2025-08-01T08:46:32Z)
Acoustic Waveform Inversion with Image-to-Image Schrödinger Bridges [0.0]
We introduce a conditional Image-to-Image Schr"odinger Bridge (c$textI2textSB$) framework to generate high-resolution samples.<n>Our experiments demonstrate that the proposed solution outperforms our reimplementation of conditional diffusion model.
arXiv Detail & Related papers (2025-06-18T10:55:26Z)
Solving Inverse Problems with FLAIR [59.02385492199431]
Flow-based latent generative models are able to generate images with remarkable quality, even enabling text-to-image generation.<n>We present FLAIR, a novel training free variational framework that leverages flow-based generative models as a prior for inverse problems.<n>Results on standard imaging benchmarks demonstrate that FLAIR consistently outperforms existing diffusion- and flow-based methods in terms of reconstruction quality and sample diversity.
arXiv Detail & Related papers (2025-06-03T09:29:47Z)
An Ordinary Differential Equation Sampler with Stochastic Start for Diffusion Bridge Models [13.00429687431982]
Diffusion bridge models initialize the generative process from corrupted images instead of pure Gaussian noise. Existing diffusion bridge models often rely on Differential Equation samplers, which result in slower inference speed. We propose a high-order ODE sampler with a start for diffusion bridge models. Our method is fully compatible with pretrained diffusion bridge models and requires no additional training.
arXiv Detail & Related papers (2024-12-28T03:32:26Z)
Steering Rectified Flow Models in the Vector Field for Controlled Image Generation [53.965218831845995]
Diffusion models (DMs) excel in photorealism, image editing, and solving inverse problems, aided by classifier-free guidance and image inversion techniques. Existing DM-based methods often require additional training, lack generalization to pretrained latent models, underperform, and demand significant computational resources due to extensive backpropagation through ODE solvers and inversion processes. We propose FlowChef, which leverages the vector field to steer the denoising trajectory for controlled image generation tasks, facilitated by gradient skipping. FlowChef significantly outperforms baselines in terms of performance, memory, and time requirements, achieving new state-of-the
arXiv Detail & Related papers (2024-11-27T19:04:40Z)
Oscillation Inversion: Understand the structure of Large Flow Model through the Lens of Inversion Method [60.88467353578118]
We show that a fixed-point-inspired iterative approach to invert real-world images does not achieve convergence, instead oscillating between distinct clusters. We introduce a simple and fast distribution transfer technique that facilitates image enhancement, stroke-based recoloring, as well as visual prompt-guided image editing.
arXiv Detail & Related papers (2024-11-17T17:45:37Z)
Sequential Posterior Sampling with Diffusion Models [15.028061496012924]
We propose a novel approach that models the transition dynamics to improve the efficiency of sequential diffusion posterior sampling in conditional image synthesis. We demonstrate the effectiveness of our approach on a real-world dataset of high frame rate cardiac ultrasound images. Our method opens up new possibilities for real-time applications of diffusion models in imaging and other domains requiring real-time inference.
arXiv Detail & Related papers (2024-09-09T07:55:59Z)
ReNoise: Real Image Inversion Through Iterative Noising [62.96073631599749]
We introduce an inversion method with a high quality-to-operation ratio, enhancing reconstruction accuracy without increasing the number of operations. We evaluate the performance of our ReNoise technique using various sampling algorithms and models, including recent accelerated diffusion models.
arXiv Detail & Related papers (2024-03-21T17:52:08Z)
Effective Real Image Editing with Accelerated Iterative Diffusion Inversion [6.335245465042035]
It is still challenging to edit and manipulate natural images with modern generative models. Existing approaches that have tackled the problem of inversion stability often incur in significant trade-offs in computational efficiency. We propose an Accelerated Iterative Diffusion Inversion method, dubbed AIDI, that significantly improves reconstruction accuracy with minimal additional overhead in space and time complexity.
arXiv Detail & Related papers (2023-09-10T01:23:05Z)
Hierarchical Integration Diffusion Model for Realistic Image Deblurring [71.76410266003917]
Diffusion models (DMs) have been introduced in image deblurring and exhibited promising performance. We propose the Hierarchical Integration Diffusion Model (HI-Diff), for realistic image deblurring. Experiments on synthetic and real-world blur datasets demonstrate that our HI-Diff outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-05-22T12:18:20Z)
A Variational Perspective on Solving Inverse Problems with Diffusion Models [101.831766524264]
Inverse tasks can be formulated as inferring a posterior distribution over data. This is however challenging in diffusion models since the nonlinear and iterative nature of the diffusion process renders the posterior intractable. We propose a variational approach that by design seeks to approximate the true posterior distribution.
arXiv Detail & Related papers (2023-05-07T23:00:47Z)
Reflected Diffusion Models [93.26107023470979]
We present Reflected Diffusion Models, which reverse a reflected differential equation evolving on the support of the data. Our approach learns the score function through a generalized score matching loss and extends key components of standard diffusion models.
arXiv Detail & Related papers (2023-04-10T17:54:38Z)
EDICT: Exact Diffusion Inversion via Coupled Transformations [13.996171129586731]
Finding an initial noise vector that produces an input image when fed into the diffusion process (known as inversion) is an important problem. We propose Exact Diffusion Inversion via Coupled Transformations (EDICT), an inversion method that draws inspiration from affine coupling layers. EDICT enables mathematically exact inversion of real and model-generated images by maintaining two coupled noise vectors.
arXiv Detail & Related papers (2022-11-22T18:02:49Z)

This list is automatically generated from the titles and abstracts of the papers in this site.