Optimal Transport for Rectified Flow Image Editing: Unifying Inversion-Based and Direct Methods
- URL: http://arxiv.org/abs/2508.02363v2
- Date: Sat, 20 Sep 2025 11:27:13 GMT
- Title: Optimal Transport for Rectified Flow Image Editing: Unifying Inversion-Based and Direct Methods
- Authors: Marian Lupascu, Mihai-Sorin Stupariu
- Abstract summary: Transport-based guidance can balance reconstruction accuracy and editing controllability across different rectified flow editing approaches. For inversion-based editing, our method achieves high-fidelity reconstruction with LPIPS scores of 0.001 and SSIM of 0.992 on face editing benchmarks. For inversion-free editing with FlowEdit on FLUX and Stable Diffusion 3, we demonstrate consistent improvements in semantic consistency and structure preservation.
- Score: 0.34376560669160394
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image editing in rectified flow models remains challenging due to the fundamental trade-off between reconstruction fidelity and editing flexibility. While inversion-based methods suffer from trajectory deviation, recent inversion-free approaches like FlowEdit offer direct editing pathways but can benefit from additional guidance to improve structure preservation. In this work, we demonstrate that optimal transport theory provides a unified framework for improving both paradigms in rectified flow editing. We introduce a zero-shot transport-guided inversion framework that leverages optimal transport during the reverse diffusion process, and extend optimal transport principles to enhance inversion-free methods through transport-optimized velocity field corrections. Incorporating transport-based guidance can effectively balance reconstruction accuracy and editing controllability across different rectified flow editing approaches. For inversion-based editing, our method achieves high-fidelity reconstruction with LPIPS scores of 0.001 and SSIM of 0.992 on face editing benchmarks, with 7.8% to 12.9% improvements over RF-Inversion on LSUN datasets. For inversion-free editing with FlowEdit on FLUX and Stable Diffusion 3, we demonstrate consistent improvements in semantic consistency and structure preservation across diverse editing scenarios. Our semantic face editing experiments show an 11.2% improvement in identity preservation and enhanced perceptual quality. The unified optimal transport framework produces visually compelling edits with superior detail preservation across both inversion-based and direct editing paradigms. Code is available for RF-Inversion and FlowEdit at: https://github.com/marianlupascu/OT-RF
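The core idea above, a velocity-field correction that pulls the editing trajectory toward the straight-line (optimal-transport) path to the source latent, can be sketched minimally as follows. This is an illustrative assumption of how such a correction might look, not the paper's exact formulation; the blending weight `lam` and the displacement term `(x_src - x)` are hypothetical choices.

```python
import numpy as np

def euler_sample(x, velocity, steps=50):
    """Euler integration of the rectified-flow ODE dx/dt = v(x, t) over t in [0, 1]."""
    dt = 1.0 / steps
    for i in range(steps):
        x = x + dt * velocity(x, i * dt)
    return x

def transport_guided_velocity(v_model, x_src, lam=0.2):
    """Wrap a model velocity field with a transport-style correction:
    the displacement (x_src - x) nudges the trajectory toward the
    straight-line path to the source latent, trading editing freedom
    for structure preservation via the weight `lam`."""
    def v(x, t):
        return v_model(x, t) + lam * (x_src - x)
    return v
```

With a larger `lam` the sampled point stays closer to the source latent, which mirrors the fidelity/controllability trade-off the abstract describes.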
Related papers
- Free Lunch for Stabilizing Rectified Flow Inversion [11.80912018629953]
Rectified-Flow (RF)-based generative models have emerged as strong alternatives to traditional diffusion models. We propose Proximal-Mean Inversion (PMI), a training-free gradient correction method. We also introduce mimic-CFG, a lightweight velocity correction scheme for editing tasks.
arXiv Detail & Related papers (2026-02-12T11:42:36Z) - FlowBypass: Rectified Flow Trajectory Bypass for Training-Free Image Editing [10.304374060580828]
Training-free image editing has attracted increasing attention for its efficiency and independence from training data. Previous attempts to address this issue typically employ backbone-specific feature manipulations, limiting general applicability. We propose FlowBypass, a novel and analytical framework grounded in Rectified Flow that constructs a bypass directly connecting inversion and reconstruction trajectories.
arXiv Detail & Related papers (2026-02-02T08:37:00Z) - SNR-Edit: Structure-Aware Noise Rectification for Inversion-Free Flow-Based Editing [18.5465888954825]
Inversion-free image editing using flow-based generative models challenges the prevailing inversion-based pipelines. We introduce SNR-Edit, a training-free framework achieving faithful Latent Trajectory Correction via adaptive noise control.
arXiv Detail & Related papers (2026-01-27T04:24:21Z) - On Exact Editing of Flow-Based Diffusion Models [97.0633397035926]
We propose Conditioned Velocity Correction (CVC) to reformulate flow-based editing as a distribution transformation problem driven by a known source prior. CVC rethinks the role of velocity in inter-distribution transformation by introducing a dual-perspective velocity conversion mechanism. We show that CVC consistently achieves superior fidelity, better semantic alignment, and more reliable editing behavior across diverse tasks.
arXiv Detail & Related papers (2025-12-30T06:29:20Z) - Fine-tuning Done Right in Model Editing [83.79661791576103]
Fine-tuning, a foundational method for adapting large language models, has long been considered ineffective for model editing. We restore fine-tuning to the standard breadth-first (i.e., epoch-based) pipeline with mini-batch optimization. We derive LocFT-BF, a simple and effective localized editing method built on the restored fine-tuning framework.
arXiv Detail & Related papers (2025-09-26T08:53:13Z) - Delta Velocity Rectified Flow for Text-to-Image Editing [15.665085495430313]
We propose Delta Velocity Rectified Flow (DVRF), a novel inversion-free, path-aware editing framework for text-to-image editing. Experimental results indicate that DVRF achieves superior editing quality, fidelity, and controllability while requiring no architectural modifications.
arXiv Detail & Related papers (2025-09-01T21:51:24Z) - Inverse-and-Edit: Effective and Fast Image Editing by Cycle Consistency Models [1.9389881806157316]
In this work, we propose a novel framework that enhances image inversion using consistency models. Our method introduces a cycle-consistency optimization strategy that significantly improves reconstruction accuracy. We achieve state-of-the-art performance across various image editing tasks and datasets.
arXiv Detail & Related papers (2025-06-23T20:34:43Z) - FlowAlign: Trajectory-Regularized, Inversion-Free Flow-based Image Editing [47.908940130654535]
FlowAlign is an inversion-free flow-based framework for consistent image editing that regularizes the trajectory using optimal control. Our terminal point regularization is shown to balance semantic alignment with the edit prompt and structural consistency with the source image along the trajectory. FlowAlign outperforms existing methods in both source preservation and editing controllability.
arXiv Detail & Related papers (2025-05-29T06:33:16Z) - MambaStyle: Efficient StyleGAN Inversion for Real Image Editing with State-Space Models [60.110274007388135]
MambaStyle is an efficient single-stage encoder-based approach for GAN inversion and editing. We show that MambaStyle achieves a superior balance among inversion accuracy, editing quality, and computational efficiency.
arXiv Detail & Related papers (2025-05-06T20:03:47Z) - Uniform Attention Maps: Boosting Image Fidelity in Reconstruction and Editing [66.48853049746123]
We analyze reconstruction from a structural perspective and propose a novel approach that replaces traditional cross-attention with uniform attention maps. Our method effectively minimizes distortions caused by varying text conditions during noise prediction. Experimental results demonstrate that our approach not only excels in achieving high-fidelity image reconstruction but also performs robustly in real image composition and editing scenarios.
arXiv Detail & Related papers (2024-11-29T12:11:28Z) - Taming Rectified Flow for Inversion and Editing [57.3742655030493]
Rectified-flow-based diffusion transformers like FLUX and OpenSora have demonstrated outstanding performance in the field of image and video generation. Despite their robust generative capabilities, these models often struggle with inversion inaccuracies. We propose RF-Solver, a training-free sampler that effectively enhances inversion precision by mitigating the errors in the inversion process of rectified flow.
arXiv Detail & Related papers (2024-11-07T14:29:02Z) - PostEdit: Posterior Sampling for Efficient Zero-Shot Image Editing [63.38854614997581]
We introduce PostEdit, a method that incorporates a posterior scheme to govern the diffusion sampling process. The proposed PostEdit achieves state-of-the-art editing performance while accurately preserving unedited regions. The method is both inversion- and training-free, necessitating approximately 1.5 seconds and 18 GB of GPU memory to generate high-quality results.
arXiv Detail & Related papers (2024-10-07T09:04:50Z) - Learning Efficient and Effective Trajectories for Differential Equation-based Image Restoration [59.744840744491945]
In this paper, we reformulate the trajectory optimization of this kind of method, focusing on enhancing both reconstruction quality and efficiency. To mitigate the considerable computational burden associated with iterative sampling, we propose cost-aware trajectory distillation. We fine-tune a foundational diffusion model (FLUX) with 12B parameters by using our algorithms, producing a unified framework for handling 7 kinds of image restoration tasks.
arXiv Detail & Related papers (2024-10-07T07:46:08Z) - Task-Oriented Diffusion Inversion for High-Fidelity Text-based Editing [60.730661748555214]
We introduce Task-Oriented Diffusion Inversion (TODInv), a novel framework that inverts and edits real images tailored to specific editing tasks.
TODInv seamlessly integrates inversion and editing through reciprocal optimization, ensuring both high fidelity and precise editability.
arXiv Detail & Related papers (2024-08-23T22:16:34Z) - Residual-Conditioned Optimal Transport: Towards Structure-Preserving Unpaired and Paired Image Restoration [42.01716967725075]
We propose a novel Residual-Conditioned Optimal Transport (RCOT) approach for image restoration.
By duality, the RCOT problem is transformed into a minimax optimization problem, which can be solved by adversarially training neural networks.
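For context, the minimax problem referred to above typically takes the (semi-)dual Kantorovich form used in adversarial neural optimal transport; this is a generic sketch of that formulation, not necessarily RCOT's exact objective:

```latex
\min_{T}\,\max_{\varphi}\;
\mathbb{E}_{x\sim\mu}\!\left[c\bigl(x,\,T(x)\bigr)-\varphi\bigl(T(x)\bigr)\right]
+\mathbb{E}_{y\sim\nu}\!\left[\varphi(y)\right]
```

where $T$ is the transport map (here, the restoration network), $\varphi$ an adversarially trained Kantorovich potential, $c$ the transport cost, and $\mu,\nu$ the degraded and clean image distributions.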
arXiv Detail & Related papers (2024-05-05T08:19:04Z) - ReNoise: Real Image Inversion Through Iterative Noising [62.96073631599749]
We introduce an inversion method with a high quality-to-operation ratio, enhancing reconstruction accuracy without increasing the number of operations.
We evaluate the performance of our ReNoise technique using various sampling algorithms and models, including recent accelerated diffusion models.
arXiv Detail & Related papers (2024-03-21T17:52:08Z) - DGNet: Dynamic Gradient-Guided Network for Water-Related Optics Image Enhancement [77.0360085530701]
Underwater image enhancement (UIE) is a challenging task due to the complex degradation caused by underwater environments.
Previous methods often idealize the degradation process, and neglect the impact of medium noise and object motion on the distribution of image features.
Our approach utilizes predicted images to dynamically update pseudo-labels, adding a dynamic gradient to optimize the network's gradient space.
arXiv Detail & Related papers (2023-12-12T06:07:21Z) - Adaptive Image Registration: A Hybrid Approach Integrating Deep Learning and Optimization Functions for Enhanced Precision [13.242184146186974]
We propose a single framework for image registration based on deep neural networks and optimization.
We show improvements of up to 1.6% on test data, while maintaining the same inference time, and a substantial gain of 1.0 percentage points in deformation field smoothness.
arXiv Detail & Related papers (2023-11-27T02:48:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.