Delta Velocity Rectified Flow for Text-to-Image Editing
- URL: http://arxiv.org/abs/2509.05342v2
- Date: Tue, 09 Sep 2025 22:17:54 GMT
- Title: Delta Velocity Rectified Flow for Text-to-Image Editing
- Authors: Gaspard Beaudouin, Minghan Li, Jaeyeon Kim, Sung-Hoon Yoon, Mengyu Wang
- Abstract summary: We propose Delta Velocity Rectified Flow (DVRF), a novel inversion-free, path-aware editing framework for text-to-image editing. Experimental results indicate that DVRF achieves superior editing quality, fidelity, and controllability while requiring no architectural modifications.
- Score: 15.665085495430313
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose Delta Velocity Rectified Flow (DVRF), a novel inversion-free, path-aware editing framework within rectified flow models for text-to-image editing. DVRF is a distillation-based method that explicitly models the discrepancy between the source and target velocity fields in order to mitigate over-smoothing artifacts rampant in prior distillation sampling approaches. We further introduce a time-dependent shift term to push noisy latents closer to the target trajectory, enhancing the alignment with the target distribution. We theoretically demonstrate that when this shift is disabled, DVRF reduces to Delta Denoising Score, thereby bridging score-based diffusion optimization and velocity-based rectified-flow optimization. Moreover, when the shift term follows a linear schedule under rectified-flow dynamics, DVRF generalizes the Inversion-free method FlowEdit and provides a principled theoretical interpretation for it. Experimental results indicate that DVRF achieves superior editing quality, fidelity, and controllability while requiring no architectural modifications, making it efficient and broadly applicable to text-to-image editing tasks. Code is available at https://github.com/Harvard-AI-and-Robotics-Lab/DeltaVelocityRectifiedFlow.
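The abstract describes DVRF as an update that follows the difference (delta) between target- and source-conditioned velocity fields, plus a time-dependent shift toward the target trajectory; with the shift disabled it reduces to a Delta-Denoising-Score-style update, and a linear shift schedule recovers FlowEdit. A minimal sketch of that structure, using a stand-in velocity function (`v_theta`, the learning rate, and the shift schedule below are illustrative assumptions, not the paper's exact formulation):

```python
import numpy as np

rng = np.random.default_rng(0)

def v_theta(x, t, prompt):
    # Stand-in for a pretrained rectified-flow velocity network
    # (hypothetical; in practice this would be a model such as FLUX or SD3
    # conditioned on the source or target text prompt).
    scale = 1.0 if prompt == "source" else 0.9
    return scale * (0.01 * rng.standard_normal(x.shape) - x)

def dvrf_step(z_t, t, lr=0.1, shift=None):
    """One illustrative DVRF-style update: move the noisy latent along the
    difference between target and source velocity predictions, then apply an
    optional time-dependent shift toward the target trajectory."""
    delta_v = v_theta(z_t, t, "target") - v_theta(z_t, t, "source")
    z_next = z_t + lr * delta_v
    if shift is not None:
        # A linear schedule here would correspond to the FlowEdit-like case
        # discussed in the abstract; zero shift gives the DDS-like case.
        z_next = z_next + shift(t)
    return z_next

# Toy latent and a zero shift schedule (the DDS-reduction case).
z = rng.standard_normal((4, 4))
zero_shift = lambda t: 0.0 * t
for t in np.linspace(1.0, 0.0, 10):
    z = dvrf_step(z, t, shift=zero_shift)
print(z.shape)  # (4, 4)
```

This is only a schematic of the delta-velocity idea; the actual method operates on diffusion latents with text-conditioned velocity networks and a principled shift schedule derived in the paper.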
Related papers
- Free Lunch for Stabilizing Rectified Flow Inversion [11.80912018629953]
Rectified-Flow (RF)-based generative models have emerged as strong alternatives to traditional diffusion models. We propose Proximal-Mean Inversion (PMI), a training-free gradient correction method. We also introduce mimic-CFG, a lightweight velocity correction scheme for editing tasks.
arXiv Detail & Related papers (2026-02-12T11:42:36Z) - On Exact Editing of Flow-Based Diffusion Models [97.0633397035926]
We propose Conditioned Velocity Correction (CVC) to reformulate flow-based editing as a distribution transformation problem driven by a known source prior. CVC rethinks the role of velocity in inter-distribution transformation by introducing a dual-perspective velocity conversion mechanism. We show that CVC consistently achieves superior fidelity, better semantic alignment, and more reliable editing behavior across diverse tasks.
arXiv Detail & Related papers (2025-12-30T06:29:20Z) - Error-Propagation-Free Learned Video Compression With Dual-Domain Progressive Temporal Alignment [92.57576987521107]
We propose a novel unified transform framework with dual-domain progressive temporal alignment and a quality-conditioned mixture-of-experts (QCMoE). QCMoE allows continuous and consistent rate control with appealing R-D performance. Experimental results show that the proposed method achieves competitive R-D performance compared with the state of the art.
arXiv Detail & Related papers (2025-12-11T09:14:51Z) - NinA: Normalizing Flows in Action. Training VLA Models with Normalizing Flows [75.70583906344815]
Diffusion models have been widely adopted as action decoders due to their ability to model complex, multimodal action distributions. We present NinA, a fast and expressive alternative to diffusion-based decoders for Vision-Language-Action (VLA) models.
arXiv Detail & Related papers (2025-08-23T00:02:15Z) - Optimal Transport for Rectified Flow Image Editing: Unifying Inversion-Based and Direct Methods [0.34376560669160394]
Transport-based guidance can balance reconstruction accuracy and editing controllability across different rectified flow editing approaches. For inversion-based editing, our method achieves high-fidelity reconstruction with LPIPS scores of 0.001 and SSIM of 0.992 on face editing benchmarks. For inversion-free editing with FlowEdit on FLUX and Stable Diffusion 3, we demonstrate consistent improvements in semantic consistency and structure preservation.
arXiv Detail & Related papers (2025-08-04T12:50:58Z) - FlowAlign: Trajectory-Regularized, Inversion-Free Flow-based Image Editing [47.908940130654535]
FlowAlign is an inversion-free flow-based framework for consistent image editing with optimal control-based trajectory control. Our terminal point regularization is shown to balance semantic alignment with the edit prompt and structural consistency with the source image along the trajectory. FlowAlign outperforms existing methods in both source preservation and editing controllability.
arXiv Detail & Related papers (2025-05-29T06:33:16Z) - Taming Rectified Flow for Inversion and Editing [57.3742655030493]
Rectified-flow-based diffusion transformers like FLUX and OpenSora have demonstrated outstanding performance in the field of image and video generation. Despite their robust generative capabilities, these models often struggle with inaccuracies. We propose RF-r, a training-free sampler that effectively enhances inversion precision by mitigating the errors in the inversion process of rectified flow.
arXiv Detail & Related papers (2024-11-07T14:29:02Z) - FlowIE: Efficient Image Enhancement via Rectified Flow [71.6345505427213]
FlowIE is a flow-based framework that estimates straight-line paths from an elementary distribution to high-quality images.
Our contributions are rigorously validated through comprehensive experiments on synthetic and real-world datasets.
arXiv Detail & Related papers (2024-06-01T17:29:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.