StableDrag: Stable Dragging for Point-based Image Editing
- URL: http://arxiv.org/abs/2403.04437v1
- Date: Thu, 7 Mar 2024 12:11:02 GMT
- Title: StableDrag: Stable Dragging for Point-based Image Editing
- Authors: Yutao Cui, Xiaotong Zhao, Guozhen Zhang, Shengming Cao, Kai Ma and
Limin Wang
- Abstract summary: Point-based image editing has attracted remarkable attention since the emergence of DragGAN.
Recently, DragDiffusion further pushes forward the generative quality via adapting this dragging technique to diffusion models.
We build a stable and precise drag-based editing framework, coined as StableDrag, by designing a discriminative point tracking method and a confidence-based latent enhancement strategy for motion supervision.
- Score: 24.924112878074336
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Point-based image editing has attracted remarkable attention since the
emergence of DragGAN. Recently, DragDiffusion further pushes forward the
generative quality via adapting this dragging technique to diffusion models.
Despite this great success, the dragging scheme exhibits two major drawbacks,
namely inaccurate point tracking and incomplete motion supervision, which may
result in unsatisfactory dragging outcomes. To tackle these issues, we build a
stable and precise drag-based editing framework, coined as StableDrag, by
designing a discriminative point tracking method and a confidence-based latent
enhancement strategy for motion supervision. The former allows us to precisely
locate the updated handle points, thereby boosting the stability of long-range
manipulation, while the latter is responsible for guaranteeing the optimized
latent as high-quality as possible across all the manipulation steps. Thanks to
these unique designs, we instantiate two types of image editing models
including StableDrag-GAN and StableDrag-Diff, which attain more stable
dragging performance, as demonstrated through extensive qualitative experiments
and quantitative assessment on DragBench.
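To make the point tracking issue concrete, the baseline tracking step that StableDrag's discriminative tracker improves upon can be illustrated with a minimal sketch: after each motion-supervision step, the handle point is re-located by searching a small window of the feature map for the position whose feature best matches a template feature, and the best similarity can serve as a tracking confidence like the one StableDrag uses to gate its latent enhancement. This is only an illustrative NumPy sketch; the function name, window radius, and cosine-similarity choice are assumptions, not the paper's actual implementation.

```python
import numpy as np

def track_handle_point(feature_map, template, prev_pos, radius=3):
    """Re-locate a handle point by nearest-feature search in a local
    window around its previous position. Returns the new (y, x)
    position and the best cosine similarity as a confidence score.
    Illustrative sketch only; not StableDrag's exact tracker."""
    H, W, _ = feature_map.shape
    y0, x0 = prev_pos
    best_score, best_pos = -np.inf, prev_pos
    for y in range(max(0, y0 - radius), min(H, y0 + radius + 1)):
        for x in range(max(0, x0 - radius), min(W, x0 + radius + 1)):
            f = feature_map[y, x]
            # cosine similarity between candidate feature and template
            score = float(f @ template) / (
                float(np.linalg.norm(f) * np.linalg.norm(template)) + 1e-8
            )
            if score > best_score:
                best_score, best_pos = score, (y, x)
    return best_pos, best_score

# toy usage: plant the template at (5, 6) and recover it from a nearby guess
rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 8, 16))
template = feat[5, 6].copy()
pos, conf = track_handle_point(feat, template, prev_pos=(4, 5), radius=2)
```

A low `conf` value signals that the handle point was likely lost, which is exactly the situation StableDrag's confidence-based latent enhancement is designed to handle.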
Related papers
- Localize, Understand, Collaborate: Semantic-Aware Dragging via Intention Reasoner [8.310002338000954]
Current methods typically model this problem as automatically learning "how to drag" through point dragging.
We propose LucidDrag, which shifts the focus from "how to drag" to a "what-then-how" paradigm.
arXiv Detail & Related papers (2024-06-01T13:10:43Z)
- MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion [94.66090422753126]
MotionFollower is a lightweight score-guided diffusion model for video motion editing.
It delivers superior motion editing performance and exclusively supports large camera movements and actions.
Compared with MotionEditor, the most advanced motion editing model, MotionFollower achieves an approximately 80% reduction in GPU memory.
arXiv Detail & Related papers (2024-05-30T17:57:30Z) - InstaDrag: Lightning Fast and Accurate Drag-based Image Editing Emerging from Videos [101.59710862476041]
We present InstaDrag, a rapid approach enabling high-quality drag-based image editing in 1 second.
Unlike most previous methods, we redefine drag-based editing as a conditional generation task.
Our approach can significantly outperform previous methods in terms of accuracy and consistency.
arXiv Detail & Related papers (2024-05-22T15:14:00Z) - DragGaussian: Enabling Drag-style Manipulation on 3D Gaussian Representation [57.406031264184584]
DragGaussian is a 3D object drag-editing framework based on 3D Gaussian Splatting.
Our contributions include the introduction of a new task, the development of DragGaussian for interactive point-based 3D editing, and comprehensive validation of its effectiveness through qualitative and quantitative experiments.
arXiv Detail & Related papers (2024-05-09T14:34:05Z) - GoodDrag: Towards Good Practices for Drag Editing with Diffusion Models [31.708968272342315]
We introduce GoodDrag, a novel approach to improve the stability and image quality of drag editing.
GoodDrag introduces an AlDD framework that alternates between drag and denoising operations within the diffusion process.
We also propose an information-preserving motion supervision operation that maintains the original features of the starting point for precise manipulation and artifact reduction.
arXiv Detail & Related papers (2024-04-10T17:59:59Z) - FreeDrag: Feature Dragging for Reliable Point-based Image Editing [17.837570645460964]
We propose FreeDrag, a feature dragging methodology designed to free the burden on point tracking.
FreeDrag incorporates two key designs: template features with adaptive updating, and line search with backtracking.
Our approach significantly outperforms pre-existing methodologies, offering reliable point-based editing even in various complex scenarios.
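FreeDrag's "template features with adaptive updating" can be sketched as a simple running blend: the template feature that anchors the drag is gradually updated toward the feature observed at the current handle position, so it tracks appearance changes without drifting abruptly. This is only an illustrative sketch; the exponential-moving-average form and the `alpha` parameter are assumptions, not the paper's exact update rule.

```python
import numpy as np

def update_template(template, current_feature, alpha=0.1):
    """Blend the feature at the current handle position into the
    template so it adapts as the content is dragged.
    Illustrative EMA sketch; `alpha` is a hypothetical smoothing rate."""
    template = np.asarray(template, dtype=float)
    current_feature = np.asarray(current_feature, dtype=float)
    return (1.0 - alpha) * template + alpha * current_feature

# toy usage: the template moves a quarter of the way toward the new feature
t = update_template(np.zeros(4), np.ones(4), alpha=0.25)
```

A small `alpha` keeps the template stable across manipulation steps, while a larger one lets it follow fast appearance changes.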
arXiv Detail & Related papers (2023-07-10T16:37:46Z) - DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing [94.24479528298252]
DragGAN is an interactive point-based image editing framework that achieves impressive editing results with pixel-level precision.
By harnessing large-scale pretrained diffusion models, we greatly enhance the applicability of interactive point-based editing on both real and diffusion-generated images.
We present a challenging benchmark dataset called DragBench to evaluate the performance of interactive point-based image editing methods.
arXiv Detail & Related papers (2023-06-26T06:04:09Z) - Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold [79.94300820221996]
DragGAN is a new way of controlling generative adversarial networks (GANs).
DragGAN allows anyone to deform an image with precise control over where pixels go, thus manipulating the pose, shape, expression, and layout of diverse categories such as animals, cars, humans, landscapes, etc.
Both qualitative and quantitative comparisons demonstrate the advantage of DragGAN over prior approaches in the tasks of image manipulation and point tracking.
arXiv Detail & Related papers (2023-05-18T13:41:25Z) - DUT: Learning Video Stabilization by Simply Watching Unstable Videos [86.88635774560017]
We propose a Deep Unsupervised Trajectory-based stabilization framework (DUT).
DUT makes the first attempt to stabilize unstable videos by explicitly estimating and smoothing trajectories in an unsupervised deep learning manner.
Experiment results on public benchmarks show that DUT outperforms representative state-of-the-art methods both qualitatively and quantitatively.
arXiv Detail & Related papers (2020-11-30T06:48:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.