FreeDrag: Feature Dragging for Reliable Point-based Image Editing
- URL: http://arxiv.org/abs/2307.04684v4
- Date: Sat, 3 Aug 2024 15:49:01 GMT
- Title: FreeDrag: Feature Dragging for Reliable Point-based Image Editing
- Authors: Pengyang Ling, Lin Chen, Pan Zhang, Huaian Chen, Yi Jin, Jinjin Zheng
- Abstract summary: We propose FreeDrag, a feature dragging methodology designed to free point-based editing from the burden of point tracking.
FreeDrag incorporates two key designs: adaptive updating of the template feature and line search with backtracking.
Our approach significantly outperforms pre-existing methodologies, offering reliable point-based editing even in various complex scenarios.
- Score: 16.833998026980087
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To serve the intricate and varied demands of image editing, precise and flexible manipulation of image content is indispensable. Drag-based editing methods have recently achieved impressive performance. However, these methods predominantly center on point dragging, which brings two noteworthy drawbacks: "miss tracking", where it is difficult to accurately track the predetermined handle points, and "ambiguous tracking", where tracked points may land in wrong regions that closely resemble the handle points. To address these issues, we propose FreeDrag, a feature dragging methodology designed to free point-based editing from the burden of point tracking. FreeDrag incorporates two key designs: adaptive updating of the template feature and line search with backtracking. The former improves stability against drastic content change by carefully controlling the scale of feature updating after each drag step, while the latter alleviates misguidance from similar points by restricting the search to a line. Together, these two techniques yield more stable semantic dragging with higher efficiency. Comprehensive experimental results substantiate that our approach significantly outperforms pre-existing methodologies, offering reliable point-based editing even in various complex scenarios.
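As a concrete illustration of the two designs, here is a minimal PyTorch sketch. It assumes a (1, C, H, W) feature map from the editing backbone; the blending rule in `update_template`, the `quality` gate, and the step and tolerance constants are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def sample_feature(feat_map, point):
    """Bilinearly sample a C-dim feature vector at a sub-pixel (x, y) point.
    feat_map: (1, C, H, W) float tensor; point: (2,) float tensor in pixels."""
    _, _, H, W = feat_map.shape
    gx = 2 * point[0] / (W - 1) - 1        # normalize to [-1, 1] for grid_sample
    gy = 2 * point[1] / (H - 1) - 1
    grid = torch.stack([gx, gy]).view(1, 1, 1, 2)
    return F.grid_sample(feat_map, grid, align_corners=True).view(-1)

def update_template(template, current, quality, lam=0.8):
    """Adaptive template update (assumed form): blend toward the current
    feature only in proportion to how well the last drag step matched the
    template, so a drastic content change barely contaminates the template."""
    return lam * quality * current + (1 - lam * quality) * template

def line_search_with_backtracking(feat_map, template, handle, target,
                                  step=2.0, tol=1.0):
    """Search the next intermediate point only on the handle->target line,
    halving the step (backtracking) while the sampled feature drifts too
    far from the template feature."""
    direction = (target - handle) / (target - handle).norm().clamp(min=1e-6)
    while step >= 0.25:
        candidate = handle + step * direction
        if (sample_feature(feat_map, candidate) - template).norm() <= tol:
            return candidate               # feature still matches: accept the move
        step *= 0.5                        # backtrack: try a smaller move
    return handle                          # no acceptable move this iteration
```

In an editing loop, one would drag toward the point returned by `line_search_with_backtracking`, then refresh the template with `update_template` using a quality score derived from the achieved feature match.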
Related papers
- FlowDrag: 3D-aware Drag-based Image Editing with Mesh-guided Deformation Vector Flow Fields [20.793887576117527]
We propose FlowDrag, which leverages geometric information for more accurate and coherent transformations.
Our approach constructs a 3D mesh from the image, using an energy function to guide mesh deformation based on user-defined drag points.
The resulting mesh displacements are projected into 2D and incorporated into a UNet denoising process, enabling precise handle-to-target point alignment.
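The projection step mentioned in the summary can be illustrated in a few lines. This is a sketch under simplifying assumptions (a pinhole camera with illustrative intrinsics `K`, vertices already deformed by the energy term), not FlowDrag's actual pipeline:

```python
import numpy as np

def project(vertices, K):
    """Pinhole projection of (N, 3) camera-space vertices to (N, 2) pixels."""
    uvw = vertices @ K.T                   # (N, 3) homogeneous image coords
    return uvw[:, :2] / uvw[:, 2:3]        # perspective divide

def mesh_displacements_to_2d(verts_before, verts_after, K):
    """Project each vertex before and after the deformation and take the
    pixel-space difference, yielding a sparse flow field that could be
    rasterized and fed into the UNet denoising process."""
    return project(verts_after, K) - project(verts_before, K)

# Illustrative intrinsics and a toy two-vertex mesh (assumptions).
K = np.array([[500.0, 0.0, 256.0],
              [0.0, 500.0, 256.0],
              [0.0, 0.0, 1.0]])
before = np.array([[0.1, 0.2, 2.0], [0.0, 0.0, 2.5]])
after = before + np.array([[0.05, 0.0, 0.0], [0.0, 0.05, 0.0]])
print(mesh_displacements_to_2d(before, after, K))
```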
arXiv Detail & Related papers (2025-07-11T03:18:52Z)
- Drag Your Gaussian: Effective Drag-Based Editing with Score Distillation for 3D Gaussian Splatting [55.14822004410817]
We introduce DYG, an effective 3D drag-based editing method for 3D Gaussian Splatting.
It enables precise control over the extent of editing through the input of 3D masks and pairs of control points.
DYG integrates the strengths of the implicit triplane representation to establish the geometric scaffold of the editing results.
arXiv Detail & Related papers (2025-01-30T18:51:54Z)
- Event-Based Tracking Any Point with Motion-Augmented Temporal Consistency [58.719310295870024]
This paper presents an event-based framework for tracking any point.
It tackles the challenges posed by spatial sparsity and motion sensitivity in events.
It achieves 150% faster processing with competitive model parameters.
arXiv Detail & Related papers (2024-12-02T09:13:29Z)
- DATAP-SfM: Dynamic-Aware Tracking Any Point for Robust Structure from Motion in the Wild [85.03973683867797]
This paper proposes a concise, elegant, and robust pipeline to estimate smooth camera trajectories and obtain dense point clouds for casual videos in the wild.
We show that the proposed method achieves state-of-the-art performance in terms of camera pose estimation even in complex dynamic challenge scenes.
arXiv Detail & Related papers (2024-11-20T13:01:16Z)
- AdaptiveDrag: Semantic-Driven Dragging on Diffusion-Based Image Editing [14.543341303789445]
We propose a novel mask-free point-based image editing method, AdaptiveDrag, which generates images that better align with user intent.
To ensure a comprehensive connection between the input image and the drag process, we have developed a semantic-driven optimization.
Building on these effective designs, our method delivers superior generation results using only the single input image and the handle-target point pairs.
arXiv Detail & Related papers (2024-10-16T15:59:02Z)
- Localize, Understand, Collaborate: Semantic-Aware Dragging via Intention Reasoner [8.310002338000954]
Current methods typically model this problem as automatically learning "how to drag" through point dragging.
We propose LucidDrag, which shifts the focus from "how to drag" to a "what-then-how" paradigm.
arXiv Detail & Related papers (2024-06-01T13:10:43Z)
- GoodDrag: Towards Good Practices for Drag Editing with Diffusion Models [31.708968272342315]
We introduce GoodDrag, a novel approach to improve the stability and image quality of drag editing.
GoodDrag introduces an AlDD framework that alternates between drag and denoising operations within the diffusion process.
We also propose an information-preserving motion supervision operation that maintains the original features of the starting point for precise manipulation and artifact reduction.
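The AlDD alternation pattern is easy to sketch. The two step functions below are toy stand-ins (the real drag step optimizes a motion-supervision loss on UNet features, and the real denoising step runs the diffusion model); only the interleaving structure is taken from the summary:

```python
import torch

def drag_step(latent, target, lr=0.1):
    """Toy stand-in for a drag step: one gradient step pulling the latent
    toward a target (the real loss moves handle-point features instead)."""
    latent = latent.detach().requires_grad_(True)
    loss = (latent - target).square().mean()
    loss.backward()
    return (latent - lr * latent.grad).detach()

def denoise_step(latent, noise_scale=0.02):
    """Toy stand-in for one diffusion denoising step."""
    return latent - noise_scale * torch.randn_like(latent)

def aldd_loop(latent, target, n_outer=4, drags_per_denoise=3):
    """AlDD pattern as described in the summary: interleave a few drag
    steps with a denoising step inside the diffusion process, instead of
    finishing all drag optimization before any denoising."""
    for _ in range(n_outer):
        for _ in range(drags_per_denoise):
            latent = drag_step(latent, target)
        latent = denoise_step(latent)
    return latent

# e.g. aldd_loop(torch.randn(1, 4, 64, 64), torch.zeros(1, 4, 64, 64))
```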
arXiv Detail & Related papers (2024-04-10T17:59:59Z)
- StableDrag: Stable Dragging for Point-based Image Editing [24.924112878074336]
Point-based image editing has attracted remarkable attention since the emergence of DragGAN.
Recently, DragDiffusion further pushes forward the generative quality via adapting this dragging technique to diffusion models.
We build a stable and precise drag-based editing framework, coined StableDrag, by designing a discriminative point tracking method and a confidence-based latent enhancement strategy for motion supervision.
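A sketch of what such a discriminative tracking step might look like in PyTorch, under assumptions (cosine similarity as the discriminative score, a fixed search window); the peak score doubles as a confidence value:

```python
import torch
import torch.nn.functional as F

def track_point(feat_map, template, prev, radius=5):
    """Score every location in a window around the previous handle point by
    cosine similarity to the (C,)-dim template feature; return the argmax
    position and the peak score as a confidence value.
    feat_map: (1, C, H, W); prev: (x, y) of the last tracked position."""
    _, C, H, W = feat_map.shape
    x0, y0 = int(prev[0]), int(prev[1])
    xs = slice(max(x0 - radius, 0), min(x0 + radius + 1, W))
    ys = slice(max(y0 - radius, 0), min(y0 + radius + 1, H))
    patch = feat_map[0, :, ys, xs]                      # (C, h, w) window
    sim = F.cosine_similarity(patch, template.view(C, 1, 1), dim=0)
    idx = sim.argmax()
    dy, dx = divmod(idx.item(), sim.shape[1])
    confidence = sim.flatten()[idx].item()
    return torch.tensor([xs.start + dx, ys.start + dy]), confidence
```

A confidence below some threshold could then trigger the summary's confidence-based strategy, e.g. supervising motion with the original template feature rather than the current, less reliable one.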
arXiv Detail & Related papers (2024-03-07T12:11:02Z)
- DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing [94.24479528298252]
DragGAN is an interactive point-based image editing framework that achieves impressive editing results with pixel-level precision.
By harnessing large-scale pretrained diffusion models, we greatly enhance the applicability of interactive point-based editing on both real and diffusion-generated images.
We present a challenging benchmark dataset called DragBench to evaluate the performance of interactive point-based image editing methods.
arXiv Detail & Related papers (2023-06-26T06:04:09Z)
- Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold [79.94300820221996]
DragGAN is a new way of controlling generative adversarial networks (GANs).
DragGAN allows anyone to deform an image with precise control over where pixels go, thus manipulating the pose, shape, expression, and layout of diverse categories such as animals, cars, humans, landscapes, etc.
Both qualitative and quantitative comparisons demonstrate the advantage of DragGAN over prior approaches in the tasks of image manipulation and point tracking.
arXiv Detail & Related papers (2023-05-18T13:41:25Z)
- Video Annotation for Visual Tracking via Selection and Refinement [74.08109740917122]
We present a new framework to facilitate bounding box annotations for video sequences.
A temporal assessment network is proposed to capture the temporal coherence of target locations.
A visual-geometry refinement network is also designed to further enhance the selected tracking results.
arXiv Detail & Related papers (2021-08-09T05:56:47Z)
- SOLD2: Self-supervised Occlusion-aware Line Description and Detection [95.8719432775724]
We introduce the first joint detection and description of line segments in a single deep network.
Our method does not require any annotated line labels and can therefore generalize to any dataset.
We evaluate our approach against previous line detection and description methods on several multi-view datasets.
arXiv Detail & Related papers (2021-04-07T19:27:17Z)
- Tracklets Predicting Based Adaptive Graph Tracking [51.352829280902114]
We present an accurate and end-to-end learning framework for multi-object tracking, namely TPAGT.
It re-extracts the features of the tracklets in the current frame based on motion prediction, which is key to solving the problem of inconsistent features.
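The re-extraction idea can be sketched as motion-compensated sampling; a minimal illustration assuming constant-velocity motion and bilinear sampling (the paper's actual prediction and extraction are learned):

```python
import torch
import torch.nn.functional as F

def reextract_tracklet_features(feat_map, prev_pos, velocity):
    """Predict where each tracklet lands in the current frame, then re-sample
    its appearance feature from the current frame's feature map there, so
    matching uses fresh rather than stale features.
    feat_map: (1, C, H, W); prev_pos, velocity: (N, 2) in pixel coords."""
    _, C, H, W = feat_map.shape
    pred = prev_pos + velocity                         # motion prediction
    gx = 2 * pred[:, 0] / (W - 1) - 1                  # normalize to [-1, 1]
    gy = 2 * pred[:, 1] / (H - 1) - 1
    grid = torch.stack([gx, gy], dim=-1).view(1, -1, 1, 2)
    feats = F.grid_sample(feat_map, grid, align_corners=True)  # (1, C, N, 1)
    return feats.view(C, -1).T                         # (N, C) per-tracklet features
```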
arXiv Detail & Related papers (2020-10-18T16:16:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.