DragScene: Interactive 3D Scene Editing with Single-view Drag Instructions
- URL: http://arxiv.org/abs/2412.13552v1
- Date: Wed, 18 Dec 2024 07:02:01 GMT
- Title: DragScene: Interactive 3D Scene Editing with Single-view Drag Instructions
- Authors: Chenghao Gu, Zhenzhe Li, Zhengqi Zhang, Yunpeng Bai, Shuzhao Xie, Zhi Wang
- Abstract summary: 3D editing methods have shown remarkable capability in modifying scenes based on various instructions.
Existing methods, however, struggle to achieve intuitive, localized editing.
We introduce DragScene, a framework that integrates drag-style editing with diverse 3D representations.
- Score: 9.31257776760014
- Abstract: 3D editing methods have shown remarkable capability in modifying scenes based on various instructions. However, existing methods struggle to achieve intuitive, localized editing, such as selectively making flowers blossom. Drag-style editing has shown exceptional capability to edit images through direct manipulation instead of ambiguous text commands. Nevertheless, extending drag-based editing to 3D scenes presents substantial challenges due to multi-view inconsistency. To this end, we introduce DragScene, a framework that integrates drag-style editing with diverse 3D representations. First, latent optimization is performed on a reference view to generate 2D edits based on user instructions. Subsequently, coarse 3D clues are reconstructed from the reference view using a point-based representation to capture the geometric details of the edits. The latent representation of the edited view is then mapped to these 3D clues, guiding the latent optimization of other views. This process ensures that edits are propagated seamlessly across multiple views, maintaining multi-view consistency. Finally, the target 3D scene is reconstructed from the edited multi-view images. Extensive experiments demonstrate that DragScene facilitates precise and flexible drag-style editing of 3D scenes, supporting broad applicability across diverse 3D representations.
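Read as pseudocode, the abstract's four stages describe a concrete data flow. The sketch below is a minimal illustration only, under assumed names and toy shapes: edit_reference_view, lift_to_points, and guide_other_view are hypothetical placeholders, not the authors' API, and the operations inside them are stand-ins for the real latent-optimization and projection steps.

```python
import numpy as np

# Sketch of the four-stage pipeline from the abstract. All function
# names, shapes, and operations are illustrative assumptions, not the
# authors' implementation.

def edit_reference_view(ref_latent, drag_src, drag_dst):
    """Stage 1: latent optimization on the reference view. A real
    system would minimize a drag objective that moves features at
    drag_src toward drag_dst; this toy version just copies them."""
    latent = ref_latent.copy()
    latent[drag_dst] = latent[drag_src]
    return latent

def lift_to_points(edited_latent, depth):
    """Stage 2: reconstruct coarse 3D clues from the edited reference
    view as a feature point cloud, using per-pixel depth."""
    h, w, c = edited_latent.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xyz = np.stack([xs, ys, depth], axis=-1).reshape(-1, 3)
    return xyz, edited_latent.reshape(-1, c)

def guide_other_view(xyz, feats, shape):
    """Stage 3: splat the 3D clues into another view's latent to guide
    its optimization. Real code would project through that view's
    camera pose; the projection here is faked with integer coords."""
    guidance = np.zeros(shape)
    uv = xyz[:, :2].astype(int) % np.array(shape[:2])
    guidance[uv[:, 1], uv[:, 0]] = feats
    return guidance

# Stage 4, reconstructing the target 3D scene from the edited views,
# would hand the images to an off-the-shelf NeRF/3DGS trainer.
ref = np.random.rand(16, 16, 4)                    # toy view latent
edited = edit_reference_view(ref, (2, 2), (5, 5))
xyz, feats = lift_to_points(edited, np.ones((16, 16)))
print(guide_other_view(xyz, feats, (16, 16, 4)).shape)  # (16, 16, 4)
```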
Related papers
- Drag Your Gaussian: Effective Drag-Based Editing with Score Distillation for 3D Gaussian Splatting [55.14822004410817]
We introduce DYG, an effective 3D drag-based editing method for 3D Gaussian Splatting.
It enables precise control over the extent of editing through the input of 3D masks and pairs of control points.
DYG leverages an implicit triplane representation to establish the geometric scaffold of the editing results.
arXiv Detail & Related papers (2025-01-30T18:51:54Z)
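The DYG summary above names two inputs: a 3D mask bounding the edit and pairs of control points. As a rough sketch only, the snippet below shows what such an edit request could look like; DragEdit3D and coarse_warp are hypothetical names, and the rigid translation stands in for DYG's actual score-distillation optimization on a triplane scaffold.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class DragEdit3D:
    """Hypothetical container for a DYG-style edit request: a 3D mask
    selecting which Gaussians may move, plus paired control points."""
    mask: np.ndarray        # (N,) bool, one flag per Gaussian
    src_points: np.ndarray  # (K, 3) drag handles in world space
    dst_points: np.ndarray  # (K, 3) drag targets in world space

def coarse_warp(positions, edit: DragEdit3D):
    """Toy stand-in for the editing step: rigidly translate the masked
    Gaussians by the mean handle-to-target offset."""
    offset = (edit.dst_points - edit.src_points).mean(axis=0)
    out = positions.copy()
    out[edit.mask] += offset
    return out

gaussians = np.random.rand(1000, 3)
edit = DragEdit3D(mask=gaussians[:, 0] < 0.5,
                  src_points=np.array([[0.2, 0.5, 0.5]]),
                  dst_points=np.array([[0.4, 0.5, 0.5]]))
print(coarse_warp(gaussians, edit).shape)  # (1000, 3)
```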
- PrEditor3D: Fast and Precise 3D Shape Editing [100.09112677669376]
We propose a training-free approach to 3D editing that edits a single shape within a few minutes.
The edited 3D mesh aligns well with the prompts, and remains identical for regions that are not intended to be altered.
arXiv Detail & Related papers (2024-12-09T15:44:47Z)
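The "remains identical for regions not intended to be altered" guarantee above is, in the simplest reading, a masked composite. The snippet below shows that generic construction; it is not PrEditor3D's actual mechanism, just the baseline way to make the property hold bit-for-bit.

```python
import numpy as np

def preserve_outside_mask(original, edited, edit_mask):
    """Generic masked blend: keep the edited result only inside the
    edit mask, so everything outside stays bit-identical."""
    return np.where(edit_mask[..., None], edited, original)

orig = np.random.rand(8, 8, 3)
edit = np.random.rand(8, 8, 3)
mask = np.zeros((8, 8), dtype=bool)
mask[2:5, 2:5] = True
out = preserve_outside_mask(orig, edit, mask)
assert np.array_equal(out[~mask], orig[~mask])  # untouched regions identical
```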
- MvDrag3D: Drag-based Creative 3D Editing via Multi-view Generation-Reconstruction Priors [19.950368071777092]
Existing 3D drag-based editing methods fall short in handling significant topology changes or generating new textures across diverse object categories.
We introduce MVDrag3D, a novel framework for more flexible and creative drag-based 3D editing.
We show that MVDrag3D provides a precise, generative, and flexible solution for 3D drag-based editing.
arXiv Detail & Related papers (2024-10-21T17:59:53Z)
- EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing [114.14164860467227]
We propose EditRoom, a framework capable of executing a variety of layout edits through natural language commands.
Specifically, EditRoom leverages Large Language Models (LLMs) for command planning and generates target scenes.
We have developed an automatic pipeline to augment existing 3D scene datasets and introduced EditRoom-DB, a large-scale dataset with 83k editing pairs.
arXiv Detail & Related papers (2024-10-03T17:42:24Z)
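"LLMs for command planning" in the EditRoom entry above suggests a free-form command being turned into a structured edit before any generation happens. A minimal sketch of that idea follows, with an assumed prompt format and a stubbed LLM call; none of this is EditRoom's actual interface, and the graph-diffusion step that would realize the new layout is omitted.

```python
import json

PLANNER_PROMPT = """You are a 3D room layout editor.
Scene objects: {objects}
User command: {command}
Respond with JSON: {{"op": "move|rotate|add|remove", "target": str, "params": dict}}"""

def plan_edit(command, scene, call_llm):
    """Hypothetical command-planning step: the LLM converts a natural
    language command into one structured edit operation."""
    prompt = PLANNER_PROMPT.format(objects=list(scene), command=command)
    return json.loads(call_llm(prompt))

# Stubbed LLM so the sketch runs without any external service.
fake_llm = lambda _: '{"op": "move", "target": "sofa", "params": {"dx": 1.0}}'
scene = {"sofa": (0, 0), "table": (2, 1)}
print(plan_edit("push the sofa right", scene, fake_llm))
```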
- Chat-Edit-3D: Interactive 3D Scene Editing via Text Prompts [76.73043724587679]
We propose a dialogue-based 3D scene editing approach, termed CE3D.
Its Hash-Atlas representation maps 3D scene views onto 2D atlas images, transferring the editing of 3D scenes into 2D space.
Results demonstrate that CE3D effectively integrates multiple visual models to achieve diverse editing visual effects.
arXiv Detail & Related papers (2024-07-09T13:24:42Z)
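The appeal of an atlas mapping, as in the CE3D entry above, is that one 2D edit propagates to every view through a shared lookup. The toy below uses a random but fixed per-view mapping in place of CE3D's learned Hash-Atlas; it only illustrates why editing the shared atlas yields view-consistent results.

```python
import numpy as np

def make_uv(h, w, seed):
    """Toy stand-in for a learned view->atlas mapping: each view pixel
    gets fixed coordinates into a shared 2D atlas. CE3D learns this
    mapping; here it is random but consistent per view."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, 64, size=(h, w, 2))

atlas = np.zeros((64, 64, 3))
views_uv = [make_uv(32, 32, s) for s in range(3)]

# Edit once in 2D atlas space...
atlas[10:20, 10:20] = (1.0, 0.0, 0.0)

# ...and every view picks up the same edit when resampled.
views = [atlas[uv[..., 0], uv[..., 1]] for uv in views_uv]
print(views[0].shape)  # (32, 32, 3)
```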
- View-Consistent 3D Editing with Gaussian Splatting [50.6460814430094]
View-consistent Editing (VcEdit) is a novel framework that seamlessly incorporates 3DGS into image editing processes.
By incorporating consistency modules into an iterative pattern, VcEdit proficiently resolves the issue of multi-view inconsistency.
arXiv Detail & Related papers (2024-03-18T15:22:09Z)
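The "consistency modules in an iterative pattern" phrasing in the VcEdit entry above describes a loop that repeatedly squeezes disagreement out of per-view edits. The crude sketch below substitutes a plain cross-view average for VcEdit's actual modules, which work through the shared 3DGS model; it only demonstrates the iterative shrinkage of inconsistency.

```python
import numpy as np

def consistency_step(view_edits):
    """Crude stand-in for a consistency module: pull each per-view
    edit halfway toward the cross-view mean."""
    mean = np.mean(view_edits, axis=0)
    return [0.5 * v + 0.5 * mean for v in view_edits]

# Three inconsistent per-view edits of the same region.
edits = [np.random.rand(16, 16, 3) for _ in range(3)]
for _ in range(10):  # the iterative pattern
    edits = consistency_step(edits)

spread = max(np.abs(v - edits[0]).max() for v in edits)
print(f"residual inconsistency: {spread:.6f}")  # shrinks toward 0
```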
- TIP-Editor: An Accurate 3D Editor Following Both Text-Prompts And Image-Prompts [119.84478647745658]
TIP-Editor is a 3D scene editing framework that accepts both text and image prompts, plus a 3D bounding box to specify the editing region.
Experiments have demonstrated that TIP-Editor conducts accurate editing following the text and image prompts in the specified bounding box region.
arXiv Detail & Related papers (2024-01-26T12:57:05Z)
- Free-Editor: Zero-shot Text-driven 3D Scene Editing [8.966537479017951]
Training a diffusion model specifically for 3D scene editing is challenging due to the scarcity of large-scale datasets.
We introduce a novel, training-free 3D scene editing technique called Free-Editor, which enables users to edit 3D scenes without the need for model retraining.
Our method effectively addresses the issue of multi-view style inconsistency found in state-of-the-art (SOTA) methods.
arXiv Detail & Related papers (2023-12-21T08:40:57Z)
- Plasticine3D: 3D Non-Rigid Editing with Text Guidance by Multi-View Embedding Optimization [21.8454418337306]
We propose Plasticine3D, a novel text-guided 3D editing pipeline that can perform non-rigid editing.
Our work divides the editing process into a geometry editing stage and a texture editing stage to achieve separate control of structure and appearance.
For the purpose of fine-grained control, we propose Embedding-Fusion (EF) to blend the original characteristics with the editing objectives in the embedding space.
arXiv Detail & Related papers (2023-12-15T09:01:54Z)
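The Embedding-Fusion (EF) idea in the Plasticine3D entry above, blending original characteristics with editing objectives in embedding space, admits a natural baseline reading: a convex combination of the two embeddings with a ratio controlling edit strength. The exact EF operator may differ; the sketch below shows only that baseline, with assumed embedding sizes.

```python
import numpy as np

def embedding_fusion(e_orig, e_edit, alpha):
    """Plausible reading of Embedding-Fusion: a convex blend of the
    source and editing-target embeddings, where alpha sets how far the
    result departs from the source. Illustrative only."""
    return (1.0 - alpha) * e_orig + alpha * e_edit

e_src = np.random.rand(768)  # e.g., a CLIP-sized text embedding of the source
e_tgt = np.random.rand(768)  # embedding of the editing objective
for alpha in (0.25, 0.5, 0.75):
    fused = embedding_fusion(e_src, e_tgt, alpha)
    print(alpha, np.linalg.norm(fused - e_src))  # distance grows with alpha
```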
This list is automatically generated from the titles and abstracts of the papers on this site.