DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing
- URL: http://arxiv.org/abs/2404.18929v2
- Date: Mon, 22 Jul 2024 14:11:07 GMT
- Title: DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing
- Authors: Minghao Chen, Iro Laina, Andrea Vedaldi,
- Abstract summary: We consider the problem of editing 3D objects and scenes based on open-ended language instructions.
A common approach to this problem is to use a 2D image generator or editor to guide the 3D editing process.
This process is often inefficient due to the need for iterative updates of costly 3D representations.
- Score: 72.54566271694654
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the problem of editing 3D objects and scenes based on open-ended language instructions. A common approach to this problem is to use a 2D image generator or editor to guide the 3D editing process, obviating the need for 3D data. However, this process is often inefficient due to the need for iterative updates of costly 3D representations, such as neural radiance fields, either through individual view edits or score distillation sampling. A major disadvantage of this approach is the slow convergence caused by aggregating inconsistent information across views, as the guidance from 2D models is not multi-view consistent. We thus introduce the Direct Gaussian Editor (DGE), a method that addresses these issues in two stages. First, we modify a given high-quality image editor like InstructPix2Pix to be multi-view consistent. To do so, we propose a training-free approach that integrates cues from the 3D geometry of the underlying scene. Second, given a multi-view consistent edited sequence of images, we directly and efficiently optimize the 3D representation, which is based on 3D Gaussian Splatting. Because it avoids incremental and iterative edits, DGE is significantly more accurate and efficient than existing approaches and offers additional benefits, such as enabling selective editing of parts of the scene.
Related papers
- 3D Gaussian Editing with A Single Image [19.662680524312027]
We introduce a novel single-image-driven 3D scene editing approach based on 3D Gaussian Splatting.
Our method learns to optimize the 3D Gaussians to align with an edited version of the image rendered from a user-specified viewpoint.
Experiments show the effectiveness of our method in handling geometric details, long-range, and non-rigid deformation.
arXiv Detail & Related papers (2024-08-14T13:17:42Z) - TIGER: Text-Instructed 3D Gaussian Retrieval and Coherent Editing [12.50147114409895]
This paper proposes a systematic approach, namely TIGER, for coherent text-instructed 3D Gaussian retrieval and editing.
To overcome the over-smoothing and inconsistency issues in editing, we propose a Coherent Score Distillation (CSD) that aggregates a 2D image editing diffusion model and a multi-view diffusion model for score distillation.
arXiv Detail & Related papers (2024-05-23T11:37:17Z) - DragGaussian: Enabling Drag-style Manipulation on 3D Gaussian Representation [57.406031264184584]
DragGaussian is a 3D object drag-editing framework based on 3D Gaussian Splatting.
Our contributions include the introduction of a new task, the development of DragGaussian for interactive point-based 3D editing, and comprehensive validation of its effectiveness through qualitative and quantitative experiments.
arXiv Detail & Related papers (2024-05-09T14:34:05Z) - View-Consistent 3D Editing with Gaussian Splatting [50.6460814430094]
View-consistent Editing (VcEdit) is a novel framework that seamlessly incorporates 3DGS into image editing processes.
By incorporating consistency modules into an iterative pattern, VcEdit proficiently resolves the issue of multi-view inconsistency.
arXiv Detail & Related papers (2024-03-18T15:22:09Z) - GaussCtrl: Multi-View Consistent Text-Driven 3D Gaussian Splatting Editing [38.948892064761914]
GaussCtrl is a text-driven method to edit a 3D scene reconstructed by the 3D Gaussian Splatting (3DGS)
Our key contribution is multi-view consistent editing, which enables editing all images together instead of iteratively editing one image.
arXiv Detail & Related papers (2024-03-13T17:35:28Z) - Efficient-NeRF2NeRF: Streamlining Text-Driven 3D Editing with Multiview
Correspondence-Enhanced Diffusion Models [83.97844535389073]
A major obstacle hindering the widespread adoption of 3D content editing is its time-intensive processing.
We propose that by incorporating correspondence regularization into diffusion models, the process of 3D editing can be significantly accelerated.
In most scenarios, our proposed technique brings a 10$times$ speed-up compared to the baseline method and completes the editing of a 3D scene in 2 minutes with comparable quality.
arXiv Detail & Related papers (2023-12-13T23:27:17Z) - Learning Naturally Aggregated Appearance for Efficient 3D Editing [94.47518916521065]
We propose to replace the color field with an explicit 2D appearance aggregation, also called canonical image.
To avoid the distortion effect and facilitate convenient editing, we complement the canonical image with a projection field that maps 3D points onto 2D pixels for texture lookup.
Our representation, dubbed AGAP, well supports various ways of 3D editing (e.g., stylization, interactive drawing, and content extraction) with no need of re-optimization.
arXiv Detail & Related papers (2023-12-11T18:59:31Z) - GaussianEditor: Swift and Controllable 3D Editing with Gaussian
Splatting [66.08674785436612]
3D editing plays a crucial role in many areas such as gaming and virtual reality.
Traditional 3D editing methods, which rely on representations like meshes and point clouds, often fall short in realistically depicting complex scenes.
Our paper presents GaussianEditor, an innovative and efficient 3D editing algorithm based on Gaussian Splatting (GS), a novel 3D representation.
arXiv Detail & Related papers (2023-11-24T14:46:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.