ED-NeRF: Efficient Text-Guided Editing of 3D Scene with Latent Space NeRF
- URL: http://arxiv.org/abs/2310.02712v2
- Date: Thu, 21 Mar 2024 07:20:35 GMT
- Title: ED-NeRF: Efficient Text-Guided Editing of 3D Scene with Latent Space NeRF
- Authors: Jangho Park, Gihyun Kwon, Jong Chul Ye
- Abstract summary: We present a novel 3D NeRF editing approach dubbed ED-NeRF.
We embed real-world scenes into the latent space of the latent diffusion model (LDM) through a unique refinement layer.
This approach enables us to obtain a NeRF backbone that is not only faster but also more amenable to editing.
- Score: 60.47731445033151
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, there has been a significant advancement in text-to-image diffusion models, leading to groundbreaking performance in 2D image generation. These advancements have been extended to 3D models, enabling the generation of novel 3D objects from textual descriptions. This has evolved into NeRF editing methods, which allow the manipulation of existing 3D objects through textual conditioning. However, existing NeRF editing techniques have faced limitations in their performance due to slow training speeds and the use of loss functions that do not adequately consider editing. To address this, here we present a novel 3D NeRF editing approach dubbed ED-NeRF by successfully embedding real-world scenes into the latent space of the latent diffusion model (LDM) through a unique refinement layer. This approach enables us to obtain a NeRF backbone that is not only faster but also more amenable to editing compared to traditional image space NeRF editing. Furthermore, we propose an improved loss function tailored for editing by migrating the delta denoising score (DDS) distillation loss, originally used in 2D image editing, to the three-dimensional domain. This novel loss function surpasses the well-known score distillation sampling (SDS) loss in terms of suitability for editing purposes. Our experimental results demonstrate that ED-NeRF achieves faster editing speed while producing improved output quality compared to state-of-the-art 3D editing models.
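The contrast the abstract draws between the SDS and DDS losses can be illustrated with a toy sketch. This is not the authors' implementation: `toy_eps` is a hypothetical stand-in for a frozen latent diffusion noise predictor, the tensor shapes are arbitrary, and the timestep weighting and the NeRF renderer are omitted. It only shows the general idea of DDS, which subtracts the score of a source latent and source prompt (noised with the same noise and timestep) from the score of the edited latent and target prompt, so the update direction vanishes when nothing is being edited, whereas the SDS direction generally does not.

```python
import numpy as np

def toy_eps(z_t, prompt_vec, t):
    """Hypothetical stand-in for a frozen diffusion noise predictor."""
    return np.tanh(z_t + t * prompt_vec)

def add_noise(z, eps, alpha_bar):
    """Standard forward diffusion: z_t = sqrt(a) * z + sqrt(1 - a) * eps."""
    return np.sqrt(alpha_bar) * z + np.sqrt(1.0 - alpha_bar) * eps

def sds_direction(z, y_tgt, t, alpha_bar, eps):
    """SDS update direction (timestep weighting omitted): predicted noise minus true noise."""
    z_t = add_noise(z, eps, alpha_bar)
    return toy_eps(z_t, y_tgt, t) - eps

def dds_direction(z, z_src, y_tgt, y_src, t, alpha_bar, eps):
    """DDS update direction: target-branch prediction minus source-branch prediction.

    Both branches share the same noise sample and timestep.
    """
    z_t = add_noise(z, eps, alpha_bar)
    z_src_t = add_noise(z_src, eps, alpha_bar)
    return toy_eps(z_t, y_tgt, t) - toy_eps(z_src_t, y_src, t)

rng = np.random.default_rng(0)
z_src = rng.normal(size=(4, 8, 8))   # toy "rendered latent" of the source scene
y_src = rng.normal(size=(4, 1, 1))   # toy source-prompt embedding
eps = rng.normal(size=z_src.shape)   # shared noise sample
t, alpha_bar = 0.5, 0.7              # arbitrary timestep and noise level

# No edit requested: target latent and prompt equal the source ones.
sds_no_edit = sds_direction(z_src, y_src, t, alpha_bar, eps)
dds_no_edit = dds_direction(z_src, z_src, y_src, y_src, t, alpha_bar, eps)
```

In this toy setting `dds_no_edit` is exactly zero while `sds_no_edit` is not, which is one way to see why a DDS-style loss is better suited to editing an existing scene than plain SDS.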
Related papers
- DreamCatalyst: Fast and High-Quality 3D Editing via Controlling Editability and Identity Preservation [17.930032337081673]
Score distillation sampling (SDS) has emerged as an effective framework in text-driven 3D editing tasks.
We propose DreamCatalyst, a novel framework that considers these sampling dynamics in the SDS framework.
Our method offers two modes: (1) a fast mode that edits scenes 23 times faster than current state-of-the-art NeRF editing methods, and (2) a high-quality mode that produces superior results about 8 times faster than these methods.
arXiv Detail & Related papers (2024-07-16T05:26:14Z)
- Preserving Identity with Variational Score for General-purpose 3D Editing [48.314327790451856]
Piva is a novel optimization-based method for editing images and 3D models based on diffusion models.
We pinpoint the limitations in 2D and 3D editing, which cause detail loss and oversaturation.
We propose an additional score distillation term that enforces identity preservation.
arXiv Detail & Related papers (2024-06-13T09:32:40Z)
- DragGaussian: Enabling Drag-style Manipulation on 3D Gaussian Representation [57.406031264184584]
DragGaussian is a 3D object drag-editing framework based on 3D Gaussian Splatting.
Our contributions include the introduction of a new task, the development of DragGaussian for interactive point-based 3D editing, and comprehensive validation of its effectiveness through qualitative and quantitative experiments.
arXiv Detail & Related papers (2024-05-09T14:34:05Z)
- DATENeRF: Depth-Aware Text-based Editing of NeRFs [49.08848777124736]
We introduce an inpainting approach that leverages the depth information of NeRF scenes to distribute 2D edits across different images.
Our results reveal that this methodology achieves more consistent, lifelike, and detailed edits than existing leading methods for text-driven NeRF scene editing.
arXiv Detail & Related papers (2024-04-06T06:48:16Z)
- Editing 3D Scenes via Text Prompts without Retraining [80.57814031701744]
DN2N is a text-driven editing method that allows for the direct acquisition of a NeRF model with universal editing capabilities.
Our method employs off-the-shelf text-based editing models of 2D images to modify the 3D scene images.
Our method achieves multiple editing types, including but not limited to appearance editing, weather transition, material changing, and style transfer.
arXiv Detail & Related papers (2023-09-10T02:31:50Z)
- RePaint-NeRF: NeRF Editting via Semantic Masks and Diffusion Models [36.236190350126826]
We propose a novel framework that can take RGB images as input and alter the 3D content in neural scenes.
Specifically, we semantically select the target object and a pre-trained diffusion model will guide the NeRF model to generate new 3D objects.
Experiment results show that our algorithm is effective for editing 3D objects in NeRF under different text prompts.
arXiv Detail & Related papers (2023-06-09T04:49:31Z)
- FaceDNeRF: Semantics-Driven Face Reconstruction, Prompt Editing and Relighting with Diffusion Models [67.17713009917095]
We propose Face Diffusion NeRF (FaceDNeRF), a new generative method to reconstruct high-quality Face NeRFs from single images.
With carefully designed illumination and identity preserving loss, FaceDNeRF offers users unparalleled control over the editing process.
arXiv Detail & Related papers (2023-06-01T15:14:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.