DATENeRF: Depth-Aware Text-based Editing of NeRFs
- URL: http://arxiv.org/abs/2404.04526v2
- Date: Thu, 1 Aug 2024 11:17:28 GMT
- Title: DATENeRF: Depth-Aware Text-based Editing of NeRFs
- Authors: Sara Rojas, Julien Philip, Kai Zhang, Sai Bi, Fujun Luan, Bernard Ghanem, Kalyan Sunkavalli
- Abstract summary: We introduce an inpainting approach that leverages the depth information of NeRF scenes to distribute 2D edits across different images.
Our results reveal that this methodology achieves more consistent, lifelike, and detailed edits than existing leading methods for text-driven NeRF scene editing.
- Score: 49.08848777124736
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Recent advancements in diffusion models have shown remarkable proficiency in editing 2D images based on text prompts. However, extending these techniques to edit scenes in Neural Radiance Fields (NeRF) is complex, as editing individual 2D frames can result in inconsistencies across multiple views. Our crucial insight is that a NeRF scene's geometry can serve as a bridge to integrate these 2D edits. Utilizing this geometry, we employ a depth-conditioned ControlNet to enhance the coherence of each 2D image modification. Moreover, we introduce an inpainting approach that leverages the depth information of NeRF scenes to distribute 2D edits across different images, ensuring robustness against errors and resampling challenges. Our results reveal that this methodology achieves more consistent, lifelike, and detailed edits than existing leading methods for text-driven NeRF scene editing.
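The key mechanism in the abstract, reprojecting a 2D edit through the NeRF's depth into other views, can be sketched compactly. The following is a minimal sketch rather than the paper's implementation: it assumes pinhole cameras with shared intrinsics `K`, z-axis depth maps rendered from the NeRF, and hypothetical function names.

```python
import numpy as np

def backproject(depth, K, c2w):
    """Lift every pixel of a view to 3D world points using its NeRF depth.
    depth: (H, W) z-depth; K: (3, 3) intrinsics; c2w: (4, 4) camera-to-world."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=0).reshape(3, -1)
    pts_cam = (np.linalg.inv(K) @ pix) * depth.reshape(1, -1)
    return c2w[:3, :3] @ pts_cam + c2w[:3, 3:4]        # (3, H*W) world points

def project_edit_mask(edit_mask, depth_src, K, c2w_src, w2c_dst, hw_dst):
    """Splat the edited pixels of a source view into a target view and return
    the binary mask that would seed depth-conditioned inpainting there."""
    H, W = hw_dst
    pts = backproject(depth_src, K, c2w_src)[:, edit_mask.reshape(-1)]
    cam = w2c_dst[:3, :3] @ pts + w2c_dst[:3, 3:4]     # into target camera frame
    uvw = K @ cam
    uv = uvw[:2] / np.clip(uvw[2:3], 1e-6, None)       # perspective divide
    u, v = np.round(uv).astype(int)
    mask = np.zeros((H, W), dtype=bool)
    keep = (cam[2] > 1e-6) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    mask[v[keep], u[keep]] = True                      # forward splat (leaves holes)
    return mask
```

In practice the splatted mask would be dilated or morphologically closed to fill forward-warping holes, occlusions would be rejected by comparing reprojected depth against the target view's own NeRF depth, and the masked region would then be filled by a depth-conditioned inpainting model, e.g. a ControlNet conditioned on the target view's rendered depth.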
Related papers
- A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models [117.77807994397784]
Image editing aims to modify a given synthetic or real image to meet specific user requirements.
Significant recent advances in this field are based on the development of text-to-image (T2I) diffusion models.
T2I-based image editing methods significantly enhance editing performance and offer a user-friendly interface for modifying content guided by multimodal inputs.
arXiv Detail & Related papers (2024-06-20T17:58:52Z)
- NeRF-Insert: 3D Local Editing with Multimodal Control Signals [97.91172669905578]
NeRF-Insert is a NeRF editing framework that allows users to make high-quality local edits with a flexible level of control.
We cast scene editing as an inpainting problem, which encourages the global structure of the scene to be preserved.
Our results show better visual quality and stronger consistency with the original NeRF.
arXiv Detail & Related papers (2024-04-30T02:04:49Z)
- ViCA-NeRF: View-Consistency-Aware 3D Editing of Neural Radiance Fields [45.020585071312475]
ViCA-NeRF is the first view-consistency-aware method for 3D editing with text instructions.
We exploit two sources of regularization that explicitly propagate the editing information across different views.
arXiv Detail & Related papers (2024-02-01T18:59:09Z)
- Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training [61.984277261016146]
We propose CustomNeRF, a model that accepts either a text description or a reference image as the editing prompt.
To edit only the foreground while preserving the background, we propose a Local-Global Iterative Editing (LGIE) training scheme that alternates between foreground-region editing and full-image editing.
To keep the edit consistent across views, we also design a class-guided regularization that exploits class priors within the generation model.
arXiv Detail & Related papers (2023-12-04T06:25:06Z)
- ED-NeRF: Efficient Text-Guided Editing of 3D Scene with Latent Space NeRF [60.47731445033151]
We present a novel 3D NeRF editing approach dubbed ED-NeRF.
We embed real-world scenes into the latent space of the latent diffusion model (LDM) through a unique refinement layer.
This approach enables us to obtain a NeRF backbone that is not only faster but also more amenable to editing.
arXiv Detail & Related papers (2023-10-04T10:28:38Z)
- SINE: Semantic-driven Image-based NeRF Editing with Prior-guided Editing Field [37.8162035179377]
We present a novel semantic-driven NeRF editing approach, which enables users to edit a neural radiance field with a single image.
To achieve this goal, we propose a prior-guided editing field to encode fine-grained geometric and texture editing in 3D space.
Our method achieves photo-realistic 3D editing using only a single edited image, pushing the boundary of semantic-driven editing in real-world 3D scenes.
arXiv Detail & Related papers (2023-03-23T13:58:11Z)
- Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions [109.51624993088687]
We propose a method for editing NeRF scenes with text instructions.
Given a NeRF of a scene and the collection of images used to reconstruct it, our method uses an image-conditioned diffusion model (InstructPix2Pix) to iteratively edit the input images while optimizing the underlying scene.
We demonstrate that our method can edit large-scale, real-world scenes and accomplishes more realistic, targeted edits than prior work (sketched below).
arXiv Detail & Related papers (2023-03-22T17:57:57Z)
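The iterative loop described in this entry is concrete enough to sketch. This is a minimal illustration under stated assumptions, not the authors' released code: `nerf`, `ip2p_edit`, and the `dataset` fields are hypothetical stand-ins for a NeRF trainer, an InstructPix2Pix wrapper, and the captured images.

```python
import random

def edit_nerf(nerf, dataset, instruction, ip2p_edit,
              steps=30_000, update_every=10):
    """Iterative dataset update: every few NeRF optimization steps, one
    training image is replaced by a diffusion edit of the current render,
    so the edit and the 3D reconstruction converge together."""
    for step in range(steps):
        if step % update_every == 0:
            i = random.randrange(len(dataset.images))
            render = nerf.render(dataset.cameras[i])
            # Edit the render, conditioned on the unedited capture and the
            # text instruction, then swap it into the training set.
            dataset.images[i] = ip2p_edit(render,
                                          original=dataset.originals[i],
                                          prompt=instruction)
        nerf.train_step(dataset)   # ordinary NeRF loss on the evolving set
    return nerf
```

Because each replaced image is re-rendered from the partially edited NeRF before being edited again, per-view inconsistencies are gradually averaged out by the 3D representation.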