DualNeRF: Text-Driven 3D Scene Editing via Dual-Field Representation
- URL: http://arxiv.org/abs/2502.16302v1
- Date: Sat, 22 Feb 2025 17:21:55 GMT
- Title: DualNeRF: Text-Driven 3D Scene Editing via Dual-Field Representation
- Authors: Yuxuan Xiong, Yue Shi, Yishun Dou, Bingbing Ni
- Abstract summary: Instruct-NeRF2NeRF (IN2N) brings the success of diffusion models to 3D scene editing through an "Iterative dataset update" (IDU) strategy. However, IN2N suffers from blurry backgrounds and a tendency to get trapped in local optima. We introduce DualNeRF to address these problems.
- Score: 38.362924589327356
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, denoising diffusion models have achieved promising results in 2D image generation and editing. Instruct-NeRF2NeRF (IN2N) brings this success to 3D scene editing through an "Iterative dataset update" (IDU) strategy. Although it achieves impressive results, IN2N suffers from blurry backgrounds and a tendency to get trapped in local optima. The first problem is caused by IN2N's lack of efficient guidance for background maintenance, while the second stems from the interaction between image editing and NeRF training during IDU. In this work, we introduce DualNeRF to address these problems. We propose a dual-field representation that preserves features of the original scene and uses them as additional guidance for background maintenance during IDU. Moreover, a simulated annealing strategy is embedded into IDU so that the model can escape local optima. A CLIP-based consistency indicator further improves editing quality by filtering out low-quality edits. Extensive experiments demonstrate that our method outperforms previous methods both qualitatively and quantitatively.
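To make the pipeline in the abstract concrete, below is a minimal, hypothetical Python sketch of an IDU-style loop with a simulated-annealing acceptance test and a CLIP-based consistency filter. The function names (`edit_view`, `clip_similarity`, `train_step`), the temperature schedule, and the acceptance rule are illustrative assumptions rather than the authors' implementation; the dual-field background guidance is only indicated by a comment.

```python
import math
import random

def idu_with_annealing(views, edit_view, clip_similarity, train_step,
                       num_rounds=10, t_start=1.0, t_end=0.05, clip_thresh=0.25):
    """Illustrative IDU-style loop (a sketch, not the authors' code).

    views:           list of training views (any image representation)
    edit_view:       callable(view) -> edited view from a 2D diffusion editor
    clip_similarity: callable(view) -> CLIP-based consistency score (higher is better)
    train_step:      callable(dataset) -> one NeRF optimization round on the dataset
    """
    dataset = list(views)
    for r in range(num_rounds):
        # Geometric cooling schedule: temperature decays from t_start to t_end.
        t = t_start * (t_end / t_start) ** (r / max(num_rounds - 1, 1))
        for i, view in enumerate(dataset):
            candidate = edit_view(view)
            score_new = clip_similarity(candidate)
            score_old = clip_similarity(dataset[i])
            # CLIP-based consistency indicator: discard clearly low-quality edits.
            if score_new < clip_thresh:
                continue
            # Simulated-annealing acceptance: always keep improvements, and
            # occasionally keep a worse edit (more often at high temperature)
            # to avoid getting stuck in a local optimum.
            if score_new >= score_old or random.random() < math.exp((score_new - score_old) / t):
                dataset[i] = candidate
        # One NeRF training round on the updated dataset; in DualNeRF the
        # preserved original-scene field would additionally guide the background here.
        train_step(dataset)
    return dataset
```

Called with stub callables (e.g., identity edits and random scores), the loop runs end to end; substituting a real diffusion editor, CLIP scorer, and NeRF trainer is left to the reader.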
Related papers
- RI3D: Few-Shot Gaussian Splatting With Repair and Inpainting Diffusion Priors [13.883695200241524]
RI3D is a novel approach that harnesses the power of diffusion models to reconstruct high-quality novel views given a sparse set of input images.
Our key contribution is separating the view synthesis process into two tasks of reconstructing visible regions and hallucinating missing regions.
We produce results with detailed textures in both visible and missing regions that outperform state-of-the-art approaches on a diverse set of scenes.
arXiv Detail & Related papers (2025-03-13T20:16:58Z)
- SyncNoise: Geometrically Consistent Noise Prediction for Text-based 3D Scene Editing [58.22339174221563]
We propose SyncNoise, a novel geometry-guided multi-view consistent noise editing approach for high-fidelity 3D scene editing.
SyncNoise synchronously edits multiple views with 2D diffusion models while enforcing multi-view noise predictions to be geometrically consistent.
Our method achieves high-quality 3D editing results respecting the textual instructions, especially in scenes with complex textures.
arXiv Detail & Related papers (2024-06-25T09:17:35Z)
- Preserving Identity with Variational Score for General-purpose 3D Editing [48.314327790451856]
Piva is a novel optimization-based method for editing images and 3D models based on diffusion models.
We pinpoint the limitations in 2D and 3D editing that cause detail loss and oversaturation.
We propose an additional score distillation term that enforces identity preservation.
arXiv Detail & Related papers (2024-06-13T09:32:40Z)
- Zero-Shot Video Editing through Adaptive Sliding Score Distillation [51.57440923362033]
This study proposes a novel paradigm of video-based score distillation, facilitating direct manipulation of original video content.
We propose an Adaptive Sliding Score Distillation strategy, which incorporates both global and local video guidance to reduce the impact of editing errors.
arXiv Detail & Related papers (2024-06-07T12:33:59Z)
- DATENeRF: Depth-Aware Text-based Editing of NeRFs [49.08848777124736]
We introduce an inpainting approach that leverages the depth information of NeRF scenes to distribute 2D edits across different images.
Our results reveal that this methodology achieves more consistent, lifelike, and detailed edits than existing leading methods for text-driven NeRF scene editing.
arXiv Detail & Related papers (2024-04-06T06:48:16Z)
- ViCA-NeRF: View-Consistency-Aware 3D Editing of Neural Radiance Fields [45.020585071312475]
ViCA-NeRF is the first view-consistency-aware method for 3D editing with text instructions.
We exploit two sources of regularization that explicitly propagate the editing information across different views.
arXiv Detail & Related papers (2024-02-01T18:59:09Z)
- LatentEditor: Text Driven Local Editing of 3D Scenes [8.966537479017951]
We introduce LatentEditor, a framework for precise and locally controlled editing of neural fields using text prompts.
We successfully embed real-world scenes into the latent space, resulting in a faster and more adaptable NeRF backbone for editing.
Our approach achieves faster editing speeds and superior output quality compared to existing 3D editing models.
arXiv Detail & Related papers (2023-12-14T19:38:06Z)
- ED-NeRF: Efficient Text-Guided Editing of 3D Scene with Latent Space NeRF [60.47731445033151]
We present a novel 3D NeRF editing approach dubbed ED-NeRF.
We embed real-world scenes into the latent space of the latent diffusion model (LDM) through a unique refinement layer.
This approach enables us to obtain a NeRF backbone that is not only faster but also more amenable to editing.
arXiv Detail & Related papers (2023-10-04T10:28:38Z)
- In-N-Out: Faithful 3D GAN Inversion with Volumetric Decomposition for Face Editing [28.790900756506833]
3D-aware GANs offer new capabilities for view synthesis while preserving the editing functionalities of their 2D counterparts.
GAN inversion is a crucial step that seeks the latent code to reconstruct input images or videos, subsequently enabling diverse editing tasks through manipulation of this latent code.
We address the difficulty of reconstructing out-of-distribution (OOD) objects by explicitly modeling them from the input in 3D-aware GANs.
arXiv Detail & Related papers (2023-02-09T18:59:56Z)