ViCA-NeRF: View-Consistency-Aware 3D Editing of Neural Radiance Fields
- URL: http://arxiv.org/abs/2402.00864v1
- Date: Thu, 1 Feb 2024 18:59:09 GMT
- Title: ViCA-NeRF: View-Consistency-Aware 3D Editing of Neural Radiance Fields
- Authors: Jiahua Dong and Yu-Xiong Wang
- Abstract summary: ViCA-NeRF is the first view-consistency-aware method for 3D editing with text instructions.
We exploit two sources of regularization that explicitly propagate the editing information across different views.
- Score: 45.020585071312475
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce ViCA-NeRF, the first view-consistency-aware method for 3D
editing with text instructions. In addition to the implicit neural radiance
field (NeRF) modeling, our key insight is to exploit two sources of
regularization that explicitly propagate the editing information across
different views, thus ensuring multi-view consistency. For geometric
regularization, we leverage the depth information derived from NeRF to
establish image correspondences between different views. For learned
regularization, we align the latent codes in the 2D diffusion model between
edited and unedited images, enabling us to edit key views and propagate the
update throughout the entire scene. Incorporating these two strategies, our
ViCA-NeRF operates in two stages. In the initial stage, we blend edits from
different views to create a preliminary 3D edit. This is followed by a second
stage of NeRF training, dedicated to further refining the scene's appearance.
Experimental results demonstrate that ViCA-NeRF provides more flexible,
efficient (3 times faster) editing with higher levels of consistency and
details, compared with the state of the art. Our code is publicly available.
Related papers
- A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models [117.77807994397784]
Image editing aims to modify a given synthetic or real image to meet users' specific requirements.
Recent significant advancement in this field is based on the development of text-to-image (T2I) diffusion models.
T2I-based image editing methods significantly enhance editing performance and offer a user-friendly interface for modifying content guided by multimodal inputs.
arXiv Detail & Related papers (2024-06-20T17:58:52Z)
- Unified Editing of Panorama, 3D Scenes, and Videos Through Disentangled Self-Attention Injection [60.47731445033151]
We propose a novel unified editing framework that combines the strengths of both approaches by utilizing only a basic 2D image text-to-image (T2I) diffusion model.
Experimental results confirm that our method enables editing across diverse modalities including 3D scenes, videos, and panorama images.
arXiv Detail & Related papers (2024-05-27T04:44:36Z)
- DATENeRF: Depth-Aware Text-based Editing of NeRFs [49.08848777124736]
We introduce an inpainting approach that leverages the depth information of NeRF scenes to distribute 2D edits across different images.
Our results reveal that this methodology achieves more consistent, lifelike, and detailed edits than existing leading methods for text-driven NeRF scene editing.
arXiv Detail & Related papers (2024-04-06T06:48:16Z)
- GaussCtrl: Multi-View Consistent Text-Driven 3D Gaussian Splatting Editing [38.948892064761914]
GaussCtrl is a text-driven method for editing a 3D scene reconstructed by 3D Gaussian Splatting (3DGS).
Our key contribution is multi-view consistent editing, which enables editing all images together instead of iteratively editing one image.
arXiv Detail & Related papers (2024-03-13T17:35:28Z)
- Consolidating Attention Features for Multi-view Image Editing [126.19731971010475]
We focus on spatial control-based geometric manipulations and introduce a method to consolidate the editing process across various views.
We introduce QNeRF, a neural radiance field trained on the internal query features of the edited images.
We refine the process through a progressive, iterative method that better consolidates queries across the diffusion timesteps.
arXiv Detail & Related papers (2024-02-22T18:50:18Z)
- LatentEditor: Text Driven Local Editing of 3D Scenes [8.966537479017951]
We introduce LatentEditor, a framework for precise and locally controlled editing of neural fields using text prompts.
We successfully embed real-world scenes into the latent space, resulting in a faster and more adaptable NeRF backbone for editing.
Our approach achieves faster editing speeds and superior output quality compared to existing 3D editing models.
arXiv Detail & Related papers (2023-12-14T19:38:06Z)
- SKED: Sketch-guided Text-based 3D Editing [49.019881133348775]
We present SKED, a technique for editing 3D shapes represented by NeRFs.
Our technique utilizes as few as two guiding sketches from different views to alter an existing neural field.
We propose novel loss functions to generate the desired edits while preserving the density and radiance of the base instance.
arXiv Detail & Related papers (2023-03-19T18:40:44Z)
- 3DDesigner: Towards Photorealistic 3D Object Generation and Editing with Text-guided Diffusion Models [71.25937799010407]
We extend text-guided diffusion models to achieve 3D-consistent generation.
We study 3D local editing and propose a two-step solution.
We extend our model to perform one-shot novel view synthesis.
arXiv Detail & Related papers (2022-11-25T13:50:00Z)