DreamEditor: Text-Driven 3D Scene Editing with Neural Fields
- URL: http://arxiv.org/abs/2306.13455v3
- Date: Thu, 7 Sep 2023 13:01:27 GMT
- Title: DreamEditor: Text-Driven 3D Scene Editing with Neural Fields
- Authors: Jingyu Zhuang, Chen Wang, Lingjie Liu, Liang Lin, Guanbin Li
- Abstract summary: We propose a novel framework that enables users to edit neural fields using text prompts.
DreamEditor generates highly realistic textures and geometry, significantly surpassing previous works in both quantitative and qualitative evaluations.
- Score: 115.07896366760876
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural fields have achieved impressive advancements in view synthesis and
scene reconstruction. However, editing these neural fields remains challenging
due to the implicit encoding of geometry and texture information. In this
paper, we propose DreamEditor, a novel framework that enables users to perform
controlled editing of neural fields using text prompts. By representing scenes
as mesh-based neural fields, DreamEditor allows localized editing within
specific regions. DreamEditor utilizes the text encoder of a pretrained
text-to-image diffusion model to automatically identify the regions to be
edited based on the semantics of the text prompts. Subsequently, DreamEditor
optimizes the editing region and aligns its geometry and texture with the text
prompts through score distillation sampling [29]. Extensive experiments have
demonstrated that DreamEditor can accurately edit neural fields of real-world
scenes according to the given text prompts while ensuring consistency in
irrelevant areas. DreamEditor generates highly realistic textures and geometry,
significantly surpassing previous works in both quantitative and qualitative
evaluations.
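The abstract's two steps, restricting the edit to a region and optimizing that region with score distillation sampling, can be illustrated with a toy sketch. This is not the authors' implementation. It assumes an identity "renderer" (the parameters are the image), a hypothetical `toy_noise_predictor` standing in for the pretrained text-conditioned diffusion model, a hand-written `edit_mask` in place of the automatically located region, and a weighting of 1 - alpha_bar, which is only one common choice. It shows the general shape of the update, weight * (predicted noise - sampled noise), applied only where the mask is 1.

```python
import math
import random


def toy_noise_predictor(x_t, alpha_bar, target):
    """Hypothetical stand-in for a pretrained diffusion model.

    It predicts the noise that would explain x_t if the clean signal
    were `target` (the toy equivalent of "what the text prompt asks for").
    """
    scale = math.sqrt(alpha_bar)
    sigma = math.sqrt(1.0 - alpha_bar)
    return [(xt - scale * tg) / sigma for xt, tg in zip(x_t, target)]


def masked_sds_step(theta, edit_mask, target, rng, lr=0.1):
    """One SDS-style update that only touches entries where edit_mask is 1."""
    alpha_bar = rng.uniform(0.2, 0.8)            # random noise level
    eps = [rng.gauss(0.0, 1.0) for _ in theta]   # sampled noise
    scale = math.sqrt(alpha_bar)
    sigma = math.sqrt(1.0 - alpha_bar)
    # Identity "renderer": the rendered signal x equals theta, so dx/dtheta = 1.
    x_t = [scale * x + sigma * e for x, e in zip(theta, eps)]
    eps_pred = toy_noise_predictor(x_t, alpha_bar, target)
    weight = 1.0 - alpha_bar                     # assumed weighting w(t)
    return [
        x - lr * m * weight * (ep - e)
        for x, m, ep, e in zip(theta, edit_mask, eps_pred, eps)
    ]


def run_demo(steps=500, seed=0):
    rng = random.Random(seed)
    theta = [0.0] * 6                             # original "scene"
    edit_mask = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]    # region to edit
    target = [1.0] * 6                            # what the toy model prefers
    for _ in range(steps):
        theta = masked_sds_step(theta, edit_mask, target, rng)
    return theta


if __name__ == "__main__":
    print(run_demo())
```

In this toy setting the masked entries drift toward the target while the unmasked entries keep their original values, which mirrors the abstract's claim of consistency in irrelevant areas.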
Related papers
- GSEditPro: 3D Gaussian Splatting Editing with Attention-based Progressive Localization [11.170354299559998]
We propose GSEditPro, a novel 3D scene editing framework which allows users to perform various creative and precise editing using text prompts only.
We introduce an attention-based progressive localization module to add semantic labels to each Gaussian during rendering.
This enables precise localization on editing areas by classifying Gaussians based on their relevance to the editing prompts derived from cross-attention layers of the T2I model.
arXiv Detail & Related papers (2024-11-15T08:25:14Z)
- TIP-Editor: An Accurate 3D Editor Following Both Text-Prompts And Image-Prompts [119.84478647745658]
TIP-Editor is a 3D scene editing framework that accepts both text and image prompts and a 3D bounding box to specify the editing region.
Experiments have demonstrated that TIP-Editor conducts accurate editing following the text and image prompts in the specified bounding box region.
arXiv Detail & Related papers (2024-01-26T12:57:05Z)
- LatentEditor: Text Driven Local Editing of 3D Scenes [8.966537479017951]
We introduce LatentEditor, a framework for precise and locally controlled editing of neural fields using text prompts.
We successfully embed real-world scenes into the latent space, resulting in a faster and more adaptable NeRF backbone for editing.
Our approach achieves faster editing speeds and superior output quality compared to existing 3D editing models.
arXiv Detail & Related papers (2023-12-14T19:38:06Z)
- Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training [61.984277261016146]
We propose a CustomNeRF model that unifies a text description or a reference image as the editing prompt.
To tackle the first challenge, we propose a Local-Global Iterative Editing (LGIE) training scheme that alternates between foreground region editing and full-image editing.
For the second challenge, we also design a class-guided regularization that exploits class priors within the generation model to alleviate the inconsistency problem.
arXiv Detail & Related papers (2023-12-04T06:25:06Z)
- Text-Driven Image Editing via Learnable Regions [74.45313434129005]
We introduce a method for region-based image editing driven by textual prompts, without the need for user-provided masks or sketches.
We show that this simple approach enables flexible editing that is compatible with current image generation models.
Experiments demonstrate the competitive performance of our method in manipulating images with high fidelity and realism that correspond to the provided language descriptions.
arXiv Detail & Related papers (2023-11-28T02:27:31Z)
- SINE: Semantic-driven Image-based NeRF Editing with Prior-guided Editing Field [37.8162035179377]
We present a novel semantic-driven NeRF editing approach, which enables users to edit a neural radiance field with a single image.
To achieve this goal, we propose a prior-guided editing field to encode fine-grained geometric and texture editing in 3D space.
Our method achieves photo-realistic 3D editing using only a single edited image, pushing the bound of semantic-driven editing in 3D real-world scenes.
arXiv Detail & Related papers (2023-03-23T13:58:11Z)
- Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting [53.708523312636096]
We present Imagen Editor, a cascaded diffusion model built by fine-tuning Imagen on text-guided image inpainting.
Imagen Editor's edits are faithful to the text prompts, which is accomplished by using object detectors to propose inpainting masks during training.
To improve qualitative and quantitative evaluation, we introduce EditBench, a systematic benchmark for text-guided image inpainting.
arXiv Detail & Related papers (2022-12-13T21:25:11Z)
- Exploring Stroke-Level Modifications for Scene Text Editing [86.33216648792964]
Scene text editing (STE) aims to replace text with the desired one while preserving background and styles of the original text.
Previous methods of editing the whole image have to learn different translation rules of background and text regions simultaneously.
We propose a novel network by MOdifying Scene Text image at strokE Level (MOSTEL).
arXiv Detail & Related papers (2022-12-05T02:10:59Z)
- NeuMesh: Learning Disentangled Neural Mesh-based Implicit Field for Geometry and Texture Editing [39.71252429542249]
We present a novel mesh-based representation by encoding the neural implicit field with disentangled geometry and texture codes on mesh vertices.
We develop several techniques including learnable sign indicators to magnify spatial distinguishability of mesh-based representation.
Experiments and editing examples on both real and synthetic data demonstrate the superiority of our method on representation quality and editing ability.
arXiv Detail & Related papers (2022-07-25T05:30:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.