Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation
- URL: http://arxiv.org/abs/2401.14257v2
- Date: Sat, 27 Jan 2024 07:22:06 GMT
- Title: Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation
- Authors: Minglin Chen and Weihao Yuan and Yukun Wang and Zhe Sheng and Yisheng He and Zilong Dong and Liefeng Bo and Yulan Guo
- Abstract summary: We present a sketch-guided text-to-3D generation framework (namely, Sketch2NeRF) to add sketch control to 3D generation.
Our method achieves state-of-the-art performance in terms of sketch similarity and text alignment.
- Score: 37.93542778715304
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, text-to-3D approaches have achieved high-fidelity 3D content
generation from text descriptions. However, the generated objects are
stochastic and lack fine-grained control. Sketches offer an inexpensive way to
introduce such fine-grained control, yet achieving flexible control from
sketches is challenging because of their abstraction and ambiguity. In this
paper, we present a multi-view sketch-guided text-to-3D generation framework
(namely, Sketch2NeRF) that adds sketch control to 3D generation.
Specifically, our method leverages pretrained 2D diffusion models (e.g., Stable
Diffusion and ControlNet) to supervise the optimization of a 3D scene
represented by a neural radiance field (NeRF). We further propose a
synchronized generation-and-reconstruction method to optimize the NeRF
effectively. For evaluation, we collect two kinds of multi-view sketch
datasets. Experiments demonstrate that our method synthesizes 3D-consistent
content under fine-grained sketch control while remaining faithful to the text
prompts, and extensive results show that it achieves state-of-the-art
performance in terms of sketch similarity and text alignment.
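
The supervision described above is in the spirit of score distillation with a sketch-conditioned ControlNet as the 2D prior. Below is a minimal, hypothetical sketch of one such optimization step; `render`, `encode_latents`, and `controlnet_unet` are placeholder callables (not the paper's released code), and the paper's synchronized generation-and-reconstruction procedure is abstracted away:

```python
# Hypothetical single step of sketch-conditioned score distillation (SDS).
# All model components are passed in as placeholder callables.
import torch

def sds_step(optimizer, camera, sketch, text_emb, alphas_cumprod,
             render, encode_latents, controlnet_unet):
    image = render(camera)            # differentiable NeRF render of one view
    latents = encode_latents(image)   # VAE-encode the image to latent space

    # Sample a diffusion timestep and perturb the latents with noise.
    t = torch.randint(20, 980, (1,), device=latents.device)
    noise = torch.randn_like(latents)
    a_bar = alphas_cumprod[t].view(-1, 1, 1, 1)
    noisy = a_bar.sqrt() * latents + (1.0 - a_bar).sqrt() * noise

    # The ControlNet-augmented UNet predicts the noise, with the
    # view-aligned sketch injected as spatial conditioning.
    with torch.no_grad():
        noise_pred = controlnet_unet(noisy, t, text_emb, control=sketch)

    # SDS trick: the weighted residual acts as a gradient on the latents,
    # so no gradient flows through the diffusion model itself.
    grad = (1.0 - a_bar) * (noise_pred - noise)
    loss = (latents * grad.detach()).sum()

    optimizer.zero_grad()
    loss.backward()                   # backpropagates into the NeRF weights
    optimizer.step()
    return loss.item()
```

Repeating this step over many sampled views, each paired with its corresponding sketch, is what pushes the NeRF toward geometry consistent with all sketches at once.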
Related papers
- Sketch3D: Style-Consistent Guidance for Sketch-to-3D Generation [55.73399465968594]
This paper proposes Sketch3D, a novel generation paradigm that produces realistic 3D assets whose shape aligns with the input sketch and whose color matches the textual description.
Three strategies are designed to optimize the 3D Gaussians: structural optimization via a distribution transfer mechanism, color optimization with a straightforward MSE loss, and sketch-similarity optimization with a CLIP-based geometric similarity loss (a hedged sketch of the two loss terms appears after this list).
arXiv Detail & Related papers (2024-04-02T11:03:24Z)
- Sketch2Prototype: Rapid Conceptual Design Exploration and Prototyping with Generative AI [3.936104238911733]
Sketch2Prototype is an AI-based framework that transforms a hand-drawn sketch into a diverse set of 2D images and 3D prototypes.
We show that using text as an intermediate modality outperforms direct sketch-to-3D baselines for generating diverse and manufacturable 3D models.
arXiv Detail & Related papers (2024-03-26T02:12:17Z)
- Doodle Your 3D: From Abstract Freehand Sketches to Precise 3D Shapes [118.406721663244]
We introduce a novel part-level modelling and alignment framework that facilitates abstraction modelling and cross-modal correspondence.
Our approach seamlessly extends to sketch modelling by establishing correspondence between CLIPasso edgemaps and projected 3D part regions.
arXiv Detail & Related papers (2023-12-07T05:04:33Z)
- 3DStyle-Diffusion: Pursuing Fine-grained Text-driven 3D Stylization with 2D Diffusion Models [102.75875255071246]
3D content creation via text-driven stylization poses a fundamental challenge for the multimedia and graphics community.
We propose a new 3DStyle-Diffusion model that triggers fine-grained stylization of 3D meshes with additional controllable appearance and geometric guidance from 2D Diffusion models.
arXiv Detail & Related papers (2023-11-09T15:51:27Z)
- Control3D: Towards Controllable Text-to-3D Generation [107.81136630589263]
We present Control3D, a text-to-3D generation framework conditioned on an additional hand-drawn sketch.
A 2D-conditioned diffusion model (ControlNet) is remoulded to guide the learning of a 3D scene parameterized as a NeRF.
We exploit a pre-trained differentiable photo-to-sketch model to directly estimate the sketch of the image rendered from the synthetic 3D scene (a hedged sketch of this supervision loop also appears after the list).
arXiv Detail & Related papers (2023-11-09T15:50:32Z)
- Guide3D: Create 3D Avatars from Text and Image Guidance [55.71306021041785]
Guide3D is a text-and-image-guided generative model for 3D avatar generation based on diffusion models.
Our framework produces topologically and structurally correct geometry and high-resolution textures.
arXiv Detail & Related papers (2023-08-18T17:55:47Z)
- SKED: Sketch-guided Text-based 3D Editing [49.019881133348775]
We present SKED, a technique for editing 3D shapes represented by NeRFs.
Our technique utilizes as few as two guiding sketches from different views to alter an existing neural field.
We propose novel loss functions to generate the desired edits while preserving the density and radiance of the base instance.
arXiv Detail & Related papers (2023-03-19T18:40:44Z)
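
As referenced in the Sketch3D entry above, two of its three strategies are explicit loss terms; the structural distribution-transfer step updates the Gaussians directly and is omitted here. A minimal, hypothetical composite of those two terms, with `clip_encode` standing in for an image feature encoder (not Sketch3D's actual code):

```python
# Hypothetical composite of Sketch3D's color and sketch-similarity losses.
# The structural distribution-transfer mechanism is a separate update
# step on the Gaussians and is intentionally not modeled as a loss.
import torch
import torch.nn.functional as F

def sketch3d_loss(rendered_rgb, target_rgb, rendered_edges, sketch_edges,
                  clip_encode, w_color=1.0, w_geom=0.1):
    # Color optimization: straightforward MSE against the reference
    # appearance derived from the text description.
    color_loss = F.mse_loss(rendered_rgb, target_rgb)

    # Sketch-similarity optimization: cosine distance between CLIP-style
    # features of the rendered edge map and of the input sketch.
    f_render = F.normalize(clip_encode(rendered_edges), dim=-1)
    f_sketch = F.normalize(clip_encode(sketch_edges), dim=-1)
    geom_loss = 1.0 - (f_render * f_sketch).sum(dim=-1).mean()

    return w_color * color_loss + w_geom * geom_loss
```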
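
The Control3D entry describes a complementary supervision signal: a differentiable photo-to-sketch model estimates a sketch from each rendered view, which is compared against the user's hand-drawn sketch. A minimal, hypothetical version of that loop, where `render`, `photo_to_sketch`, and `controlnet_sds` are placeholder callables rather than the paper's code:

```python
# Hypothetical Control3D-style optimization step. `render` is a
# differentiable NeRF renderer, `photo_to_sketch` a differentiable
# photo-to-sketch network, and `controlnet_sds` a ControlNet-guided
# score-distillation term as sketched earlier in this page.
import torch
import torch.nn.functional as F

def control3d_step(optimizer, camera, user_sketch,
                   render, photo_to_sketch, controlnet_sds, w_sketch=1.0):
    image = render(camera)                 # differentiable render of one view
    est_sketch = photo_to_sketch(image)    # estimate the sketch of this view

    # Keep the estimated sketch close to the hand-drawn one...
    sketch_loss = F.mse_loss(est_sketch, user_sketch)

    # ...while sketch-conditioned distillation keeps the view plausible.
    sds_loss = controlnet_sds(image, user_sketch)

    loss = sds_loss + w_sketch * sketch_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```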