Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures
- URL: http://arxiv.org/abs/2211.07600v1
- Date: Mon, 14 Nov 2022 18:25:24 GMT
- Title: Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures
- Authors: Gal Metzer and Elad Richardson and Or Patashnik and Raja Giryes and Daniel Cohen-Or
- Abstract summary: We adapt the score distillation to the publicly available, and computationally efficient, Latent Diffusion Models.
Latent Diffusion Models apply the entire diffusion process in a compact latent space of a pretrained autoencoder.
We show that latent score distillation can be successfully applied directly on 3D meshes.
- Score: 72.44361273600207
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text-guided image generation has progressed rapidly in recent years,
inspiring major breakthroughs in text-guided shape generation. Recently, it has
been shown that using score distillation, one can successfully text-guide a
NeRF model to generate a 3D object. We adapt the score distillation to the
publicly available, and computationally efficient, Latent Diffusion Models,
which apply the entire diffusion process in a compact latent space of a
pretrained autoencoder. As NeRFs operate in image space, a naive solution for
guiding them with latent score distillation would require encoding to the
latent space at each guidance step. Instead, we propose to bring the NeRF to
the latent space, resulting in a Latent-NeRF. Analyzing our Latent-NeRF, we
show that while Text-to-3D models can generate impressive results, they are
inherently unconstrained and may lack the ability to guide or enforce a
specific 3D structure. To assist and direct the 3D generation, we propose to
guide our Latent-NeRF using a Sketch-Shape: an abstract geometry that defines
the coarse structure of the desired object. Then, we present means to integrate
such a constraint directly into a Latent-NeRF. This unique combination of text
and shape guidance allows for increased control over the generation process. We
also show that latent score distillation can be successfully applied directly
on 3D meshes. This allows for generating high-quality textures on a given
geometry. Our experiments validate the power of our different forms of guidance
and the efficiency of using latent rendering. Implementation is available at
https://github.com/eladrich/latent-nerf
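For intuition, here is a minimal, non-authoritative sketch of the two ideas above: a score-distillation step taken directly in latent space (so no per-step encoding is needed) and a soft shape constraint in the spirit of the Sketch-Shape guidance. The model names and every `nerf`/`shape` helper flagged in the comments are assumptions, not the authors' API; the official implementation linked above is the reference.

```python
# A minimal sketch of one latent score-distillation (SDS) step, assuming a
# Stable-Diffusion-style model loaded through Hugging Face diffusers, e.g.:
#   unet = UNet2DConditionModel.from_pretrained(
#       "CompVis/stable-diffusion-v1-4", subfolder="unet")
#   scheduler = DDPMScheduler.from_pretrained(
#       "CompVis/stable-diffusion-v1-4", subfolder="scheduler")
# `nerf.render_latent`, `nerf.occupancy`, and the `shape` helpers below are
# hypothetical placeholders.
import torch
import torch.nn.functional as F
from diffusers import DDPMScheduler, UNet2DConditionModel


def latent_sds_step(nerf, camera, unet, scheduler, text_emb, cfg_scale=100.0):
    # Render a 4-channel latent image directly from the NeRF; no encoder
    # pass is needed, which is the efficiency gain of working in latent space.
    z = nerf.render_latent(camera)  # (1, 4, 64, 64), in the VAE latent space

    # Sample a diffusion timestep and noise the rendered latents.
    t = torch.randint(0, scheduler.config.num_train_timesteps, (1,), device=z.device)
    noise = torch.randn_like(z)
    z_noisy = scheduler.add_noise(z, noise, t)

    # Predict the noise with the frozen UNet under classifier-free guidance;
    # text_emb stacks the [unconditional, conditional] text embeddings.
    with torch.no_grad():
        eps = unet(torch.cat([z_noisy] * 2), t, encoder_hidden_states=text_emb).sample
        eps_uncond, eps_cond = eps.chunk(2)
        eps = eps_uncond + cfg_scale * (eps_cond - eps_uncond)

    # SDS: treat (eps - noise) as the gradient w.r.t. z and push it into the
    # NeRF weights without differentiating through the UNet. The detach trick
    # below gives d(loss)/dz proportional to that gradient; timestep weighting
    # is omitted for brevity.
    grad = eps - noise
    loss = F.mse_loss(z, (z - grad).detach())
    loss.backward()


def sketch_shape_loss(nerf, shape, points, surface_band=0.02):
    # A plausible simplification of the Sketch-Shape constraint, NOT the
    # paper's exact formulation: push the NeRF's occupancy at sampled 3D
    # points toward the coarse shape's inside/outside labels, relaxing the
    # penalty in a thin band around the surface where detail may deviate.
    occ = nerf.occupancy(points).clamp(1e-5, 1 - 1e-5)  # hypothetical, in (0, 1)
    target = shape.contains(points).float()             # hypothetical inside test
    dist = shape.distance_to_surface(points)            # hypothetical
    bce = F.binary_cross_entropy(occ, target, reduction="none")
    return (bce * (dist > surface_band).float()).mean()
```

At visualization time, the 4-channel latent renderings are decoded to RGB with the pretrained autoencoder's decoder. The same SDS step applies unchanged when `z` instead comes from rendering a fixed mesh with a learnable latent texture map, which is how the paper generates textures on a given geometry (its Latent-Paint variant).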
Related papers
- GO-NeRF: Generating Virtual Objects in Neural Radiance Fields [75.13534508391852]
GO-NeRF utilizes scene context for high-quality and harmonious 3D object generation within an existing NeRF.
Our method employs a compositional rendering formulation that allows the generated 3D objects to be seamlessly composited into the scene.
arXiv Detail & Related papers (2024-01-11T08:58:13Z)
- Points-to-3D: Bridging the Gap between Sparse Points and Shape-Controllable Text-to-3D Generation [16.232803881159022]
We propose a flexible framework of Points-to-3D to bridge the gap between sparse yet freely available 3D points and realistic shape-controllable 3D generation.
The core idea of Points-to-3D is to introduce controllable sparse 3D points to guide the text-to-3D generation.
arXiv Detail & Related papers (2023-07-26T02:16:55Z)
- TextMesh: Generation of Realistic 3D Meshes From Text Prompts [56.2832907275291]
We propose a novel method for generating highly realistic-looking 3D meshes.
To this end, we extend NeRF to employ an SDF backbone, leading to improved 3D mesh extraction.
arXiv Detail & Related papers (2023-04-24T20:29:41Z)
- DITTO-NeRF: Diffusion-based Iterative Text To Omni-directional 3D Model [15.091263190886337]
We propose a novel pipeline to generate a high-quality 3D NeRF model from a text prompt or a single image.
DITTO-NeRF first constructs a high-quality partial 3D object for limited in-boundary (IB) angles using the given or text-generated 2D image from the frontal view.
We propose progressive 3D object reconstruction schemes in terms of scales (low to high resolution), angles (IB angles initially to outer-boundary (OB) angles later), and masks (object to background boundary) in our DITTO-NeRF.
arXiv Detail & Related papers (2023-04-06T02:27:22Z)
- NeRF-GAN Distillation for Efficient 3D-Aware Generation with Convolutions [97.27105725738016]
The integration of Neural Radiance Fields (NeRFs) and generative models, such as Generative Adversarial Networks (GANs), has transformed 3D-aware generation from single-view images.
We propose a simple and effective method, based on re-using the well-disentangled latent space of a pre-trained NeRF-GAN in a pose-conditioned convolutional network to directly generate 3D-consistent images corresponding to the underlying 3D representations.
arXiv Detail & Related papers (2023-03-22T18:59:48Z)
- 3D-CLFusion: Fast Text-to-3D Rendering with Contrastive Latent Diffusion [55.71215821923401]
We tackle the task of text-to-3D creation with pre-trained latent-based NeRFs (NeRFs that generate 3D objects given an input latent code).
We propose a novel method named 3D-CLFusion which leverages the pre-trained latent-based NeRFs and performs fast 3D content creation in less than a minute.
arXiv Detail & Related papers (2023-03-21T15:38:26Z)
- NerfDiff: Single-image View Synthesis with NeRF-guided Distillation from 3D-aware Diffusion [107.67277084886929]
Novel view synthesis from a single image requires inferring occluded regions of objects and scenes whilst simultaneously maintaining semantic and physical consistency with the input.
We propose NerfDiff, which addresses this issue by distilling the knowledge of a 3D-aware conditional diffusion model (CDM) into NeRF through synthesizing and refining a set of virtual views at test time.
We further propose a novel NeRF-guided distillation algorithm that simultaneously generates 3D consistent virtual views from the CDM samples, and finetunes the NeRF based on the improved virtual views.
arXiv Detail & Related papers (2023-02-20T17:12:00Z)
- 3D-LDM: Neural Implicit 3D Shape Generation with Latent Diffusion Models [8.583859530633417]
We propose a diffusion model for neural implicit representations of 3D shapes that operates in the latent space of an auto-decoder.
This allows us to generate diverse and high-quality 3D surfaces.
arXiv Detail & Related papers (2022-12-01T20:00:00Z)
- 3D-aware Image Synthesis via Learning Structural and Textural Representations [39.681030539374994]
We propose VolumeGAN for high-fidelity 3D-aware image synthesis, which explicitly learns a structural representation and a textural representation.
Our approach achieves substantially higher image quality and better 3D control than previous methods.
arXiv Detail & Related papers (2021-12-20T18:59:40Z)