Deep Generative Models on 3D Representations: A Survey
- URL: http://arxiv.org/abs/2210.15663v3
- Date: Mon, 28 Aug 2023 03:02:16 GMT
- Title: Deep Generative Models on 3D Representations: A Survey
- Authors: Zifan Shi, Sida Peng, Yinghao Xu, Andreas Geiger, Yiyi Liao, and Yujun
Shen
- Abstract summary: Generative models aim to learn the distribution of observed data by generating new instances.
Recently, researchers have started to shift focus from 2D to 3D space.
Unlike 2D images, representing 3D data poses significantly greater challenges.
- Score: 81.73385191402419
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative models aim to learn the distribution of observed data by
generating new instances. With the advent of neural networks, deep generative
models, including variational autoencoders (VAEs), generative adversarial
networks (GANs), and diffusion models (DMs), have progressed remarkably in
synthesizing 2D images. Recently, researchers started to shift focus from 2D to
3D space, considering that 3D data is more closely aligned with our physical
world and holds immense practical potential. However, unlike 2D images, which
possess an inherent and efficient representation (i.e., a pixel grid),
representing 3D data poses significantly greater challenges. Ideally, a robust
3D representation should be capable of accurately modeling complex shapes and
appearances while being highly efficient in handling high-resolution data with
high processing speeds and low memory requirements. Regrettably, existing 3D
representations, such as point clouds, meshes, and neural fields, often fail to
satisfy all of these requirements simultaneously. In this survey, we thoroughly
review the ongoing developments of 3D generative models, including methods that
employ 2D and 3D supervision. Our analysis centers on generative models, with a
particular focus on the representations utilized in this context. We believe
our survey will help the community to track the field's evolution and to spark
innovative ideas to propel progress towards solving this challenging task.
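To make the representation trade-offs described above concrete, the short Python sketch below contrasts the three 3D representations the abstract names: an explicit point cloud, a mesh, and a neural (occupancy) field. It is purely illustrative; the array shapes, the `OccupancyField` module, and its layer sizes are arbitrary choices and are not taken from the survey.

```python
import numpy as np
import torch
import torch.nn as nn

# Point cloud: an unordered set of N points in R^3 (compact, but no surface connectivity).
points = np.random.rand(2048, 3).astype(np.float32)            # N x 3 xyz coordinates

# Mesh: explicit surface geometry as vertices plus triangular faces (here, one triangle).
vertices = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0]], dtype=np.float32)   # V x 3
faces = np.array([[0, 1, 2]], dtype=np.int64)                              # F x 3 vertex indices

# Neural field: a coordinate MLP mapping a 3D location to occupancy, trading explicit
# storage for a network evaluation at every queried point.
class OccupancyField(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, xyz):                  # xyz: (..., 3)
        return torch.sigmoid(self.net(xyz))  # occupancy probability in [0, 1]

field = OccupancyField()
query = torch.rand(1024, 3)                  # arbitrary query locations
occupancy = field(query)                     # (1024, 1), evaluated on demand
```

Point clouds and meshes store geometry explicitly and scale in memory with resolution, while the neural field keeps a roughly constant memory footprint at the cost of a network evaluation per query; this tension is exactly what the abstract argues no single representation resolves.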
Related papers
- Diffusion Models in 3D Vision: A Survey [11.116658321394755]
We review the state-of-the-art approaches that leverage diffusion models for 3D visual tasks.
These approaches include 3D object generation, shape completion, point cloud reconstruction, and scene understanding.
We discuss potential solutions, including improving computational efficiency, enhancing multimodal fusion, and exploring the use of large-scale pretraining.
arXiv Detail & Related papers (2024-10-07T04:12:23Z)
- Implicit-Zoo: A Large-Scale Dataset of Neural Implicit Functions for 2D Images and 3D Scenes [65.22070581594426]
"Implicit-Zoo" is a large-scale dataset requiring thousands of GPU training days to facilitate research and development in this field.
We showcase two immediate benefits, as the dataset enables us to: (1) learn token locations for transformer models; and (2) directly regress 3D camera poses of 2D images with respect to NeRF models.
This in turn leads to improved performance on all three tasks of image classification, semantic segmentation, and 3D pose regression, thereby unlocking new avenues for research.
arXiv Detail & Related papers (2024-06-25T10:20:44Z)
- DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data [50.164670363633704]
We present DIRECT-3D, a diffusion-based 3D generative model for creating high-quality 3D assets from text prompts.
Our model is directly trained on extensive noisy and unaligned 'in-the-wild' 3D assets.
We achieve state-of-the-art performance in both single-class generation and text-to-3D generation.
arXiv Detail & Related papers (2024-06-06T17:58:15Z)
- Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior [57.986512832738704]
We present a new framework, Sculpt3D, that equips the current pipeline with explicit injection of 3D priors from retrieved reference objects without re-training the 2D diffusion model.
Specifically, we demonstrate that high-quality and diverse 3D geometry can be guaranteed by keypoint supervision through a sparse ray sampling approach.
These two decoupled designs effectively harness 3D information from reference objects to generate 3D objects while preserving the generation quality of the 2D diffusion model.
arXiv Detail & Related papers (2024-03-14T07:39:59Z)
- Retrieval-Augmented Score Distillation for Text-to-3D Generation [30.57225047257049]
We introduce ReDream, a novel framework for retrieval-based quality enhancement in text-to-3D generation.
We conduct extensive experiments to demonstrate that ReDream exhibits superior quality with increased geometric consistency.
arXiv Detail & Related papers (2024-02-05T12:50:30Z)
- Progress and Prospects in 3D Generative AI: A Technical Overview including 3D human [51.58094069317723]
This paper aims to provide a comprehensive overview and summary of the relevant papers published mostly during the latter half of 2023.
It begins by discussing AI-generated 3D object models, followed by generated 3D human models, and finally generated 3D human motions, culminating in a conclusive summary and a vision for the future.
arXiv Detail & Related papers (2024-01-05T03:41:38Z)
- 3D GANs and Latent Space: A comprehensive survey [0.0]
3D GANs are a new type of generative model used for 3D reconstruction, point cloud reconstruction, and 3D semantic scene completion.
The choice of distribution for noise is critical as it represents the latent space.
In this work, we explore the latent space and 3D GANs, examine several GAN variants and training methods to gain insights into improving 3D GAN training, and suggest potential future directions for further research.
arXiv Detail & Related papers (2023-04-08T06:36:07Z)
- HoloDiffusion: Training a 3D Diffusion Model using 2D Images [71.1144397510333]
We introduce a new diffusion setup that can be trained, end-to-end, with only posed 2D images for supervision.
We show that our diffusion models are scalable, train robustly, and are competitive in terms of sample quality and fidelity to existing approaches for 3D generative modeling.
arXiv Detail & Related papers (2023-03-29T07:35:56Z)
- 3D Neural Field Generation using Triplane Diffusion [37.46688195622667]
We present an efficient diffusion-based model for 3D-aware generation of neural fields.
Our approach pre-processes training data, such as ShapeNet meshes, by converting them to continuous occupancy fields (a minimal sketch of such a conversion appears after this list).
We demonstrate state-of-the-art results on 3D generation on several object classes from ShapeNet.
arXiv Detail & Related papers (2022-11-30T01:55:52Z)
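The Triplane Diffusion entry above mentions converting ShapeNet meshes into continuous occupancy fields as a pre-processing step. The snippet below is a rough sketch of one common way such occupancy supervision can be sampled, assuming the `trimesh` library and a watertight input mesh; the helper name `sample_occupancy`, the sample counts, and the noise scale are illustrative and are not taken from the paper.

```python
import numpy as np
import trimesh

# Stand-in for a ShapeNet model; a real pipeline would load each mesh from disk.
mesh = trimesh.creation.icosphere(subdivisions=3, radius=0.5)

def sample_occupancy(mesh, n_uniform=4096, n_surface=4096, noise=0.02):
    """Sample (point, inside/outside) pairs approximating the mesh's occupancy field."""
    lo, hi = mesh.bounds                                       # axis-aligned bounding box corners
    uniform = np.random.uniform(lo, hi, size=(n_uniform, 3))   # coarse coverage of the volume
    near = mesh.sample(n_surface) + noise * np.random.randn(n_surface, 3)  # detail near the surface
    points = np.concatenate([uniform, near], axis=0)
    labels = mesh.contains(points).astype(np.float32)          # 1 inside, 0 outside (mesh must be watertight)
    return points.astype(np.float32), labels

points, occupancy = sample_occupancy(mesh)
print(points.shape, occupancy.mean())   # e.g. (8192, 3) and the fraction of inside points
```

The resulting (point, occupancy) pairs would then supervise a coordinate network such as the occupancy MLP sketched earlier, with the icosphere stand-in replaced by actual ShapeNet models.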