Related papers: A Survey On Text-to-3D Contents Generation In The Wild

URL: http://arxiv.org/abs/2405.09431v1
Date: Wed, 15 May 2024 15:23:22 GMT
Title: A Survey On Text-to-3D Contents Generation In The Wild
Authors: Chenhan Jiang
Abstract summary: 3D content creation plays a vital role in various applications, such as gaming, robotics simulation, and virtual reality. Text-to-3D generation technologies have emerged as a promising solution for automating 3D creation.
Score: 5.875257756382124
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: 3D content creation plays a vital role in various applications, such as gaming, robotics simulation, and virtual reality. However, the process is labor-intensive and time-consuming, requiring skilled designers to invest considerable effort in creating a single 3D asset. To address this challenge, text-to-3D generation technologies have emerged as a promising solution for automating 3D creation. Leveraging the success of large vision language models, these techniques aim to generate 3D content based on textual descriptions. Despite recent advancements in this area, existing solutions still face significant limitations in terms of generation quality and efficiency. In this survey, we conduct an in-depth investigation of the latest text-to-3D creation methods. We provide a comprehensive background on text-to-3D creation, including discussions on datasets employed in training and evaluation metrics used to assess the quality of generated 3D models. Then, we delve into the various 3D representations that serve as the foundation for the 3D generation process. Furthermore, we present a thorough comparison of the rapidly growing literature on generative pipelines, categorizing them into feedforward generators, optimization-based generation, and view reconstruction approaches. By examining the strengths and weaknesses of these methods, we aim to shed light on their respective capabilities and limitations. Lastly, we point out several promising avenues for future research. With this survey, we hope to inspire researchers further to explore the potential of open-vocabulary text-conditioned 3D content creation.

Related papers

AI-powered Contextual 3D Environment Generation: A Systematic Review [49.1574468325115]
This study performs a systematic review of existing generative AI techniques for 3D scene generation. By examining state-of-the-art approaches, it presents key challenges such as scene authenticity and the influence of textual inputs.
arXiv Detail & Related papers (2025-06-05T15:56:28Z)
Recent Advance in 3D Object and Scene Generation: A Survey [14.673302810271219]
This survey aims to provide readers with a structured understanding of state-of-the-art 3D generation technologies. We focus on three dominant paradigms: layout-guided compositional synthesis, 2D prior-based scene generation, and rule-driven modeling.
arXiv Detail & Related papers (2025-04-16T03:22:06Z)
GaussianAnything: Interactive Point Cloud Flow Matching For 3D Object Generation [75.39457097832113]
This paper introduces a novel 3D generation framework, offering scalable, high-quality 3D generation with an interactive Point Cloud-structured Latent space. Our framework employs a Variational Autoencoder with multi-view posed RGB-D(epth)-N(ormal) renderings as input, using a unique latent space design that preserves 3D shape information. The proposed method, GaussianAnything, supports multi-modal conditional 3D generation, allowing for point cloud, caption, and single image inputs.
arXiv Detail & Related papers (2024-11-12T18:59:32Z)
Text-to-3D Shape Generation [18.76771062964711]
Computational systems that can perform text-to-3D shape generation have captivated the popular imagination. We provide a survey of the underlying technology and methods enabling text-to-3D shape generation to summarize the background literature. We then derive a systematic categorization of recent work on text-to-3D shape generation based on the type of supervision data required.
arXiv Detail & Related papers (2024-03-20T04:03:44Z)
3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation [51.64796781728106]
We propose a generative refinement network to synthesize new contents with higher quality by exploiting the natural image prior of the 2D diffusion model and the global 3D information of the current scene. Our approach supports a wide variety of scene generation and arbitrary camera trajectories with improved visual quality and 3D consistency.
arXiv Detail & Related papers (2024-03-14T14:31:22Z)
A Comprehensive Survey on 3D Content Generation [148.434661725242]
3D content generation has both academic and practical value. A new taxonomy is proposed that categorizes existing approaches into three types: 3D native generative methods, 2D prior-based 3D generative methods, and hybrid 3D generative methods.
arXiv Detail & Related papers (2024-02-02T06:20:44Z)
Advances in 3D Generation: A Survey [54.95024616672868]
The field of 3D content generation is developing rapidly, enabling the creation of increasingly high-quality and diverse 3D models. Specifically, we introduce the 3D representations that serve as the backbone for 3D generation. We provide a comprehensive overview of the rapidly growing literature on generation methods, categorized by the type of algorithmic paradigms.
arXiv Detail & Related papers (2024-01-31T13:06:48Z)
Progress and Prospects in 3D Generative AI: A Technical Overview including 3D human [51.58094069317723]
This paper aims to provide a comprehensive overview and summary of the relevant papers published mostly during the latter half of 2023. It will begin by discussing the AI-generated object models in 3D, followed by the generated 3D human models, and finally, the generated 3D human motions, culminating in a conclusive summary and a vision for the future.
arXiv Detail & Related papers (2024-01-05T03:41:38Z)
VolumeDiffusion: Flexible Text-to-3D Generation with Efficient Volumetric Encoder [56.59814904526965]
This paper introduces a pioneering 3D encoder designed for text-to-3D generation. A lightweight network is developed to efficiently acquire feature volumes from multi-view images. The 3D volumes are then trained on a diffusion model for text-to-3D generation using a 3D U-Net.
arXiv Detail & Related papers (2023-12-18T18:59:05Z)
T$^3$Bench: Benchmarking Current Progress in Text-to-3D Generation [52.029698642883226]
Methods in text-to-3D leverage powerful pretrained diffusion models to optimize NeRF. Most studies evaluate their results with subjective case studies and user experiments. We introduce T$^3$Bench, the first comprehensive text-to-3D benchmark.
arXiv Detail & Related papers (2023-10-04T17:12:18Z)
Generative AI meets 3D: A Survey on Text-to-3D in AIGC Era [36.66506237523448]
Generative AI has made significant progress in recent years, with text-guided content generation being the most practical. Thanks to advancements in text-to-image and 3D modeling technologies, like neural radiance field (NeRF), text-to-3D has emerged as a nascent yet highly active research field.
arXiv Detail & Related papers (2023-05-10T13:26:08Z)

This list is automatically generated from the titles and abstracts of the papers on this site.