3D Scene Generation: A Survey
- URL: http://arxiv.org/abs/2505.05474v1
- Date: Thu, 08 May 2025 17:59:54 GMT
- Title: 3D Scene Generation: A Survey
- Authors: Beichen Wen, Haozhe Xie, Zhaoxi Chen, Fangzhou Hong, Ziwei Liu
- Abstract summary: 3D scene generation seeks to synthesize spatially structured, semantically meaningful, and photorealistic environments for applications such as immersive media, robotics, autonomous driving, and embodied AI. This review organizes recent advances in 3D scene generation and highlights promising directions at the intersection of generative AI, 3D vision, and embodied intelligence.
- Score: 41.202497008985425
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D scene generation seeks to synthesize spatially structured, semantically meaningful, and photorealistic environments for applications such as immersive media, robotics, autonomous driving, and embodied AI. Early methods based on procedural rules offered scalability but limited diversity. Recent advances in deep generative models (e.g., GANs, diffusion models) and 3D representations (e.g., NeRF, 3D Gaussians) have enabled the learning of real-world scene distributions, improving fidelity, diversity, and view consistency. Approaches such as diffusion models further bridge 3D scene synthesis and photorealism by reframing generation as image or video synthesis problems. This survey provides a systematic overview of state-of-the-art approaches, organizing them into four paradigms: procedural generation, neural 3D-based generation, image-based generation, and video-based generation. We analyze their technical foundations, trade-offs, and representative results, and review commonly used datasets, evaluation protocols, and downstream applications. We conclude by discussing key challenges in generation capacity, 3D representation, data and annotations, and evaluation, and outline promising directions including higher fidelity, physics-aware and interactive generation, and unified perception-generation models. To track ongoing developments, we maintain an up-to-date project page: https://github.com/hzxie/Awesome-3D-Scene-Generation.
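The abstract's mention of 3D Gaussians as a scene representation can be made concrete with a small sketch. The following is an illustrative Python example, not code from the survey or its project page: the class name, field layout, and `density` query are assumptions chosen to show how a splatting-style scene is typically parameterized (per-Gaussian mean, scale, rotation, opacity, and color).

```python
# Illustrative sketch only (assumed names/shapes, not from the survey):
# a minimal 3D Gaussian scene container and a point-wise density query.
from dataclasses import dataclass
import numpy as np


@dataclass
class GaussianScene:
    means: np.ndarray      # (N, 3) Gaussian centers in world space
    scales: np.ndarray     # (N, 3) per-axis standard deviations
    rotations: np.ndarray  # (N, 3, 3) rotation matrices (local Gaussian axes)
    opacities: np.ndarray  # (N,)  opacity in [0, 1]
    colors: np.ndarray     # (N, 3) RGB in [0, 1]

    def density(self, x: np.ndarray) -> np.ndarray:
        """Evaluate each Gaussian's opacity-weighted, unnormalized density at point x."""
        d = x[None, :] - self.means                              # (N, 3) offsets
        # Rotate offsets into each Gaussian's local frame, then whiten by scale.
        local = np.einsum("nij,nj->ni", self.rotations.transpose(0, 2, 1), d)
        z = local / self.scales
        return self.opacities * np.exp(-0.5 * np.sum(z * z, axis=-1))


# Toy usage: two isotropic Gaussians queried at the origin.
scene = GaussianScene(
    means=np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]),
    scales=np.full((2, 3), 0.3),
    rotations=np.stack([np.eye(3)] * 2),
    opacities=np.array([0.9, 0.5]),
    colors=np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]),
)
print(scene.density(np.zeros(3)))  # the first Gaussian dominates at its own center
```

Real splatting renderers project such Gaussians to the image plane and alpha-composite them front to back; the point-wise density above only illustrates the underlying parameterization.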
Related papers
- Advances in Feed-Forward 3D Reconstruction and View Synthesis: A Survey [154.50661618628433]
3D reconstruction and view synthesis are foundational problems in computer vision, graphics, and immersive technologies such as augmented reality (AR), virtual reality (VR), and digital twins. Recent advances in feed-forward approaches, driven by deep learning, have revolutionized this field by enabling fast and generalizable 3D reconstruction and view synthesis.
arXiv Detail & Related papers (2025-07-19T06:13:25Z)
- HuGeDiff: 3D Human Generation via Diffusion with Gaussian Splatting [33.9893684177763]
Current methods struggle with fine detail, accurate rendering of hands and faces, human realism, and controllability over appearance. We present a weakly supervised pipeline that aims to address these challenges. We demonstrate orders-of-magnitude speed-ups in 3D human generation compared to state-of-the-art approaches.
arXiv Detail & Related papers (2025-06-04T18:11:23Z)
- Recent Advance in 3D Object and Scene Generation: A Survey [14.673302810271219]
This survey aims to provide readers with a structured understanding of state-of-the-art 3D generation technologies. We focus on three dominant paradigms: layout-guided compositional synthesis, 2D prior-based scene generation, and rule-driven modeling.
arXiv Detail & Related papers (2025-04-16T03:22:06Z)
- Prometheus: 3D-Aware Latent Diffusion Models for Feed-Forward Text-to-3D Scene Generation [51.36926306499593]
Prometheus is a 3D-aware latent diffusion model for text-to-3D generation at both object and scene levels in seconds. We formulate 3D scene generation as multi-view, feed-forward, pixel-aligned 3D Gaussian generation within the latent diffusion paradigm.
arXiv Detail & Related papers (2024-12-30T17:44:23Z)
- 3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation [51.64796781728106]
We propose a generative refinement network to synthesize new content with higher quality by exploiting the natural image prior of the 2D diffusion model and the global 3D information of the current scene.
Our approach supports generation of a wide variety of scenes and arbitrary camera trajectories with improved visual quality and 3D consistency.
arXiv Detail & Related papers (2024-03-14T14:31:22Z)
- Advances in 3D Generation: A Survey [54.95024616672868]
The field of 3D content generation is developing rapidly, enabling the creation of increasingly high-quality and diverse 3D models.
Specifically, we introduce the 3D representations that serve as the backbone for 3D generation.
We provide a comprehensive overview of the rapidly growing literature on generation methods, categorized by the type of algorithmic paradigms.
arXiv Detail & Related papers (2024-01-31T13:06:48Z)
- Progress and Prospects in 3D Generative AI: A Technical Overview including 3D human [51.58094069317723]
This paper aims to provide a comprehensive overview and summary of the relevant papers published mostly during the latter half of 2023.
It begins by discussing AI-generated 3D object models, followed by generated 3D human models, and finally generated 3D human motions, culminating in a conclusive summary and a vision for the future.
arXiv Detail & Related papers (2024-01-05T03:41:38Z)
- GINA-3D: Learning to Generate Implicit Neural Assets in the Wild [38.51391650845503]
GINA-3D is a generative model that uses real-world driving data from camera and LiDAR sensors to create 3D implicit neural assets of diverse vehicles and pedestrians.
We construct a large-scale object-centric dataset containing over 1.2M images of vehicles and pedestrians.
We demonstrate that it achieves state-of-the-art performance in quality and diversity for both generated images and geometries.
arXiv Detail & Related papers (2023-04-04T23:41:20Z)
- Deep Generative Models on 3D Representations: A Survey [81.73385191402419]
Generative models aim to learn the distribution of observed data by generating new instances.
Recently, researchers have started to shift their focus from 2D to 3D space.
However, representing 3D data poses significantly greater challenges than representing 2D data.
arXiv Detail & Related papers (2022-10-27T17:59:50Z)