SceneX: Procedural Controllable Large-scale Scene Generation
- URL: http://arxiv.org/abs/2403.15698v3
- Date: Tue, 17 Dec 2024 14:39:07 GMT
- Title: SceneX: Procedural Controllable Large-scale Scene Generation
- Authors: Mengqi Zhou, Yuxi Wang, Jun Hou, Shougao Zhang, Yiwei Li, Chuanchen Luo, Junran Peng, Zhaoxiang Zhang
- Abstract summary: We introduce SceneX, which can automatically produce high-quality procedural models according to designers' textual descriptions.
The proposed method comprises two components, PCGHub and PCGPlanner.
The latter aims to generate executable actions for Blender to produce controllable and precise 3D assets guided by the user's instructions.
- Score: 52.4743878200172
- Abstract: Developing comprehensive explicit world models is crucial for understanding and simulating real-world scenarios. Recently, Procedural Controllable Generation (PCG) has gained significant attention in large-scale scene generation by enabling the creation of scalable, high-quality assets. However, PCG faces challenges such as limited modular diversity, high expertise requirements, and difficulty managing the diverse elements and structures of complex scenes. In this paper, we introduce a large-scale scene generation framework, SceneX, which can automatically produce high-quality procedural models according to designers' textual descriptions. Specifically, the proposed method comprises two components: PCGHub and PCGPlanner. The former encompasses an extensive collection of accessible procedural assets and thousands of hand-crafted API documents that serve as a standard protocol for the PCG controller. The latter generates executable actions for Blender to produce controllable and precise 3D assets guided by the user's instructions. Extensive experiments demonstrate our method's capability in controllable large-scale scene generation, including nature scenes and unbounded cities, as well as scene editing such as asset placement and season translation.
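As a concrete illustration of the "executable actions for Blender" idea, the following is a minimal sketch of a planner-to-Blender dispatch loop. The JSON action schema, the place_asset and set_season helpers, and the dispatch table are illustrative assumptions, not SceneX's actual PCGPlanner interface.

```python
# Minimal sketch of executing planner output as Blender actions. Runs only
# inside Blender's Python environment (where the `bpy` module exists). The
# action schema and helpers are illustrative assumptions, not SceneX's API.
import json
import bpy

def place_asset(name, location):
    """Stand-in for asset placement: add a placeholder cube where a
    procedural asset would be instantiated."""
    bpy.ops.mesh.primitive_cube_add(location=location)
    bpy.context.active_object.name = name

def set_season(season):
    """Stand-in for season translation: tint the world background color.
    Assumes the default node-based world with a "Background" node."""
    tints = {"summer": (0.4, 0.7, 0.4, 1.0), "winter": (0.8, 0.85, 0.9, 1.0)}
    node = bpy.context.scene.world.node_tree.nodes["Background"]
    node.inputs["Color"].default_value = tints.get(season, (1.0, 1.0, 1.0, 1.0))

# Explicit dispatch table mapping action names to scene operations.
ACTIONS = {"place_asset": place_asset, "set_season": set_season}

def run_plan(plan_json):
    """Execute a JSON list of actions, e.g. emitted by an LLM planner."""
    for step in json.loads(plan_json):
        ACTIONS[step["action"]](*step["args"])

run_plan('[{"action": "place_asset", "args": ["oak_tree", [0, 0, 0]]}, '
         '{"action": "set_season", "args": ["winter"]}]')
```

Constraining the planner to an explicit dispatch table is one way free-form LLM output can be reduced to a small, validated vocabulary of scene operations.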
Related papers
- PhiP-G: Physics-Guided Text-to-3D Compositional Scene Generation [5.554872561486615]
We propose PhiP-G, a novel framework for compositional scene generation.
PhiP-G seamlessly integrates generation techniques with layout guidance based on a world model.
Experiments demonstrate that PhiP-G significantly enhances the generation quality and physical rationality of the compositional scenes.
arXiv Detail & Related papers (2025-02-02T07:47:03Z)
- Interactive Scene Authoring with Specialized Generative Primitives [25.378818867764323]
Specialized Generative Primitives is a generative framework that allows non-expert users to author high-quality 3D scenes.
Each primitive is an efficient generative model that captures the distribution of a single exemplar from the real world.
We showcase interactive sessions where various primitives are extracted from real-world scenes and controlled to create 3D assets and scenes in a few minutes.
arXiv Detail & Related papers (2024-12-20T04:39:50Z)
- Proc-GS: Procedural Building Generation for City Assembly with 3D Gaussians [65.09942210464747]
Building asset creation is labor-intensive and requires specialized skills to develop design rules.
Recent generative models for building creation often overlook these patterns, leading to low visual fidelity and limited scalability.
By manipulating procedural code, we can streamline this process and generate an infinite variety of buildings (a toy sketch of this idea follows after this list).
arXiv Detail & Related papers (2024-12-10T16:45:32Z)
- Generating Compositional Scenes via Text-to-image RGBA Instance Generation [82.63805151691024]
Text-to-image diffusion generative models can generate high-quality images at the cost of tedious prompt engineering.
We propose a novel multi-stage generation paradigm that is designed for fine-grained control, flexibility and interactivity.
Our experiments show that our RGBA diffusion model is capable of generating diverse, high-quality instances with precise control over object attributes.
arXiv Detail & Related papers (2024-11-16T23:44:14Z)
- PosterLLaVa: Constructing a Unified Multi-modal Layout Generator with LLM [58.67882997399021]
Our research introduces a unified framework for automated graphic layout generation.
Our data-driven method employs structured text (JSON format) and visual instruction tuning to generate layouts.
We develop an automated text-to-poster system that generates editable posters based on users' design intentions.
arXiv Detail & Related papers (2024-06-05T03:05:52Z)
- CG3D: Compositional Generation for Text-to-3D via Gaussian Splatting [57.14748263512924]
CG3D is a method for compositionally generating scalable 3D assets.
Gaussian radiance fields, parameterized to allow for compositions of objects, enable semantically and physically consistent scenes.
arXiv Detail & Related papers (2023-11-29T18:55:38Z)
- CommonScenes: Generating Commonsense 3D Indoor Scenes with Scene Graph Diffusion [83.30168660888913]
We present CommonScenes, a fully generative model that converts scene graphs into corresponding controllable 3D scenes.
Our pipeline consists of two branches, one predicting the overall scene layout via a variational auto-encoder and the other generating compatible shapes.
The generated scenes can be manipulated by editing the input scene graph and sampling the noise in the diffusion model.
arXiv Detail & Related papers (2023-05-25T17:39:13Z)
- Compositional Transformers for Scene Generation [13.633811200719627]
We introduce the GANformer2 model, an iterative, object-oriented transformer for generative modeling.
We show it achieves state-of-the-art performance in terms of visual quality, diversity and consistency.
Further experiments demonstrate the model's disentanglement and provide a deeper insight into its generative process.
arXiv Detail & Related papers (2021-11-17T08:11:42Z)
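As referenced in the Proc-GS entry above, a small amount of procedural code can stand in for an unlimited library of building assets. The toy sketch below illustrates that idea in plain Python: a few sampled parameters drive simple stacking rules, so every seed yields a distinct building. The parameters and rules are invented for illustration and are not Proc-GS's actual procedural system.

```python
# Toy procedural building generator: sampled parameters plus simple
# stacking rules yield a new building for every seed. Invented for
# illustration; not Proc-GS's actual procedural system.
import random
from dataclasses import dataclass

@dataclass
class Box:
    """Axis-aligned block: (x, y, z) center and (w, d, h) size in meters."""
    center: tuple
    size: tuple

def generate_building(seed):
    """Sample building parameters and expand them into stacked blocks."""
    rng = random.Random(seed)
    floors = rng.randint(3, 12)                     # number of storeys
    w, d = rng.uniform(8, 20), rng.uniform(8, 20)   # footprint in meters
    taper = rng.uniform(0.9, 1.0)                   # upper floors shrink slightly
    boxes = []
    for i in range(floors):
        scale = taper ** i
        boxes.append(Box(center=(0.0, 0.0, i * 3.0 + 1.5),
                         size=(w * scale, d * scale, 3.0)))
    if rng.random() < 0.5:                          # optional rooftop structure
        boxes.append(Box(center=(0.0, 0.0, floors * 3.0 + 1.0),
                         size=(w * 0.3, d * 0.3, 2.0)))
    return boxes

# Different seeds, different buildings: the rule set acts as an infinite asset library.
for seed in range(3):
    print(f"building {seed}: {len(generate_building(seed))} blocks")
```

Because a seed deterministically fixes every downstream choice, a procedural asset reduces to a seed plus the rule set, which is what lets such approaches scale to whole cities.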