Related papers: SceneX:Procedural Controllable Large-scale Scene Generation via Large-language Models

URL: http://arxiv.org/abs/2403.15698v2
Date: Tue, 30 Jul 2024 15:41:41 GMT
Title: SceneX:Procedural Controllable Large-scale Scene Generation via Large-language Models
Authors: Mengqi Zhou, Yuxi Wang, Jun Hou, Chuanchen Luo, Zhaoxiang Zhang, Junran Peng
Abstract summary: We introduce a large-scale scene generation framework, SceneX, which can automatically produce high-quality procedural models according to designers' textual descriptions. Our SceneX can generate a city spanning 2.5 km × 2.5 km with delicate geometric layout and structures, drastically reducing the time cost from several weeks for professional PCG engineers to just a few hours for an ordinary user.
Score: 53.961002112433576
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Due to its great application potential, large-scale scene generation has drawn extensive attention in academia and industry. Recent research employs powerful generative models to create desired scenes and achieves promising results. However, most of these methods represent the scene using 3D primitives (e.g. point cloud or radiance field) incompatible with the industrial pipeline, which leads to a substantial gap between academic research and industrial deployment. Procedural Controllable Generation (PCG) is an efficient technique for creating scalable and high-quality assets, but it is unfriendly for ordinary users as it demands profound domain expertise. To address these issues, we resort to using the large language model (LLM) to drive the procedural modeling. In this paper, we introduce a large-scale scene generation framework, SceneX, which can automatically produce high-quality procedural models according to designers' textual descriptions. Specifically, the proposed method comprises two components, PCGBench and PCGPlanner. The former encompasses an extensive collection of accessible procedural assets and thousands of hand-crafted API documents. The latter aims to generate executable actions for Blender to produce controllable and precise 3D assets guided by the user's instructions. Our SceneX can generate a city spanning 2.5 km × 2.5 km with delicate layout and geometric structures, drastically reducing the time cost from several weeks for professional PCG engineers to just a few hours for an ordinary user. Extensive experiments demonstrated the capability of our method in controllable large-scale scene generation and editing, including asset placement and season translation.

Related papers

WorldCraft: Photo-Realistic 3D World Creation and Customization via LLM Agents [67.31920821192323]
We introduce WorldCraft, a system where large language model (LLM) agents leverage procedural generation to create scenes populated with objects. In our framework, a coordinator agent manages the overall process and works with two specialized LLM agents to complete the scene creation. Our pipeline incorporates a trajectory control agent, allowing users to animate the scene and operate the camera through natural language interactions.
arXiv Detail & Related papers (2025-02-21T17:18:30Z)
PhiP-G: Physics-Guided Text-to-3D Compositional Scene Generation [5.554872561486615]
We propose a novel framework for compositional scene generation, PhiP-G. PhiP-G seamlessly integrates generation techniques with layout guidance based on a world model. Experiments demonstrate that PhiP-G significantly enhances the generation quality and physical rationality of the compositional scenes.
arXiv Detail & Related papers (2025-02-02T07:47:03Z)
Proc-GS: Procedural Building Generation for City Assembly with 3D Gaussians [65.09942210464747]
Building asset creation is labor-intensive and requires specialized skills to develop design rules. Recent generative models for building creation often overlook these patterns, leading to low visual fidelity and limited scalability. By manipulating procedural code, we can streamline this process and generate an infinite variety of buildings.
arXiv Detail & Related papers (2024-12-10T16:45:32Z)
Generating Compositional Scenes via Text-to-image RGBA Instance Generation [82.63805151691024]
Text-to-image diffusion generative models can generate high quality images at the cost of tedious prompt engineering. We propose a novel multi-stage generation paradigm that is designed for fine-grained control, flexibility and interactivity. Our experiments show that our RGBA diffusion model is capable of generating diverse and high quality instances with precise control over object attributes.
arXiv Detail & Related papers (2024-11-16T23:44:14Z)
Architect: Generating Vivid and Interactive 3D Scenes with Hierarchical 2D Inpainting [47.014044892025346]
Architect is a generative framework that creates complex and realistic 3D embodied environments leveraging diffusion-based 2D image inpainting. Our pipeline is further extended to a hierarchical and iterative inpainting process to continuously generate placement of large furniture and small objects to enrich the scene.
arXiv Detail & Related papers (2024-11-14T22:15:48Z)
SceneCraft: Layout-Guided 3D Scene Generation [29.713491313796084]
SceneCraft is a novel method for generating detailed indoor scenes that adhere to textual descriptions and spatial layout preferences. Our method significantly outperforms existing approaches in complex indoor scene generation with diverse textures, consistent geometry, and realistic visual quality.
arXiv Detail & Related papers (2024-10-11T17:59:58Z)
CityX: Controllable Procedural Content Generation for Unbounded 3D Cities [55.737060358043536]
We propose a novel multi-modal controllable procedural content generation method, named CityX. It enhances realistic, unbounded 3D city generation guided by multiple layout conditions, including OSM, semantic maps, and satellite images. Through this effective framework, CityX shows the potential to build an innovative ecosystem for 3D scene generation.
arXiv Detail & Related papers (2024-07-24T18:05:13Z)
HoloDreamer: Holistic 3D Panoramic World Generation from Text Descriptions [31.342899807980654]
3D scene generation is in high demand across various domains, including virtual reality, gaming, and the film industry. We introduce HoloDreamer, a framework that first generates a high-definition panorama as a holistic initialization of the full 3D scene. We then leverage 3D Gaussian Splatting (3D-GS) to quickly reconstruct the 3D scene, thereby facilitating the creation of view-consistent and fully enclosed 3D scenes.
arXiv Detail & Related papers (2024-07-21T14:52:51Z)
PosterLLaVa: Constructing a Unified Multi-modal Layout Generator with LLM [58.67882997399021]
Our research introduces a unified framework for automated graphic layout generation. Our data-driven method employs structured text (JSON format) and visual instruction tuning to generate layouts. We develop an automated text-to-poster system that generates editable posters based on users' design intentions.
arXiv Detail & Related papers (2024-06-05T03:05:52Z)
CLAY: A Controllable Large-scale Generative Model for Creating High-quality 3D Assets [43.315487682462845]
CLAY is a 3D geometry and material generator designed to transform human imagination into intricate 3D digital structures. At its core is a large-scale generative model composed of a multi-resolution Variational Autoencoder (VAE) and a minimalistic latent Diffusion Transformer (DiT). We demonstrate using CLAY for a range of controllable 3D asset creations, from sketchy conceptual designs to production-ready assets with intricate details.
arXiv Detail & Related papers (2024-05-30T05:57:36Z)
3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation [51.64796781728106]
We propose a generative refinement network to synthesize new contents with higher quality by exploiting the natural image prior of a 2D diffusion model and the global 3D information of the current scene. Our approach supports a wide variety of scene generation and arbitrary camera trajectories with improved visual quality and 3D consistency.
arXiv Detail & Related papers (2024-03-14T14:31:22Z)
GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting [52.150502668874495]
We present GALA3D, generative 3D GAussians with LAyout-guided control, for effective compositional text-to-3D generation. GALA3D is a user-friendly, end-to-end framework for state-of-the-art scene-level 3D content generation and controllable editing.
arXiv Detail & Related papers (2024-02-11T13:40:08Z)
CommonScenes: Generating Commonsense 3D Indoor Scenes with Scene Graph Diffusion [83.30168660888913]
We present CommonScenes, a fully generative model that converts scene graphs into corresponding controllable 3D scenes. Our pipeline consists of two branches, one predicting the overall scene layout via a variational auto-encoder and the other generating compatible shapes. The generated scenes can be manipulated by editing the input scene graph and sampling the noise in the diffusion model.
arXiv Detail & Related papers (2023-05-25T17:39:13Z)
NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models [85.20004959780132]
We introduce NeuralField-LDM, a generative model capable of synthesizing complex 3D environments. We show how NeuralField-LDM can be used for a variety of 3D content creation applications, including conditional scene generation, scene inpainting and scene style manipulation.
arXiv Detail & Related papers (2023-04-19T16:13:21Z)
Compositional Transformers for Scene Generation [13.633811200719627]
We introduce the GANformer2 model, an iterative object-oriented transformer, explored for the task of generative modeling. We show it achieves state-of-the-art performance in terms of visual quality, diversity and consistency. Further experiments demonstrate the model's disentanglement and provide a deeper insight into its generative process.
arXiv Detail & Related papers (2021-11-17T08:11:42Z)

This list is automatically generated from the titles and abstracts of the papers on this site.