UrbanWorld: An Urban World Model for 3D City Generation
- URL: http://arxiv.org/abs/2407.11965v2
- Date: Tue, 22 Oct 2024 08:31:44 GMT
- Title: UrbanWorld: An Urban World Model for 3D City Generation
- Authors: Yu Shang, Yuming Lin, Yu Zheng, Hangyu Fan, Jingtao Ding, Jie Feng, Jiansheng Chen, Li Tian, Yong Li,
- Abstract summary: UrbanWorld is a generative urban world model that can automatically create a customized, realistic and interactive 3D urban world with flexible control conditions.
We conduct extensive quantitative analysis on five visual metrics, demonstrating that UrbanWorld achieves SOTA generation realism.
We verify the interactive nature of these environments by showcasing the agent perception and navigation within the created environments.
- Score: 21.21375372182025
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cities, as the essential environment of human life, encompass diverse physical elements such as buildings, roads and vegetation, which continuously interact with dynamic entities like people and vehicles. Crafting realistic, interactive 3D urban environments is essential for nurturing AGI systems and constructing AI agents capable of perceiving, decision-making, and acting like humans in real-world environments. However, creating high-fidelity 3D urban environments usually entails extensive manual labor from designers, involving intricate detailing and representation of complex urban elements. Therefore, accomplishing this automatically remains a longstanding challenge. Toward this problem, we propose UrbanWorld, the first generative urban world model that can automatically create a customized, realistic and interactive 3D urban world with flexible control conditions. UrbanWorld incorporates four key stages in the generation pipeline: flexible 3D layout generation from OSM data or urban layout with semantic and height maps, urban scene design with Urban MLLM, controllable urban asset rendering via progressive 3D diffusion, and MLLM-assisted scene refinement. We conduct extensive quantitative analysis on five visual metrics, demonstrating that UrbanWorld achieves SOTA generation realism. Next, we provide qualitative results about the controllable generation capabilities of UrbanWorld using both textual and image-based prompts. Lastly, we verify the interactive nature of these environments by showcasing the agent perception and navigation within the created environments. We contribute UrbanWorld as an open-source tool available at https://github.com/Urban-World/UrbanWorld.
Related papers
- CityDreamer4D: Compositional Generative Model of Unbounded 4D Cities [44.203932215464214]
We propose a compositional generative model specifically tailored for generating 4D cities.
CityDreamer4D supports a range of downstream applications, such as instance editing, city stylization, and urban simulation.
arXiv Detail & Related papers (2025-01-15T17:59:56Z) - GenEx: Generating an Explorable World [59.0666303068111]
We introduce GenEx, a system capable of planning complex embodied world exploration, guided by its generative imagination.
GenEx generates an entire 3D-consistent imaginative environment from as little as a single RGB image.
GPT-assisted agents are equipped to perform complex embodied tasks, including both goal-agnostic exploration and goal-driven navigation.
arXiv Detail & Related papers (2024-12-12T18:59:57Z) - AerialGo: Walking-through City View Generation from Aerial Perspectives [48.53976414257845]
AerialGo is a framework that generates realistic walking-through city views from aerial images.
By conditioning ground-view synthesis on accessible aerial data, AerialGo bypasses the privacy risks inherent in ground-level imagery.
Experiments show that AerialGo significantly enhances ground-level realism and structural coherence.
arXiv Detail & Related papers (2024-11-29T08:14:07Z) - CityX: Controllable Procedural Content Generation for Unbounded 3D Cities [50.10101235281943]
Current generative methods fall short in either diversity, controllability, or fidelity.
In this work, we resort to the procedural content generation (PCG) technique for high-fidelity generation.
We develop a multi-agent framework to transform multi-modal instructions, including OSM, semantic maps, and satellite images, into executable programs.
Our method, named CityX, demonstrates its superiority in creating diverse, controllable, and realistic 3D urban scenes.
arXiv Detail & Related papers (2024-07-24T18:05:13Z) - MetaUrban: An Embodied AI Simulation Platform for Urban Micromobility [52.0930915607703]
Recent advances in Robotics and Embodied AI make public urban spaces no longer exclusive to humans.
Micromobility enabled by AI for short-distance travel in public urban spaces plays a crucial component in the future transportation system.
We present MetaUrban, a compositional simulation platform for the AI-driven urban micromobility research.
arXiv Detail & Related papers (2024-07-11T17:56:49Z) - CityCraft: A Real Crafter for 3D City Generation [25.7885801163556]
CityCraft is an innovative framework designed to enhance both the diversity and quality of urban scene generation.
Our approach integrates three key stages: initially, a diffusion transformer (DiT) model is deployed to generate diverse and controllable 2D city layouts.
Based on the generated layout and city plan, we utilize the asset retrieval module and Blender for precise asset placement and scene construction.
arXiv Detail & Related papers (2024-06-07T14:49:00Z) - Urban Scene Diffusion through Semantic Occupancy Map [49.20779809250597]
UrbanDiffusion is a 3D diffusion model conditioned on a Bird's-Eye View (BEV) map.
Our model learns the data distribution of scene-level structures within a latent space.
After training on real-world driving datasets, our model can generate a wide range of diverse urban scenes.
arXiv Detail & Related papers (2024-03-18T11:54:35Z) - Urban Generative Intelligence (UGI): A Foundational Platform for Agents
in Embodied City Environment [32.53845672285722]
Urban environments, characterized by their complex, multi-layered networks, face significant challenges in the face of rapid urbanization.
Recent developments in big data, artificial intelligence, urban computing, and digital twins have laid the groundwork for sophisticated city modeling and simulation.
This paper proposes Urban Generative Intelligence (UGI), a novel foundational platform integrating Large Language Models (LLMs) into urban systems.
arXiv Detail & Related papers (2023-12-19T03:12:13Z) - CityDreamer: Compositional Generative Model of Unbounded 3D Cities [44.203932215464214]
CityDreamer is a compositional generative model designed specifically for unbounded 3D cities.
We adopt the bird's eye view scene representation and employ a volumetric render for both instance-oriented and stuff-oriented neural fields.
CityDreamer achieves state-of-the-art performance not only in generating realistic 3D cities but also in localized editing within the generated cities.
arXiv Detail & Related papers (2023-09-01T17:57:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.