Generative AI for Urban Planning: Synthesizing Satellite Imagery via Diffusion Models
- URL: http://arxiv.org/abs/2505.08833v1
- Date: Tue, 13 May 2025 04:55:38 GMT
- Title: Generative AI for Urban Planning: Synthesizing Satellite Imagery via Diffusion Models
- Authors: Qingyi Wang, Yuebing Liang, Yunhan Zheng, Kaiyuan Xu, Jinhua Zhao, Shenhao Wang
- Abstract summary: We adapt a state-of-the-art Stable Diffusion model, extended with ControlNet, to generate high-fidelity satellite imagery conditioned on land use descriptions, infrastructure, and natural environments. Using data from three major U.S. cities, we demonstrate that the proposed diffusion model generates realistic and diverse urban landscapes by varying land-use configurations, road networks, and water bodies. Our model achieves strong FID and KID scores and demonstrates robustness across diverse urban contexts.
- Score: 9.385767746826286
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative AI offers new opportunities for automating urban planning by creating site-specific urban layouts and enabling flexible design exploration. However, existing approaches often struggle to produce realistic and practical designs at scale. Therefore, we adapt a state-of-the-art Stable Diffusion model, extended with ControlNet, to generate high-fidelity satellite imagery conditioned on land use descriptions, infrastructure, and natural environments. To overcome data availability limitations, we spatially link satellite imagery with structured land use and constraint information from OpenStreetMap. Using data from three major U.S. cities, we demonstrate that the proposed diffusion model generates realistic and diverse urban landscapes by varying land-use configurations, road networks, and water bodies, facilitating cross-city learning and design diversity. We also systematically evaluate the impacts of varying language prompts and control imagery on the quality of satellite imagery generation. Our model achieves high FID and KID scores and demonstrates robustness across diverse urban contexts. Qualitative assessments from urban planners and the general public show that generated images align closely with design descriptions and constraints, and are often preferred over real images. This work establishes a benchmark for controlled urban imagery generation and highlights the potential of generative AI as a tool for enhancing planning workflows and public engagement.
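For readers who want to see roughly what this conditioning setup looks like in code, the sketch below pairs a pretrained Stable Diffusion pipeline with a ControlNet branch via the Hugging Face diffusers library: a text prompt describing the intended land use and a constraint image rasterized from OpenStreetMap (roads, water bodies) jointly condition the generated satellite tile. This is a minimal sketch with off-the-shelf checkpoints; the checkpoint names, prompt wording, and constraint file are illustrative assumptions, not the authors' released models or data.

```python
# Minimal sketch (not the authors' code): generate a satellite-style tile with
# Stable Diffusion + ControlNet, conditioned on a land-use text prompt and an
# OpenStreetMap-derived constraint map. Checkpoints and file names are
# illustrative stand-ins; the paper fine-tunes its own ControlNet.
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# Any pretrained ControlNet that accepts an RGB conditioning image can stand in
# here (the segmentation-conditioned checkpoint is a reasonable placeholder).
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-seg", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# Constraint image: road network and water bodies rasterized from
# OpenStreetMap and aligned to the target tile (hypothetical 512x512 file).
constraint_map = Image.open("osm_constraints_tile_042.png").convert("RGB")

# The land-use description acts as the text condition.
prompt = (
    "satellite image, medium-density residential blocks with a commercial "
    "corridor along the main road, street trees, small parks"
)

image = pipe(
    prompt,
    image=constraint_map,
    num_inference_steps=30,
    guidance_scale=7.5,
).images[0]
image.save("generated_satellite_tile.png")
```

Swapping the prompt or redrawing the constraint map is what would let a planner explore alternative layouts for the same site, which is the kind of design exploration the abstract describes.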
Related papers
- Unsupervised Urban Land Use Mapping with Street View Contrastive Clustering and a Geographical Prior [16.334202302817783]
This study introduces an unsupervised contrastive clustering model for street view images with a built-in geographical prior. We experimentally show that our method can generate land use maps from geotagged street view image datasets of two cities.
arXiv Detail & Related papers (2025-04-24T13:41:27Z)
- AerialGo: Walking-through City View Generation from Aerial Perspectives [48.53976414257845]
AerialGo is a framework that generates realistic walking-through city views from aerial images. By conditioning ground-view synthesis on accessible aerial data, AerialGo bypasses the privacy risks inherent in ground-level imagery. Experiments show that AerialGo significantly enhances ground-level realism and structural coherence.
arXiv Detail & Related papers (2024-11-29T08:14:07Z)
- CrossViewDiff: A Cross-View Diffusion Model for Satellite-to-Street View Synthesis [54.852701978617056]
CrossViewDiff is a cross-view diffusion model for satellite-to-street view synthesis.
To address the challenges posed by the large discrepancy across views, we design the satellite scene structure estimation and cross-view texture mapping modules.
To achieve a more comprehensive evaluation of the synthesis results, we additionally design a GPT-based scoring method.
arXiv Detail & Related papers (2024-08-27T03:41:44Z)
- Urban Scene Diffusion through Semantic Occupancy Map [49.20779809250597]
UrbanDiffusion is a 3D diffusion model conditioned on a Bird's-Eye View (BEV) map.
Our model learns the data distribution of scene-level structures within a latent space.
After training on real-world driving datasets, our model can generate a wide range of diverse urban scenes.
arXiv Detail & Related papers (2024-03-18T11:54:35Z)
- UrbanGenAI: Reconstructing Urban Landscapes using Panoptic Segmentation and Diffusion Models [0.0]
This paper presents a novel workflow encapsulated within a prototype application, designed to leverage the synergies between advanced image segmentation and diffusion models for a comprehensive approach to urban design.
Validation results indicated strong performance by the prototype application, with high accuracy in both object detection and text-to-image generation.
Preliminary testing included utilising UrbanGenAI as an educational tool enhancing the learning experience in design pedagogy, and as a participatory instrument facilitating community-driven urban planning.
arXiv Detail & Related papers (2024-01-25T18:30:46Z)
- Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion [77.34078223594686]
We propose a novel architecture for direct 3D scene generation by introducing diffusion models into 3D sparse representations and combining them with neural rendering techniques.
Specifically, our approach generates texture colors at the point level for a given geometry using a 3D diffusion model first, which is then transformed into a scene representation in a feed-forward manner.
Experiments in two city-scale datasets show that our model demonstrates proficiency in generating photo-realistic street-view image sequences and cross-view urban scenes from satellite imagery.
arXiv Detail & Related papers (2024-01-19T16:15:37Z)
- Street-View Image Generation from a Bird's-Eye View Layout [95.36869800896335]
Bird's-Eye View (BEV) Perception has received increasing attention in recent years.
Data-driven simulation for autonomous driving has been a focal point of recent research.
We propose BEVGen, a conditional generative model that synthesizes realistic and spatially consistent surrounding images.
arXiv Detail & Related papers (2023-01-11T18:39:34Z)
- Generative methods for Urban design and rapid solution space exploration [13.222198221605701]
This research introduces an implementation of a tensor-field-based generative urban modeling toolkit.
Our method encodes contextual constraints such as waterfront edges, terrain, view-axis, existing streets, landmarks, and non-geometric design inputs.
This allows users to generate many diverse urban fabric configurations that resemble real-world cities from very few model inputs.
arXiv Detail & Related papers (2022-12-13T17:58:02Z)
- Deep Human-guided Conditional Variational Generative Modeling for Automated Urban Planning [30.614010268762115]
Urban planning designs land-use configurations and can help build livable, sustainable, and safe communities.
Inspired by image generation, deep urban planning aims to leverage deep learning to generate land-use configurations.
This paper studies a novel deep human-guided urban planning method to jointly address these challenges.
arXiv Detail & Related papers (2021-10-12T15:45:38Z)
- Methodological Foundation of a Numerical Taxonomy of Urban Form [62.997667081978825]
We present a method for numerical taxonomy of urban form derived from biological systematics.
We derive homogeneous urban tissue types and, by determining overall morphological similarity between them, generate a hierarchical classification of urban form.
After framing and presenting the method, we test it on two cities - Prague and Amsterdam.
arXiv Detail & Related papers (2021-04-30T12:47:52Z)
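To make the hierarchical-classification step in the last entry above concrete, here is a toy sketch: urban tissue cells described by a few morphometric characters are standardized, clustered with Ward linkage, and the resulting tree is cut into discrete tissue types. The feature names and random data are assumptions made purely for illustration; the cited paper derives its characters from real building and street geometry.

```python
# Toy illustration (not the cited paper's implementation) of deriving urban
# tissue types by hierarchical clustering of morphometric characters.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.stats import zscore

rng = np.random.default_rng(0)
# Rows = urban tissue cells; columns = hypothetical morphometric characters,
# e.g. building coverage ratio, mean building height (m), street width (m).
features = rng.random((200, 3)) * np.array([0.8, 40.0, 25.0])

# Standardize so characters measured in different units contribute comparably.
X = zscore(features, axis=0)

# Ward linkage builds the hierarchy of overall morphological similarity.
Z = linkage(X, method="ward")

# Cutting the tree yields a flat set of tissue types at a chosen granularity.
tissue_types = fcluster(Z, t=8, criterion="maxclust")
print(np.bincount(tissue_types)[1:])  # number of cells per derived type
```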