Sat2RealCity: Geometry-Aware and Appearance-Controllable 3D Urban Generation from Satellite Imagery
- URL: http://arxiv.org/abs/2511.11470v1
- Date: Fri, 14 Nov 2025 16:42:03 GMT
- Title: Sat2RealCity: Geometry-Aware and Appearance-Controllable 3D Urban Generation from Satellite Imagery
- Authors: Yijie Kang, Xinliang Wang, Zhenyu Wu, Yifeng Shi, Hailong Zhu,
- Abstract summary: Sat2RealCity is a geometry-aware and appearance-controllable framework for 3D urban generation from real-world satellite imagery. We introduce the OSM-based spatial priors strategy to achieve interpretable geometric generation from spatial topology to building instances. We construct an MLLM-powered semantic-guided generation pipeline, bridging semantic interpretation and geometric reconstruction.
- Score: 12.88788681361607
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in generative modeling have substantially enhanced 3D urban generation, enabling applications in digital twins, virtual cities, and large-scale simulations. However, existing methods face two key challenges: (1) the need for large-scale 3D city assets for supervised training, which are difficult and costly to obtain, and (2) reliance on semantic or height maps, which are used exclusively for generating buildings in virtual worlds and lack connection to real-world appearance, limiting the realism and generalizability of generated cities. To address these limitations, we propose Sat2RealCity, a geometry-aware and appearance-controllable framework for 3D urban generation from real-world satellite imagery. Unlike previous city-level generation methods, Sat2RealCity builds generation upon individual building entities, enabling the use of rich priors and pretrained knowledge from 3D object generation while substantially reducing dependence on large-scale 3D city assets. Specifically, (1) we introduce the OSM-based spatial priors strategy to achieve interpretable geometric generation from spatial topology to building instances; (2) we design an appearance-guided controllable modeling mechanism for fine-grained appearance realism and style control; and (3) we construct an MLLM-powered semantic-guided generation pipeline, bridging semantic interpretation and geometric reconstruction. Extensive quantitative and qualitative experiments demonstrate that Sat2RealCity significantly surpasses existing baselines in structural consistency and appearance realism, establishing a strong foundation for real-world aligned 3D urban content creation. The code will be released soon.
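The abstract's "OSM-based spatial priors" idea presumably starts from per-building footprints taken from OpenStreetMap. As a rough illustration of how a 2D footprint becomes a coarse per-building 3D prior (a hypothetical sketch, not the authors' implementation), a footprint polygon with a known or estimated height can be extruded into a prism mesh:

```python
def extrude_footprint(footprint, height):
    """Lift a 2D footprint polygon into a simple 3D prism mesh.

    footprint: list of (x, y) vertices in counter-clockwise order
    height: building height in metres (e.g. from an OSM tag or estimated
            from satellite imagery)
    Returns (vertices, faces), where each face indexes into vertices.
    """
    n = len(footprint)
    # Bottom ring of vertices followed by the top ring.
    vertices = [(x, y, 0.0) for x, y in footprint] + \
               [(x, y, height) for x, y in footprint]
    faces = []
    # Each side wall is a quad split into two triangles.
    for i in range(n):
        j = (i + 1) % n
        faces.append((i, j, n + j))
        faces.append((i, n + j, n + i))
    # Fan-triangulated caps (valid for convex footprints).
    for i in range(1, n - 1):
        faces.append((0, i + 1, i))          # bottom cap, facing down
        faces.append((n, n + i, n + i + 1))  # top cap, facing up
    return vertices, faces

# Example: a 10 m x 6 m rectangular footprint extruded to 12 m.
verts, faces = extrude_footprint([(0, 0), (10, 0), (10, 6), (0, 6)], 12.0)
print(len(verts), len(faces))  # 8 vertices, 12 triangles
```

Such a prism would serve only as a geometric prior; per the abstract, the framework then refines each building instance with pretrained 3D object generation and appearance guidance.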
Related papers
- Imagine a City: CityGenAgent for Procedural 3D City Generation [22.929582644377277]
We introduce CityGenAgent, a natural language-driven framework for hierarchical procedural generation of high-quality 3D cities. Our approach decomposes city generation into two interpretable components, Block Program and Building Program. Benefiting from the programs and the models' generalization, CityGenAgent supports natural language editing and manipulation.
arXiv Detail & Related papers (2026-02-05T06:36:03Z) - Yo'City: Personalized and Boundless 3D Realistic City Scene Generation via Self-Critic Expansion [28.00050174055204]
Yo'City is a novel agentic framework that enables user-customized and infinitely expandable 3D city generation. To simulate continuous city evolution, Yo'City introduces a user-interactive, relationship-guided expansion mechanism.
arXiv Detail & Related papers (2025-11-24T04:02:48Z) - WorldGrow: Generating Infinite 3D World [75.81531067447203]
We tackle the challenge of generating the infinitely extendable 3D world -- large, continuous environments with coherent geometry and realistic appearance. We propose WorldGrow, a hierarchical framework for unbounded 3D scene synthesis. Our method features three core components: (1) a data curation pipeline that extracts high-quality scene blocks for training, making the 3D structured latent representations suitable for scene generation; (2) a 3D block inpainting mechanism that enables context-aware scene extension; and (3) a coarse-to-fine generation strategy that ensures both global layout plausibility and local geometric/textural fidelity.
arXiv Detail & Related papers (2025-10-24T17:39:52Z) - Sat2City: 3D City Generation from A Single Satellite Image with Cascaded Latent Diffusion [18.943643720564996]
Sat2City is a novel framework that synergizes the representational capacity of sparse voxel grids with latent diffusion models. We introduce a dataset of synthesized large-scale 3D cities paired with satellite-view height maps. Our framework generates detailed 3D structures from a single satellite image, achieving superior fidelity compared to existing city generation models.
arXiv Detail & Related papers (2025-07-06T14:30:08Z) - REArtGS: Reconstructing and Generating Articulated Objects via 3D Gaussian Splatting with Geometric and Motion Constraints [47.82928111264676]
REArtGS is a novel framework that introduces additional geometric and motion constraints to 3D Gaussian primitives. It achieves high-fidelity textured surface reconstruction for given states, and enables high-fidelity surface generation for unseen states.
arXiv Detail & Related papers (2025-03-09T16:05:36Z) - Compositional Generative Model of Unbounded 4D Cities [56.36624718397362]
We propose a compositional generative model specifically tailored for generating 4D cities. CityDreamer4D supports a range of downstream applications, such as instance editing, city stylization, and urban simulation.
arXiv Detail & Related papers (2025-01-15T17:59:56Z) - Proc-GS: Procedural Building Generation for City Assembly with 3D Gaussians [65.09942210464747]
Building asset creation is labor-intensive and requires specialized skills to develop design rules. Recent generative models for building creation often overlook these patterns, leading to low visual fidelity and limited scalability. By manipulating procedural code, we can streamline this process and generate an infinite variety of buildings.
arXiv Detail & Related papers (2024-12-10T16:45:32Z) - CityX: Controllable Procedural Content Generation for Unbounded 3D Cities [50.10101235281943]
Current generative methods fall short in either diversity, controllability, or fidelity. In this work, we resort to the procedural content generation (PCG) technique for high-fidelity generation. We develop a multi-agent framework to transform multi-modal instructions, including OSM, semantic maps, and satellite images, into executable programs. Our method, named CityX, demonstrates its superiority in creating diverse, controllable, and realistic 3D urban scenes.
arXiv Detail & Related papers (2024-07-24T18:05:13Z) - Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion [77.34078223594686]
We propose a novel architecture for direct 3D scene generation by introducing diffusion models into 3D sparse representations and combining them with neural rendering techniques.
Specifically, our approach generates texture colors at the point level for a given geometry using a 3D diffusion model first, which is then transformed into a scene representation in a feed-forward manner.
Experiments in two city-scale datasets show that our model demonstrates proficiency in generating photo-realistic street-view image sequences and cross-view urban scenes from satellite imagery.
arXiv Detail & Related papers (2024-01-19T16:15:37Z) - CityDreamer: Compositional Generative Model of Unbounded 3D Cities [44.203932215464214]
CityDreamer is a compositional generative model designed specifically for unbounded 3D cities.
We adopt the bird's eye view scene representation and employ a volumetric renderer for both instance-oriented and stuff-oriented neural fields.
CityDreamer achieves state-of-the-art performance not only in generating realistic 3D cities but also in localized editing within the generated cities.
arXiv Detail & Related papers (2023-09-01T17:57:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.