TrueCity: Real and Simulated Urban Data for Cross-Domain 3D Scene Understanding
- URL: http://arxiv.org/abs/2511.07007v1
- Date: Mon, 10 Nov 2025 11:57:50 GMT
- Title: TrueCity: Real and Simulated Urban Data for Cross-Domain 3D Scene Understanding
- Authors: Duc Nguyen, Yan-Ling Lai, Qilin Zhang, Prabin Gyawali, Benedikt Schwab, Olaf Wysocki, Thomas H. Kolbe
- Abstract summary: 3D semantic scene understanding remains a long-standing challenge in the 3D computer vision community. We introduce TrueCity, the first urban semantic segmentation benchmark with cm-accurate annotated real-world point clouds, semantic 3D city models, and annotated simulated point clouds representing the same city.
- Score: 12.573182815543978
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: 3D semantic scene understanding remains a long-standing challenge in the 3D computer vision community. One of the key issues pertains to limited real-world annotated data to facilitate generalizable models. The common practice to tackle this issue is to simulate new data. Although synthetic datasets offer scalability and perfect labels, their designer-crafted scenes fail to capture real-world complexity and sensor noise, resulting in a synthetic-to-real domain gap. Moreover, no benchmark provides synchronized real and simulated point clouds for segmentation-oriented domain shift analysis. We introduce TrueCity, the first urban semantic segmentation benchmark with cm-accurate annotated real-world point clouds, semantic 3D city models, and annotated simulated point clouds representing the same city. TrueCity proposes segmentation classes aligned with international 3D city modeling standards, enabling consistent evaluation of the synthetic-to-real gap. Our extensive experiments on common baselines quantify domain shift and highlight strategies for exploiting synthetic data to enhance real-world 3D scene understanding. We are convinced that the TrueCity dataset will foster further development of sim-to-real gap quantification and enable generalizable data-driven models. The data, code, and 3D models are available online: https://tum-gis.github.io/TrueCity/
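The abstract describes quantifying the synthetic-to-real domain shift on shared segmentation classes. A common way to express such a gap is the drop in mean IoU between a model evaluated on simulated data and the same model evaluated on real data. The sketch below is a hypothetical helper (not from the TrueCity codebase) that computes per-class IoU from confusion matrices and the resulting mIoU gap:

```python
import numpy as np

def per_class_iou(conf: np.ndarray) -> np.ndarray:
    """Per-class IoU from a square confusion matrix (rows: ground truth, cols: prediction)."""
    tp = np.diag(conf).astype(float)          # true positives per class
    fp = conf.sum(axis=0) - tp                # predicted as class, but wrong
    fn = conf.sum(axis=1) - tp                # missed instances of class
    denom = tp + fp + fn
    return np.where(denom > 0, tp / denom, 0.0)

def sim_to_real_gap(conf_sim: np.ndarray, conf_real: np.ndarray) -> float:
    """Domain gap as the mIoU drop when moving from simulated to real test data."""
    return float(per_class_iou(conf_sim).mean() - per_class_iou(conf_real).mean())

# Toy 2-class example: the model confuses classes more often on real scans.
conf_sim = np.array([[9, 1], [1, 9]])
conf_real = np.array([[7, 3], [3, 7]])
print(sim_to_real_gap(conf_sim, conf_real))   # positive value = performance drop on real data
```

A positive gap indicates the model performs worse on real point clouds, which is the effect the benchmark's synchronized real/simulated data is designed to measure per class.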
Related papers
- UrbanTwin: Synthetic LiDAR Datasets (LUMPI, V2X-Real-IC, and TUMTraf-I) [3.1508266388327324]
UrbanTwin datasets are high-fidelity, realistic replicas of three public roadside LiDAR datasets. Each UrbanTwin dataset contains 10K frames corresponding to one of the public datasets.
arXiv Detail & Related papers (2025-09-08T15:06:02Z) - Stable-Sim2Real: Exploring Simulation of Real-Captured 3D Data with Two-Stage Depth Diffusion [16.720863475636328]
3D data simulation aims to bridge the gap between simulated and real-captured 3D data. Most 3D data simulation methods inject predefined physical priors but struggle to capture the full complexity of real data. This work explores a new solution path, called Stable-Sim2Real, based on a novel two-stage depth diffusion model.
arXiv Detail & Related papers (2025-07-31T12:08:16Z) - Topology-Aware Modeling for Unsupervised Simulation-to-Reality Point Cloud Recognition [63.55828203989405]
We introduce a novel Topology-Aware Modeling (TAM) framework for Sim2Real UDA on object point clouds. Our approach mitigates the domain gap by leveraging global spatial topology, characterized by low-level, high-frequency 3D structures. We propose an advanced self-training strategy that combines cross-domain contrastive learning with self-training.
arXiv Detail & Related papers (2025-06-26T11:53:59Z) - Towards Generating Realistic 3D Semantic Training Data for Autonomous Driving [27.088907562842902]
In autonomous driving, 3D semantic segmentation plays an important role in enabling safe navigation. The complexity of collecting and annotating 3D data is a bottleneck in these developments. We propose a novel approach able to generate 3D semantic scene-scale data without relying on any projection or decoupled trained multi-resolution models.
arXiv Detail & Related papers (2025-03-27T12:41:42Z) - SimVS: Simulating World Inconsistencies for Robust View Synthesis [102.83898965828621]
We present an approach for leveraging generative video models to simulate the inconsistencies in the world that can occur during capture. We demonstrate that our world-simulation strategy significantly outperforms traditional augmentation methods in handling real-world scene variations.
arXiv Detail & Related papers (2024-12-10T17:35:12Z) - BelHouse3D: A Benchmark Dataset for Assessing Occlusion Robustness in 3D Point Cloud Semantic Segmentation [2.446672595462589]
We introduce the BelHouse3D dataset, a new synthetic point cloud dataset designed for 3D indoor scene semantic segmentation.
This dataset is constructed using real-world references from 32 houses in Belgium, ensuring that the synthetic data closely aligns with real-world conditions.
arXiv Detail & Related papers (2024-11-20T12:09:43Z) - Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments robustly display our method's consistent superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z) - HandBooster: Boosting 3D Hand-Mesh Reconstruction by Conditional Synthesis and Sampling of Hand-Object Interactions [68.28684509445529]
We present HandBooster, a new approach to uplift the data diversity and boost the 3D hand-mesh reconstruction performance.
First, we construct versatile content-aware conditions to guide a diffusion model to produce realistic images with diverse hand appearances, poses, views, and backgrounds.
Then, we design a novel condition creator based on our similarity-aware distribution sampling strategies to deliberately find novel and realistic interaction poses that are distinctive from the training set.
arXiv Detail & Related papers (2024-03-27T13:56:08Z) - All for One, and One for All: UrbanSyn Dataset, the third Musketeer of Synthetic Driving Scenes [6.958641426737163]
UrbanSyn is a dataset acquired through semi-procedurally generated synthetic urban driving scenarios. It provides pixel-level ground truth, including depth, semantic segmentation, and instance segmentation. We make UrbanSyn openly and freely accessible (www.urbansyn.org).
arXiv Detail & Related papers (2023-12-19T14:09:12Z) - Towards 3D Scene Understanding by Referring Synthetic Models [65.74211112607315]
Methods typically rely on extensive annotations of real scene scans. We explore how synthetic models can help by mapping real scene categories and synthetic features into a unified feature space. Experiments show that our method achieves an average mAP of 46.08% on ScanNet and 55.49% on S3DIS by learning from synthetic datasets.
arXiv Detail & Related papers (2022-03-20T13:06:15Z) - SensatUrban: Learning Semantics from Urban-Scale Photogrammetric Point Clouds [52.624157840253204]
We introduce SensatUrban, an urban-scale UAV photogrammetry point cloud dataset consisting of nearly three billion points collected from three UK cities, covering 7.6 km².
Each point in the dataset has been labelled with fine-grained semantic annotations, resulting in a dataset three times the size of the previously largest photogrammetric point cloud dataset.
arXiv Detail & Related papers (2022-01-12T14:48:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.