UrbanTwin: Building High-Fidelity Digital Twins for Sim2Real LiDAR Perception and Evaluation
- URL: http://arxiv.org/abs/2509.02903v2
- Date: Tue, 14 Oct 2025 16:28:16 GMT
- Title: UrbanTwin: Building High-Fidelity Digital Twins for Sim2Real LiDAR Perception and Evaluation
- Authors: Muhammad Shahbaz, Shaurya Agarwal
- Abstract summary: This tutorial introduces a reproducible workflow for building high-fidelity digital twins (HiFi DTs) to generate realistic synthetic datasets. We outline practical steps for modeling static geometry, road infrastructure, and dynamic traffic using open-source resources such as satellite imagery, OpenStreetMap, and sensor specifications. The resulting environments support scalable and cost-effective data generation for robust Sim2Real learning.
- Score: 3.1508266388327324
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: LiDAR-based perception in intelligent transportation systems (ITS) relies on deep neural networks trained with large-scale labeled datasets. However, creating such datasets is expensive, time-consuming, and labor-intensive, limiting the scalability of perception systems. Sim2Real learning offers a scalable alternative, but its success depends on the simulation's fidelity to real-world environments, dynamics, and sensors. This tutorial introduces a reproducible workflow for building high-fidelity digital twins (HiFi DTs) to generate realistic synthetic datasets. We outline practical steps for modeling static geometry, road infrastructure, and dynamic traffic using open-source resources such as satellite imagery, OpenStreetMap, and sensor specifications. The resulting environments support scalable and cost-effective data generation for robust Sim2Real learning. Using this workflow, we have released three synthetic LiDAR datasets, namely UT-LUMPI, UT-V2X-Real, and UT-TUMTraf-I, which closely replicate real locations and outperform real-data-trained baselines in perception tasks. This guide enables broader adoption of HiFi DTs in ITS research and deployment.
Related papers
- UrbanTwin: Synthetic LiDAR Datasets (LUMPI, V2X-Real-IC, and TUMTraf-I) [3.1508266388327324]
UrbanTwin datasets are high-fidelity, realistic replicas of three public roadside lidar datasets. Each UrbanTwin dataset contains 10K frames corresponding to one of the public datasets.
arXiv Detail & Related papers (2025-09-08T15:06:02Z)
- Simulation Priors for Data-Efficient Deep Learning [56.525770511247934]
SimPEL is a method that efficiently combines first-principles models with data-driven learning. We evaluate SimPEL on diverse systems, including biological, agricultural, and robotic domains. For decision-making, we demonstrate that SimPEL bridges the sim-to-real gap in model-based reinforcement learning.
arXiv Detail & Related papers (2025-09-06T14:36:41Z)
- High-Fidelity Digital Twins for Bridging the Sim2Real Gap in LiDAR-Based ITS Perception [3.1508266388327324]
This paper proposes a high-fidelity digital twin (HiFi DT) framework that incorporates real-world background geometry, lane-level road topology, and sensor-specific specifications and placement. Experiments show that the DT-trained model outperforms the equivalent model trained on real data by 4.8%.
arXiv Detail & Related papers (2025-09-03T00:12:58Z)
- How to Bridge the Sim-to-Real Gap in Digital Twin-Aided Telecommunication Networks [30.858857240474077]
Training effective artificial intelligence models for telecommunications is challenging due to the scarcity of deployment-specific data. Real data collection is expensive, and available datasets often fail to capture the unique operational conditions and contextual variability of the network environment. Digital twinning provides a potential solution to this problem, as simulators tailored to the current network deployment can generate site-specific data to augment the available training datasets.
arXiv Detail & Related papers (2025-07-09T17:27:51Z)
- A workflow for generating synthetic LiDAR datasets in simulation environments [0.0]
This paper presents a simulation workflow for generating synthetic LiDAR datasets to support autonomous vehicle perception, robotics research, and sensor security analysis. We integrate time-of-flight LiDAR, image sensors, and two-dimensional scanners onto a simulated vehicle platform operating within an urban scenario. The study examines potential security vulnerabilities in LiDAR data, such as adversarial point injection and spoofing attacks, and demonstrates how synthetic datasets can facilitate the evaluation of defense strategies.
arXiv Detail & Related papers (2025-06-20T17:56:15Z)
- A large-scale, physically-based synthetic dataset for satellite pose estimation [0.0]
This paper introduces the DLVS3-HST-V1 dataset, which focuses on the Hubble Space Telescope (HST) as a complex, articulated target. The dataset is generated using advanced real-time and offline rendering technologies, integrating high-fidelity 3D models, dynamic lighting, and physically accurate material properties. The pipeline supports the creation of large-scale, richly annotated image sets with ground-truth 6-DoF pose and keypoint data, semantic segmentation, depth, and normal maps.
arXiv Detail & Related papers (2025-06-15T09:24:32Z)
- TUM2TWIN: Introducing the Large-Scale Multimodal Urban Digital Twin Benchmark Dataset [90.97440987655084]
Urban Digital Twins (UDTs) have become essential for managing cities and integrating complex, heterogeneous data from diverse sources. To address these challenges, we introduce the first comprehensive multimodal Urban Digital Twin benchmark dataset: TUM2TWIN. This dataset includes georeferenced, semantically aligned 3D models and networks along with various terrestrial, mobile, aerial, and satellite observations boasting 32 data subsets over roughly 100,000 $m^2$ and currently 767 GB of data.
arXiv Detail & Related papers (2025-05-12T09:48:32Z)
- MBDS: A Multi-Body Dynamics Simulation Dataset for Graph Networks Simulators [4.5353840616537555]
Graph Network Simulators (GNS) have emerged as the leading method for modeling physical phenomena.
We have constructed a high-quality physical simulation dataset encompassing 1D, 2D, and 3D scenes.
A key feature of our dataset is the inclusion of precise multi-body dynamics, facilitating a more realistic simulation of the physical world.
arXiv Detail & Related papers (2024-10-04T03:03:06Z)
- Automatic AI Model Selection for Wireless Systems: Online Learning via Digital Twinning [50.332027356848094]
AI-based applications are deployed at intelligent controllers to carry out functionalities like scheduling or power control.
The mapping between context and AI model parameters is ideally done in a zero-shot fashion.
This paper introduces a general methodology for the online optimization of AMS mappings.
arXiv Detail & Related papers (2024-06-22T11:17:50Z)
- Gaussian Splatting to Real World Flight Navigation Transfer with Liquid Networks [93.38375271826202]
We present a method to improve generalization and robustness to distribution shifts in sim-to-real visual quadrotor navigation tasks.
We first build a simulator by integrating Gaussian splatting with quadrotor flight dynamics, and then train robust navigation policies using Liquid neural networks.
In this way, we obtain a full-stack imitation learning protocol that combines advances in 3D Gaussian splatting radiance field rendering, programming of expert demonstration training data, and the task understanding capabilities of Liquid networks.
arXiv Detail & Related papers (2024-06-21T13:48:37Z)
- RaSim: A Range-aware High-fidelity RGB-D Data Simulation Pipeline for Real-world Applications [55.24463002889]
We focus on depth data synthesis and develop a range-aware RGB-D data simulation pipeline (RaSim).
In particular, high-fidelity depth data is generated by imitating the imaging principle of real-world sensors.
RaSim can be directly applied to real-world scenarios without any finetuning and excels at downstream RGB-D perception tasks.
arXiv Detail & Related papers (2024-04-05T08:52:32Z)
- Reinforcement Learning with Human Feedback for Realistic Traffic Simulation [53.85002640149283]
A key element of effective simulation is the incorporation of realistic traffic models that align with human knowledge.
This study identifies two main challenges: capturing the nuances of human preferences on realism and the unification of diverse traffic simulation models.
arXiv Detail & Related papers (2023-09-01T19:29:53Z)
- The Devil in the Details: Simple and Effective Optical Flow Synthetic Data Generation [19.945859289278534]
We show that the required characteristics in an optical flow dataset are rather simple and present a simpler synthetic data generation method.
With 2D motion-based datasets, we systematically analyze the simplest yet critical factors for generating synthetic datasets.
arXiv Detail & Related papers (2023-08-14T18:01:45Z)
- Learning to Simulate Realistic LiDARs [66.7519667383175]
We introduce a pipeline for data-driven simulation of a realistic LiDAR sensor.
We show that our model can learn to encode realistic effects such as dropped points on transparent surfaces.
We use our technique to learn models of two distinct LiDAR sensors and use them to improve simulated LiDAR data accordingly.
arXiv Detail & Related papers (2022-09-22T13:12:54Z)
- TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain gaps present in simulated datasets.
We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z)
- Towards Scale Consistent Monocular Visual Odometry by Learning from the Virtual World [83.36195426897768]
We propose VRVO, a novel framework for retrieving the absolute scale from virtual data.
We first train a scale-aware disparity network using both monocular real images and stereo virtual data.
The resulting scale-consistent disparities are then integrated with a direct VO system.
arXiv Detail & Related papers (2022-03-11T01:51:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.