A workflow for generating synthetic LiDAR datasets in simulation environments
- URL: http://arxiv.org/abs/2506.17378v1
- Date: Fri, 20 Jun 2025 17:56:15 GMT
- Title: A workflow for generating synthetic LiDAR datasets in simulation environments
- Authors: Abhishek Phadke, Shakib Mahmud Dipto, Pratip Rana,
- Abstract summary: This paper presents a simulation workflow for generating synthetic LiDAR datasets to support autonomous vehicle perception, robotics research, and sensor security analysis. We integrate time-of-flight LiDAR, image sensors, and two-dimensional scanners onto a simulated vehicle platform operating within an urban scenario. The study examines potential security vulnerabilities in LiDAR data, such as adversarial point injection and spoofing attacks, and demonstrates how synthetic datasets can facilitate the evaluation of defense strategies.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a simulation workflow for generating synthetic LiDAR datasets to support autonomous vehicle perception, robotics research, and sensor security analysis. Leveraging the CoppeliaSim simulation environment and its Python API, we integrate time-of-flight LiDAR, image sensors, and two-dimensional scanners onto a simulated vehicle platform operating within an urban scenario. The workflow automates data capture, storage, and annotation across multiple formats (PCD, PLY, CSV), producing synchronized multimodal datasets with ground truth pose information. We validate the pipeline by generating large-scale point clouds and corresponding RGB and depth imagery. The study examines potential security vulnerabilities in LiDAR data, such as adversarial point injection and spoofing attacks, and demonstrates how synthetic datasets can facilitate the evaluation of defense strategies. Finally, limitations related to environmental realism, sensor noise modeling, and computational scalability are discussed, and future research directions, such as incorporating weather effects, real-world terrain models, and advanced scanner configurations, are proposed. The workflow provides a versatile, reproducible framework for generating high-fidelity synthetic LiDAR datasets to advance perception research and strengthen sensor security in autonomous systems. Documentation and examples accompany this framework; samples of animated cloud returns and image sensor data can be found at this link.
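The abstract describes a Python-driven capture loop in CoppeliaSim that steps the simulation, records ground-truth pose, and writes point clouds in formats such as PCD. The sketch below illustrates the general shape such a loop could take using CoppeliaSim's ZMQ remote API; it is a minimal illustration under assumed names, not the authors' released pipeline. The scene path `/Vehicle`, the `get_lidar_points` stub, and the frame count are assumptions, and how LiDAR returns are exposed depends on the sensor model used in the scene.

```python
# Minimal sketch (assumed names, not the authors' code) of a CoppeliaSim
# capture loop via the ZMQ remote API: step the simulation, log the vehicle's
# ground-truth pose, and write each LiDAR frame as an ASCII .PCD file.
import csv

from coppeliasim_zmqremoteapi_client import RemoteAPIClient


def write_pcd(path, points):
    """Write an iterable of (x, y, z) tuples as an ASCII PCD v0.7 file."""
    pts = list(points)
    with open(path, "w") as f:
        f.write("# .PCD v0.7 - Point Cloud Data file format\n")
        f.write("VERSION 0.7\nFIELDS x y z\nSIZE 4 4 4\nTYPE F F F\nCOUNT 1 1 1\n")
        f.write(f"WIDTH {len(pts)}\nHEIGHT 1\nVIEWPOINT 0 0 0 1 0 0 0\n")
        f.write(f"POINTS {len(pts)}\nDATA ascii\n")
        f.writelines(f"{x} {y} {z}\n" for x, y, z in pts)


def get_lidar_points(sim):
    # Placeholder: how LiDAR returns are exposed depends on the sensor model
    # in the scene (e.g. a child script publishing a signal); replace this
    # with whatever mechanism your scene provides.
    return []


client = RemoteAPIClient()           # connects to a running CoppeliaSim instance
sim = client.require('sim')
vehicle = sim.getObject('/Vehicle')  # assumed scene object path

sim.setStepping(True)                # manual stepping (CoppeliaSim 4.5+ API)
sim.startSimulation()
with open("poses.csv", "w", newline="") as f:
    log = csv.writer(f)
    log.writerow(["frame", "x", "y", "z", "qx", "qy", "qz", "qw"])
    for frame in range(100):
        sim.step()
        pos = sim.getObjectPosition(vehicle, -1)     # ground-truth position
        quat = sim.getObjectQuaternion(vehicle, -1)  # ground-truth orientation
        log.writerow([frame, *pos, *quat])
        write_pcd(f"frame_{frame:05d}.pcd", get_lidar_points(sim))
sim.stopSimulation()
```

The same loop extends naturally to the PLY/CSV outputs and RGB/depth imagery mentioned in the abstract; for large clouds, binary PCD (`DATA binary`) avoids the size and precision costs of ASCII.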
Related papers
- How Real is CARLA's Dynamic Vision Sensor? A Study on the Sim-to-Real Gap in Traffic Object Detection [0.0]
Event cameras are well-suited for real-time object detection at traffic intersections. The development of robust event-based detection models is hindered by the limited availability of annotated real-world datasets. This study offers the first quantifiable analysis of the sim-to-real gap in event-based object detection using CARLA's DVS.
arXiv Detail & Related papers (2025-06-16T17:27:43Z)
- Evaluating the Impact of Synthetic Data on Object Detection Tasks in Autonomous Driving [0.0]
We compare 2D and 3D object detection tasks trained on real, synthetic, and mixed datasets. Our findings demonstrate that combining real and synthetic data improves the robustness and generalization of object detection models.
arXiv Detail & Related papers (2025-03-12T20:13:33Z)
- Synth It Like KITTI: Synthetic Data Generation for Object Detection in Driving Scenarios [3.30184292168618]
We propose a dataset generation pipeline based on the CARLA simulator for 3D object detection on LiDAR point clouds. We are able to train an object detector on the synthetic data and demonstrate strong generalization capabilities to the KITTI dataset.
arXiv Detail & Related papers (2025-02-20T22:27:42Z)
- From Gaming to Research: GTA V for Synthetic Data Generation for Robotics and Navigations [2.7383830691749163]
We introduce a synthetic dataset created using the virtual environment of the video game Grand Theft Auto V (GTA V). We demonstrate that synthetic data derived from GTA V are qualitatively comparable to real-world data.
arXiv Detail & Related papers (2025-02-17T20:22:52Z)
- XLD: A Cross-Lane Dataset for Benchmarking Novel Driving View Synthesis [84.23233209017192]
This paper presents a synthetic dataset for novel driving view synthesis evaluation. It includes testing images captured by deviating from the training trajectory by 1-4 meters. We establish the first realistic benchmark for evaluating existing NVS approaches under front-only and multi-camera settings.
arXiv Detail & Related papers (2024-06-26T14:00:21Z)
- SimGen: Simulator-conditioned Driving Scene Generation [50.03358485083602]
We introduce a simulator-conditioned scene generation framework called SimGen. SimGen learns to generate diverse driving scenes by mixing data from the simulator and the real world. It achieves superior generation quality and diversity while preserving controllability based on the text prompt and the layout pulled from a simulator.
arXiv Detail & Related papers (2024-06-13T17:58:32Z)
- Multimodal Dataset from Harsh Sub-Terranean Environment with Aerosol Particles for Frontier Exploration [55.41644538483948]
This paper introduces a multimodal dataset from a harsh and unstructured underground environment with aerosol particles.
It contains synchronized raw data measurements from all onboard sensors in Robot Operating System (ROS) format.
The focus of this paper is not only to capture temporal and spatial data diversity but also to present the impact of harsh conditions on the captured data.
arXiv Detail & Related papers (2023-04-27T20:21:18Z)
- Learning to Simulate Realistic LiDARs [66.7519667383175]
We introduce a pipeline for data-driven simulation of a realistic LiDAR sensor.
We show that our model can learn to encode realistic effects such as dropped points on transparent surfaces.
We use our technique to learn models of two distinct LiDAR sensors and use them to improve simulated LiDAR data accordingly.
arXiv Detail & Related papers (2022-09-22T13:12:54Z)
- TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain gaps present in simulated datasets.
We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z)
- CARLA-GeAR: a Dataset Generator for a Systematic Evaluation of Adversarial Robustness of Vision Models [61.68061613161187]
This paper presents CARLA-GeAR, a tool for the automatic generation of synthetic datasets for evaluating the robustness of neural models against physical adversarial patches.
The tool is built on the CARLA simulator, using its Python API, and allows the generation of datasets for several vision tasks in the context of autonomous driving.
The paper presents an experimental study to evaluate the performance of some defense methods against such attacks, showing how the datasets generated with CARLA-GeAR might be used in future work as a benchmark for adversarial defense in the real world.
arXiv Detail & Related papers (2022-06-09T09:17:38Z)
- VISTA 2.0: An Open, Data-driven Simulator for Multimodal Sensing and Policy Learning for Autonomous Vehicles [131.2240621036954]
We present VISTA, an open source, data-driven simulator that integrates multiple types of sensors for autonomous vehicles.
Using high fidelity, real-world datasets, VISTA represents and simulates RGB cameras, 3D LiDAR, and event-based cameras.
We demonstrate the ability to train and test perception-to-control policies across each of the sensor types and showcase the power of this approach via deployment on a full-scale autonomous vehicle.
arXiv Detail & Related papers (2021-11-23T18:58:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.