Related papers: How Real is CARLAs Dynamic Vision Sensor? A Study on the Sim-to-Real Gap in Traffic Object Detection

How Real is CARLAs Dynamic Vision Sensor? A Study on the Sim-to-Real Gap in Traffic Object Detection

URL: http://arxiv.org/abs/2506.13722v1
Date: Mon, 16 Jun 2025 17:27:43 GMT
Title: How Real is CARLAs Dynamic Vision Sensor? A Study on the Sim-to-Real Gap in Traffic Object Detection
Authors: Kaiyuan Tan, Pavan Kumar B N, Bharatesh Chakravarthi,
Abstract summary: Event cameras are well-suited for real-time object detection at traffic intersections.<n>The development of robust event-based detection models is hindered by the limited availability of annotated real-world datasets.<n>This study offers the first quantifiable analysis of the sim-to-real gap in event-based object detection using CARLAs DVS.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Event cameras are gaining traction in traffic monitoring applications due to their low latency, high temporal resolution, and energy efficiency, which makes them well-suited for real-time object detection at traffic intersections. However, the development of robust event-based detection models is hindered by the limited availability of annotated real-world datasets. To address this, several simulation tools have been developed to generate synthetic event data. Among these, the CARLA driving simulator includes a built-in dynamic vision sensor (DVS) module that emulates event camera output. Despite its potential, the sim-to-real gap for event-based object detection remains insufficiently studied. In this work, we present a systematic evaluation of this gap by training a recurrent vision transformer model exclusively on synthetic data generated using CARLAs DVS and testing it on varying combinations of synthetic and real-world event streams. Our experiments show that models trained solely on synthetic data perform well on synthetic-heavy test sets but suffer significant performance degradation as the proportion of real-world data increases. In contrast, models trained on real-world data demonstrate stronger generalization across domains. This study offers the first quantifiable analysis of the sim-to-real gap in event-based object detection using CARLAs DVS. Our findings highlight limitations in current DVS simulation fidelity and underscore the need for improved domain adaptation techniques in neuromorphic vision for traffic monitoring.

Related papers

Unraveling the Effects of Synthetic Data on End-to-End Autonomous Driving [35.49042205415498]
We introduce SceneCrafter, a realistic, interactive, and efficient autonomous driving simulator based on 3D Gaussian Splatting (3DGS)<n>SceneCrafter efficiently generates realistic driving logs across diverse traffic scenarios.<n>It also enables robust closed-loop evaluation of end-to-end models.
arXiv Detail & Related papers (2025-03-23T15:27:43Z)
Synth It Like KITTI: Synthetic Data Generation for Object Detection in Driving Scenarios [3.30184292168618]
We propose a dataset generation pipeline based on the CARLA simulator for 3D object detection on LiDAR point clouds.<n>We are able to train an object detector on the synthetic data and demonstrate strong generalization capabilities to the KITTI dataset.
arXiv Detail & Related papers (2025-02-20T22:27:42Z)
RaSim: A Range-aware High-fidelity RGB-D Data Simulation Pipeline for Real-world Applications [55.24463002889]
We focus on depth data synthesis and develop a range-aware RGB-D data simulation pipeline (RaSim) In particular, high-fidelity depth data is generated by imitating the imaging principle of real-world sensors. RaSim can be directly applied to real-world scenarios without any finetuning and excel at downstream RGB-D perception tasks.
arXiv Detail & Related papers (2024-04-05T08:52:32Z)
Are NeRFs ready for autonomous driving? Towards closing the real-to-simulation gap [6.393953433174051]
We propose a novel perspective for addressing the real-to-simulated data gap. We conduct the first large-scale investigation into the real-to-simulated data gap in an autonomous driving setting. Our results show notable improvements in model robustness to simulated data, even improving real-world performance in some cases.
arXiv Detail & Related papers (2024-03-24T11:09:41Z)
Augmented Reality based Simulated Data (ARSim) with multi-view consistency for AV perception networks [47.07188762367792]
We present ARSim, a framework designed to enhance real multi-view image data with 3D synthetic objects of interest. We construct a simplified virtual scene using real data and strategically place 3D synthetic assets within it. The resulting augmented multi-view consistent dataset is used to train a multi-camera perception network for autonomous vehicles.
arXiv Detail & Related papers (2024-03-22T17:49:11Z)
Data-driven Camera and Lidar Simulation Models for Autonomous Driving: A Review from Generative Models to Volume Renderers [7.90336803821407]
This paper reviews the current state-of-the-art data-driven camera and Lidar simulation models and their evaluation methods.<n>It explores a spectrum of models from the novel perspective of generative models and volumes.
arXiv Detail & Related papers (2024-01-29T16:56:17Z)
Reinforcement Learning with Human Feedback for Realistic Traffic Simulation [53.85002640149283]
Key element of effective simulation is the incorporation of realistic traffic models that align with human knowledge. This study identifies two main challenges: capturing the nuances of human preferences on realism and the unification of diverse traffic simulation models.
arXiv Detail & Related papers (2023-09-01T19:29:53Z)
Quantifying the LiDAR Sim-to-Real Domain Shift: A Detailed Investigation Using Object Detectors and Analyzing Point Clouds at Target-Level [1.1999555634662635]
LiDAR object detection algorithms based on neural networks for autonomous driving require large amounts of data for training, validation, and testing. We show that using simulated data for the training of neural networks leads to a domain shift of training and testing data due to differences in scenes, scenarios, and distributions.
arXiv Detail & Related papers (2023-03-03T12:52:01Z)
Learning to Simulate Realistic LiDARs [66.7519667383175]
We introduce a pipeline for data-driven simulation of a realistic LiDAR sensor. We show that our model can learn to encode realistic effects such as dropped points on transparent surfaces. We use our technique to learn models of two distinct LiDAR sensors and use them to improve simulated LiDAR data accordingly.
arXiv Detail & Related papers (2022-09-22T13:12:54Z)
Towards Scale Consistent Monocular Visual Odometry by Learning from the Virtual World [83.36195426897768]
We propose VRVO, a novel framework for retrieving the absolute scale from virtual data. We first train a scale-aware disparity network using both monocular real images and stereo virtual data. The resulting scale-consistent disparities are then integrated with a direct VO system.
arXiv Detail & Related papers (2022-03-11T01:51:54Z)
Cycle and Semantic Consistent Adversarial Domain Adaptation for Reducing Simulation-to-Real Domain Shift in LiDAR Bird's Eye View [110.83289076967895]
We present a BEV domain adaptation method based on CycleGAN that uses prior semantic classification in order to preserve the information of small objects of interest during the domain adaptation process. The quality of the generated BEVs has been evaluated using a state-of-the-art 3D object detection framework at KITTI 3D Object Detection Benchmark.
arXiv Detail & Related papers (2021-04-22T12:47:37Z)

This list is automatically generated from the titles and abstracts of the papers in this site.