Quantifying the LiDAR Sim-to-Real Domain Shift: A Detailed Investigation
Using Object Detectors and Analyzing Point Clouds at Target-Level
- URL: http://arxiv.org/abs/2303.01899v1
- Date: Fri, 3 Mar 2023 12:52:01 GMT
- Title: Quantifying the LiDAR Sim-to-Real Domain Shift: A Detailed Investigation
Using Object Detectors and Analyzing Point Clouds at Target-Level
- Authors: Sebastian Huch, Luca Scalerandi, Esteban Rivera, Markus Lienkamp
- Abstract summary: LiDAR object detection algorithms based on neural networks for autonomous driving require large amounts of data for training, validation, and testing.
We show that using simulated data for the training of neural networks leads to a domain shift of training and testing data due to differences in scenes, scenarios, and distributions.
- Score: 1.1999555634662635
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: LiDAR object detection algorithms based on neural networks for autonomous
driving require large amounts of data for training, validation, and testing. As
real-world data collection and labeling are time-consuming and expensive,
simulation-based synthetic data generation is a viable alternative. However,
using simulated data for the training of neural networks leads to a domain
shift of training and testing data due to differences in scenes, scenarios, and
distributions. In this work, we quantify the sim-to-real domain shift by means
of LiDAR object detectors trained with a new scenario-identical real-world and
simulated dataset. In addition, we answer the questions of how well the
simulated data resembles the real-world data and how well object detectors
trained on simulated data perform on real-world data. Further, we analyze point
clouds at the target-level by comparing real-world and simulated point clouds
within the 3D bounding boxes of the targets. Our experiments show that a
significant sim-to-real domain shift exists even for our scenario-identical
datasets. This domain shift amounts to an average precision reduction of around
14 % for object detectors trained with simulated data. Additional experiments
reveal that this domain shift can be lowered by introducing a simple noise
model in simulation. We further show that a simple downsampling method to model
real-world physics does not influence the performance of the object detectors.
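The abstract mentions a simple noise model, a simple downsampling method, and a target-level comparison inside 3D bounding boxes, but gives no details of any of them. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: Gaussian noise applied along each ray's range, uniform random point dropout as the downsampling step, and a yaw-rotated box test for extracting target points. All function names and parameters (`add_range_noise`, `random_downsample`, `points_in_box`, `sigma`, `keep_ratio`) are hypothetical.

```python
import numpy as np


def add_range_noise(points, sigma=0.02, rng=None):
    """Perturb each point along its ray from the sensor origin.

    Assumption: zero-mean Gaussian noise on the measured range, with the
    sensor at the coordinate origin. The paper's noise model may differ.
    """
    rng = np.random.default_rng() if rng is None else rng
    ranges = np.linalg.norm(points, axis=1, keepdims=True)
    directions = points / np.maximum(ranges, 1e-9)  # unit ray directions
    noisy_ranges = ranges + rng.normal(0.0, sigma, size=ranges.shape)
    return directions * noisy_ranges


def random_downsample(points, keep_ratio=0.9, rng=None):
    """Keep each point independently with probability keep_ratio.

    Assumption: uniform random dropout stands in for the downsampling
    method named in the abstract.
    """
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(len(points)) < keep_ratio
    return points[mask]


def points_in_box(points, center, size, yaw):
    """Return the points inside a 3D box rotated by yaw about the z-axis.

    center: (x, y, z) of the box center; size: (length, width, height).
    """
    shifted = points - np.asarray(center, dtype=float)
    c, s = np.cos(yaw), np.sin(yaw)
    # Project onto the box axes (rotate world coordinates by -yaw).
    local_x = c * shifted[:, 0] + s * shifted[:, 1]
    local_y = -s * shifted[:, 0] + c * shifted[:, 1]
    local = np.stack([local_x, local_y, shifted[:, 2]], axis=1)
    half = np.asarray(size, dtype=float) / 2.0
    inside = np.all(np.abs(local) <= half, axis=1)
    return points[inside]
```

With helpers like these, a target-level comparison could count the points returned by `points_in_box` for the same target in the real and in the simulated scan, before and after applying the noise or downsampling step.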
Related papers
- Exploring the Effectiveness of Dataset Synthesis: An application of Apple Detection in Orchards [68.95806641664713]
We explore the usability of Stable Diffusion 2.1-base for generating synthetic datasets of apple trees for object detection.
We train a YOLOv5m object detection model to predict apples in a real-world apple detection dataset.
Results demonstrate that the model trained on generated data slightly underperforms a baseline model trained on real-world images.
arXiv Detail & Related papers (2023-06-20T09:46:01Z)
- Diffusion Dataset Generation: Towards Closing the Sim2Real Gap for Pedestrian Detection [0.11470070927586014]
We propose a novel method of synthetic data creation meant to close the sim2real gap for the pedestrian detection task.
Our method uses a diffusion-based architecture to learn a real-world distribution which, once trained, is used to generate datasets.
We show that training on a combination of generated and simulated data increases average precision by as much as 27.3% for pedestrian detection models on real-world data.
arXiv Detail & Related papers (2023-05-16T12:33:51Z)
- One-Shot Domain Adaptive and Generalizable Semantic Segmentation with Class-Aware Cross-Domain Transformers [96.51828911883456]
Unsupervised sim-to-real domain adaptation (UDA) for semantic segmentation aims to improve the real-world test performance of a model trained on simulated data.
Traditional UDA often assumes that there are abundant unlabeled real-world data samples available during training for the adaptation.
We explore the one-shot unsupervised sim-to-real domain adaptation (OSUDA) and generalization problem, where only one real-world data sample is available.
arXiv Detail & Related papers (2022-12-14T15:54:15Z)
- PCGen: Point Cloud Generator for LiDAR Simulation [10.692184635629792]
Existing methods generate data that are noisier and more complete than real point clouds.
We propose FPA raycasting and surrogate model raydrop.
With minimal training data, the surrogate model can generalize to different geographies and scenes.
Results show that object detection models trained on simulated data can achieve results similar to those of a model trained on real data.
arXiv Detail & Related papers (2022-10-17T04:13:21Z)
- Learning to Simulate Realistic LiDARs [66.7519667383175]
We introduce a pipeline for data-driven simulation of a realistic LiDAR sensor.
We show that our model can learn to encode realistic effects such as dropped points on transparent surfaces.
We use our technique to learn models of two distinct LiDAR sensors and use them to improve simulated LiDAR data accordingly.
arXiv Detail & Related papers (2022-09-22T13:12:54Z)
- SimuShips -- A High Resolution Simulation Dataset for Ship Detection with Precise Annotations [0.0]
State-of-the-art obstacle detection algorithms are based on convolutional neural networks (CNNs).
SimuShips is a publicly available simulation-based dataset for maritime environments.
arXiv Detail & Related papers (2022-09-22T07:33:31Z)
- TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain-gaps present in simulated datasets.
We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z)
- DiffCloud: Real-to-Sim from Point Clouds with Differentiable Simulation and Rendering of Deformable Objects [18.266002992029716]
Research in manipulation of deformable objects is typically conducted on a limited range of scenarios.
Realistic simulators with support for various types of deformations and interactions have the potential to speed up experimentation.
For highly deformable objects it is challenging to align the output of a simulator with the behavior of real objects.
arXiv Detail & Related papers (2022-04-07T00:45:26Z)
- Towards Optimal Strategies for Training Self-Driving Perception Models in Simulation [98.51313127382937]
We focus on the use of labels in the synthetic domain alone.
Our approach introduces both a way to learn neural-invariant representations and a theoretically inspired view on how to sample the data from the simulator.
We showcase our approach on the bird's-eye-view vehicle segmentation task with multi-sensor data.
arXiv Detail & Related papers (2021-11-15T18:37:43Z)
- MLReal: Bridging the gap between training on synthetic data and real data applications in machine learning [1.9852463786440129]
We describe a novel approach to enhance supervised training on synthetic data with real data features.
In the training stage, the input data are from the synthetic domain and the autocorrelated data are from the real domain.
In the inference/application stage, the input data are from the real subset domain and the mean of the autocorrelated sections is from the synthetic data subset domain.
arXiv Detail & Related papers (2021-09-11T14:43:34Z)
- SimAug: Learning Robust Representations from Simulation for Trajectory Prediction [78.91518036949918]
We propose a novel approach to learn robust representations by augmenting the simulation training data.
We show that SimAug achieves promising results on three real-world benchmarks using zero real training data.
arXiv Detail & Related papers (2020-04-04T21:22:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.