SELMA: SEmantic Large-scale Multimodal Acquisitions in Variable Weather,
Daytime and Viewpoints
- URL: http://arxiv.org/abs/2204.09788v1
- Date: Wed, 20 Apr 2022 21:22:56 GMT
- Title: SELMA: SEmantic Large-scale Multimodal Acquisitions in Variable Weather,
Daytime and Viewpoints
- Authors: Paolo Testolina, Francesco Barbato, Umberto Michieli, Marco
Giordani, Pietro Zanuttigh, and Michele Zorzi
- Abstract summary: We introduce SELMA, a novel synthetic dataset for semantic segmentation.
It contains more than 30K unique waypoints acquired from 24 different sensors including RGB, depth, semantic cameras and LiDARs.
It is based on CARLA, an open-source simulator for generating synthetic data in autonomous driving scenarios.
- Score: 36.57734409668748
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate scene understanding from multiple sensors mounted on cars is a key
requirement for autonomous driving systems. Nowadays, this task is mainly
performed through data-hungry deep learning techniques, which require very
large amounts of training data. Due to the high cost of segmentation labeling,
many synthetic datasets have been proposed. However, most of them miss the
multi-sensor nature of the data and do not capture the significant
changes introduced by the variation of daytime and weather conditions. To fill
these gaps, we introduce SELMA, a novel synthetic dataset for semantic
segmentation that contains more than 30K unique waypoints acquired from 24
different sensors including RGB, depth, semantic cameras and LiDARs, in 27
different atmospheric and daytime conditions, for a total of more than 20M
samples. SELMA is based on CARLA, an open-source simulator for generating
synthetic data in autonomous driving scenarios, which we modified to increase
the variability and diversity of the scenes and class sets, and to align it
with other benchmark datasets. As shown by the experimental evaluation, SELMA
allows the efficient training of standard and multi-modal deep learning
architectures, which achieve remarkable results on real-world data. SELMA is
free and publicly available, thus supporting open science and research.
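SELMA's actual acquisition pipeline is not reproduced here, but the sketch below illustrates how CARLA's Python API can vary weather and daytime in the spirit of SELMA's 27-condition grid. It assumes a CARLA server running on localhost:2000; the preset choices and the night configuration are illustrative, not SELMA's exact settings.
```python
# Minimal sketch: cycling weather/daytime conditions via CARLA's Python API,
# analogous to (but not identical to) SELMA's 27 atmospheric/daytime settings.
import carla

client = carla.Client("localhost", 2000)
client.set_timeout(10.0)
world = client.get_world()

# A few built-in CARLA weather/daytime presets.
presets = [
    carla.WeatherParameters.ClearNoon,
    carla.WeatherParameters.WetCloudyNoon,
    carla.WeatherParameters.HardRainNoon,
    carla.WeatherParameters.ClearSunset,
]

# Night is obtained by lowering the sun below the horizon.
night_rain = carla.WeatherParameters(
    cloudiness=80.0, precipitation=60.0, sun_altitude_angle=-15.0
)

for weather in presets + [night_rain]:
    world.set_weather(weather)
    world.tick()  # advance one simulation step (synchronous mode)
    # ... trigger RGB/depth/semantic/LiDAR captures at the current waypoint ...
```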
Related papers
- SCOPE: A Synthetic Multi-Modal Dataset for Collective Perception Including Physical-Correct Weather Conditions [0.5026434955540995]
SCOPE is the first synthetic multi-modal dataset that incorporates realistic camera and LiDAR models as well as parameterized and physically accurate weather simulations.
The dataset contains 17,600 frames from over 40 diverse scenarios with up to 24 collaborative agents, infrastructure sensors, and passive traffic, including cyclists and pedestrians.
arXiv Detail & Related papers (2024-08-06T09:35:50Z)
- SCaRL- A Synthetic Multi-Modal Dataset for Autonomous Driving [0.0]
We present a novel synthetically generated multi-modal dataset, SCaRL, to enable the training and validation of autonomous driving solutions.
SCaRL is a large dataset based on the CARLA Simulator, which provides data for diverse, dynamic scenarios and traffic conditions.
arXiv Detail & Related papers (2024-05-27T10:31:26Z)
- Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery [78.43828998065071]
Recent advances in unsupervised learning have demonstrated the ability of large vision models to achieve promising results on downstream tasks.
Such pre-training techniques have also been explored recently in the remote sensing domain due to the availability of large amounts of unlabelled data.
In this paper, we revisit transformer pre-training and leverage multi-scale information that is effectively utilized with multiple modalities.
arXiv Detail & Related papers (2024-03-08T16:18:04Z)
- Shared Manifold Learning Using a Triplet Network for Multiple Sensor Translation and Fusion with Missing Data [2.452410403088629]
We propose a Contrastive learning based MultiModal Alignment Network (CoMMANet) to align data from different sensors into a shared and discriminative manifold.
The proposed architecture uses a multimodal triplet autoencoder to cluster the latent space in such a way that samples of the same classes from each heterogeneous modality are mapped close to each other.
arXiv Detail & Related papers (2022-10-25T20:22:09Z)
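CoMMANet's full architecture is not reproduced here; as a rough illustration of triplet-based cross-modal alignment, the sketch below pulls same-class samples from two heterogeneous modalities together in a shared latent space. The encoders, feature dimensions, and sampling scheme are illustrative assumptions, not the paper's actual design.
```python
# Illustrative sketch of triplet-based cross-modal alignment, loosely in the
# spirit of CoMMANet; encoders and dimensions are placeholder assumptions.
import torch
import torch.nn as nn

def make_encoder(in_dim: int, latent_dim: int = 64) -> nn.Module:
    return nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, latent_dim))

rgb_enc, lidar_enc = make_encoder(512), make_encoder(256)
triplet = nn.TripletMarginLoss(margin=1.0)

# Anchor: an RGB embedding; positive: a same-class LiDAR sample; negative: a
# different-class LiDAR sample. Minimizing the loss maps same-class samples
# from heterogeneous modalities close together and pushes other classes apart.
rgb_feat = torch.randn(32, 512)   # batch of RGB features (placeholder)
lidar_pos = torch.randn(32, 256)  # same-class LiDAR features (placeholder)
lidar_neg = torch.randn(32, 256)  # different-class LiDAR features (placeholder)

loss = triplet(rgb_enc(rgb_feat), lidar_enc(lidar_pos), lidar_enc(lidar_neg))
loss.backward()
```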
- TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain gaps present in simulated datasets.
We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z)
- Towards Optimal Strategies for Training Self-Driving Perception Models in Simulation [98.51313127382937]
We focus on the use of labels in the synthetic domain alone.
Our approach introduces both a way to learn neural-invariant representations and a theoretically inspired view on how to sample the data from the simulator.
We showcase our approach on the bird's-eye-view vehicle segmentation task with multi-sensor data.
arXiv Detail & Related papers (2021-11-15T18:37:43Z)
- SODA10M: Towards Large-Scale Object Detection Benchmark for Autonomous Driving [94.11868795445798]
We release a Large-Scale Object Detection benchmark for Autonomous driving, named SODA10M, containing 10 million unlabeled images and 20K images labeled with 6 representative object categories.
To improve diversity, the images are collected at one frame every ten seconds across 32 different cities, under different weather conditions, periods, and location scenes.
We provide extensive experiments and deep analyses of existing supervised state-of-the-art detection models, popular self-supervised and semi-supervised approaches, and some insights about how to develop future models.
arXiv Detail & Related papers (2021-06-21T13:55:57Z)
- Diverse Complexity Measures for Dataset Curation in Self-driving [80.55417232642124]
We propose a new data selection method that exploits a diverse set of criteria that quantify the interestingness of traffic scenes.
Our experiments show that the proposed curation pipeline is able to select datasets that lead to better generalization and higher performance.
arXiv Detail & Related papers (2021-01-16T23:45:02Z)
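The paper's specific complexity measures are not detailed in the summary above; as a generic illustration of diversity-driven curation, the sketch below greedily selects scenes whose embeddings are far from those already chosen (farthest-point sampling), a hypothetical stand-in for the paper's criteria.
```python
# Generic greedy diversity-based curation over scene embeddings
# (farthest-point sampling); shown only to illustrate how a diversity
# criterion can drive dataset selection, not the paper's actual method.
import numpy as np

def select_diverse(embeddings: np.ndarray, k: int) -> list[int]:
    """Greedily pick k scene indices, each maximally far from those already chosen."""
    chosen = [0]  # arbitrary seed scene
    dists = np.linalg.norm(embeddings - embeddings[0], axis=1)
    while len(chosen) < k:
        nxt = int(np.argmax(dists))  # farthest scene from the current selection
        chosen.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(embeddings - embeddings[nxt], axis=1))
    return chosen

scenes = np.random.rand(1000, 32)  # placeholder scene descriptors
subset = select_diverse(scenes, k=100)
```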
- IDDA: a large-scale multi-domain dataset for autonomous driving [16.101248613062292]
This paper contributes a new large-scale synthetic dataset for semantic segmentation with more than 100 different source visual domains.
The dataset has been created to explicitly address the challenges of domain shift between training and test data in various weather and viewpoint conditions.
arXiv Detail & Related papers (2020-04-17T15:22:38Z)