Open-sourced Data Ecosystem in Autonomous Driving: the Present and Future
- URL: http://arxiv.org/abs/2312.03408v4
- Date: Fri, 22 Mar 2024 06:45:41 GMT
- Title: Open-sourced Data Ecosystem in Autonomous Driving: the Present and Future
- Authors: Hongyang Li, Yang Li, Huijie Wang, Jia Zeng, Huilin Xu, Pinlong Cai, Li Chen, Junchi Yan, Feng Xu, Lu Xiong, Jingdong Wang, Futang Zhu, Chunjing Xu, Tiancai Wang, Fei Xia, Beipeng Mu, Zhihui Peng, Dahua Lin, Yu Qiao
- Abstract summary: This review systematically assesses over seventy open-source autonomous driving datasets.
It offers insights into various aspects, such as the principles underlying the creation of high-quality datasets.
It also delves into the scientific and technical challenges that warrant resolution.
- Score: 130.87142103774752
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: With the continuous maturation and application of autonomous driving technology, a systematic examination of open-source autonomous driving datasets becomes instrumental in fostering the robust evolution of the industry ecosystem. Current autonomous driving datasets can broadly be categorized into two generations. The first-generation autonomous driving datasets are characterized by relatively simple sensor modalities, smaller data scale, and a restriction to perception-level tasks. KITTI, introduced in 2012, serves as a prominent representative of this initial wave. In contrast, the second-generation datasets exhibit heightened complexity in sensor modalities, greater data scale and diversity, and an expansion of tasks from perception to encompass prediction and control. Leading examples of the second generation include nuScenes and Waymo, introduced around 2019. This comprehensive review, conducted in collaboration with esteemed colleagues from both academia and industry, systematically assesses over seventy open-source autonomous driving datasets from domestic and international sources. It offers insights into various aspects, such as the principles underlying the creation of high-quality datasets, the pivotal role of data engine systems, and the utilization of generative foundation models to facilitate scalable data generation. Furthermore, this review undertakes an exhaustive analysis and discourse regarding the characteristics and data scales that future third-generation autonomous driving datasets should possess. It also delves into the scientific and technical challenges that warrant resolution. These endeavors are pivotal in advancing autonomous innovation and fostering technological enhancement in critical domains. For further details, please refer to https://github.com/OpenDriveLab/DriveAGI.
Related papers
- Exploring the Interplay Between Video Generation and World Models in Autonomous Driving: A Survey [61.39993881402787]
World models and video generation are pivotal technologies in the domain of autonomous driving.
This paper investigates the relationship between these two technologies.
By analyzing the interplay between video generation and world models, this survey identifies critical challenges and future research directions.
arXiv Detail & Related papers (2024-11-05T08:58:35Z)
- SubjectDrive: Scaling Generative Data in Autonomous Driving via Subject Control [59.20038082523832]
We present SubjectDrive, the first model proven to scale generative data production in a way that could continuously improve autonomous driving applications.
We develop a novel model equipped with a subject control mechanism, which allows the generative model to leverage diverse external data sources for producing varied and useful data.
arXiv Detail & Related papers (2024-03-28T14:07:13Z)
- OASim: an Open and Adaptive Simulator based on Neural Rendering for Autonomous Driving [11.682732129252118]
OASim is an open and adaptive simulator and autonomous driving data generator based on implicit neural rendering.
Data plays a core role in the algorithm closed-loop system, but collecting real-world data is expensive, time-consuming, and unsafe.
arXiv Detail & Related papers (2024-02-06T09:19:44Z)
- Data-Centric Evolution in Autonomous Driving: A Comprehensive Survey of Big Data System, Data Mining, and Closed-Loop Technologies [16.283613452235976]
The key to surmounting the bottleneck lies in data-centric autonomous driving technology.
There is a lack of systematic knowledge and deep understanding regarding how to build efficient data-centric AD technology.
This article will closely focus on reviewing the state-of-the-art data-driven autonomous driving technologies.
arXiv Detail & Related papers (2024-01-23T16:28:30Z)
- A Survey on Autonomous Driving Datasets: Statistics, Annotation Quality, and a Future Outlook [24.691922611156937]
We present an exhaustive study of 265 autonomous driving datasets from multiple perspectives.
We introduce a novel metric to evaluate the impact of datasets, which can also be a guide for creating new datasets.
We discuss the current challenges and development trends of future autonomous driving datasets.
arXiv Detail & Related papers (2024-01-02T22:35:33Z)
- On Responsible Machine Learning Datasets with Fairness, Privacy, and Regulatory Norms [56.119374302685934]
There have been severe concerns over the trustworthiness of AI technologies.
Machine and deep learning algorithms depend heavily on the data used during their development.
We propose a framework to evaluate the datasets through a responsible rubric.
arXiv Detail & Related papers (2023-10-24T14:01:53Z)
- Synthetic Datasets for Autonomous Driving: A Survey [13.287734271923565]
It is difficult for real-world datasets to keep up with the pace of changing requirements due to their expensive and time-consuming experimental and labeling costs.
More and more researchers are turning to synthetic datasets to easily generate rich and changeable data.
arXiv Detail & Related papers (2023-04-24T15:46:10Z)
- GIPSO: Geometrically Informed Propagation for Online Adaptation in 3D LiDAR Segmentation [60.07812405063708]
3D point cloud semantic segmentation is fundamental for autonomous driving.
Most approaches in the literature neglect an important aspect, i.e., how to deal with domain shift when handling dynamic scenes.
This paper advances the state of the art in this research field.
arXiv Detail & Related papers (2022-07-20T09:06:07Z)
- One Million Scenes for Autonomous Driving: ONCE Dataset [91.94189514073354]
We introduce the ONCE dataset for 3D object detection in the autonomous driving scenario.
The data is selected from 144 driving hours, which is 20x longer than the largest 3D autonomous driving dataset available.
We reproduce and evaluate a variety of self-supervised and semi-supervised methods on the ONCE dataset.
arXiv Detail & Related papers (2021-06-21T12:28:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.