HoloVIC: Large-scale Dataset and Benchmark for Multi-Sensor Holographic Intersection and Vehicle-Infrastructure Cooperative
- URL: http://arxiv.org/abs/2403.02640v3
- Date: Tue, 26 Mar 2024 17:14:14 GMT
- Title: HoloVIC: Large-scale Dataset and Benchmark for Multi-Sensor Holographic Intersection and Vehicle-Infrastructure Cooperative
- Authors: Cong Ma, Lei Qiao, Chengkai Zhu, Kai Liu, Zelong Kong, Qing Li, Xueqi Zhou, Yuheng Kan, Wei Wu,
- Abstract summary: We constructed holographic intersections with various layouts to build a large-scale multi-sensor holographic vehicle-infrastructure cooperation dataset, called HoloVIC.
Our dataset includes 3 different types of sensors (Camera, Lidar, Fisheye) and employs 4 sensor layouts based on the different intersections.
- Score: 23.293162454592544
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Vehicle-to-everything (V2X) has been a popular topic in the field of autonomous driving in recent years, and vehicle-infrastructure cooperation (VIC) has become one of its important research areas. The complexity of traffic conditions, such as blind spots and occlusion, greatly limits the perception capabilities of single-view roadside sensing systems. To further enhance the accuracy of roadside perception and provide better information to the vehicle side, in this paper we constructed holographic intersections with various layouts to build a large-scale multi-sensor holographic vehicle-infrastructure cooperation dataset, called HoloVIC. Our dataset includes 3 different types of sensors (Camera, Lidar, Fisheye) and employs 4 sensor layouts based on the different intersections. Each intersection is equipped with 6-18 sensors to capture synchronous data, while autonomous vehicles pass through these intersections to collect VIC data. HoloVIC contains a total of 100k+ synchronous frames from different sensors. Additionally, we annotated 3D bounding boxes based on Camera, Fisheye, and Lidar, and we associate the IDs of the same objects across different devices and consecutive frames in each sequence. Based on HoloVIC, we formulated four tasks to facilitate the development of related research, and we provide benchmarks for these tasks.
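To make the described dataset structure concrete, the sketch below shows one way the synchronized multi-sensor frames, per-sensor 3D bounding boxes, and cross-device object IDs mentioned in the abstract could be organized in code. This is a minimal illustrative sketch under assumed names: the classes and fields (`SyncFrame`, `SensorFrame`, `global_id`, etc.) are hypothetical and do not reflect the released HoloVIC format.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical sketch of a synchronized multi-sensor frame record.
# Field names and layout are illustrative assumptions, not the
# official HoloVIC data format.

@dataclass
class Box3D:
    global_id: int          # object ID shared across sensors and frames
    category: str           # e.g. "car", "pedestrian"
    center: List[float]     # (x, y, z) in a common intersection frame
    size: List[float]       # (length, width, height)
    yaw: float              # heading angle in radians

@dataclass
class SensorFrame:
    sensor_id: str          # e.g. "cam_01", "fisheye_02", "lidar_01"
    sensor_type: str        # "camera" | "fisheye" | "lidar"
    data_path: str          # path to the image or point-cloud file
    boxes: List[Box3D] = field(default_factory=list)

@dataclass
class SyncFrame:
    timestamp: float                      # shared capture time
    intersection_id: str                  # which holographic intersection
    sensors: Dict[str, SensorFrame] = field(default_factory=dict)

def track_of(frames: List[SyncFrame], global_id: int) -> List[Box3D]:
    """Collect every observation of one object across sensors and time."""
    track = []
    for frame in frames:
        for sensor in frame.sensors.values():
            track.extend(b for b in sensor.boxes if b.global_id == global_id)
    return track
```

Under this sketch, a multi-sensor tracking task would iterate over `SyncFrame` records in temporal order and use the shared `global_id` to link detections from cameras, fisheye cameras, and lidars at the same intersection.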
Related papers
- Multi-V2X: A Large Scale Multi-modal Multi-penetration-rate Dataset for Cooperative Perception [3.10770247120758]
We introduce Multi-V2X, a large-scale, multi-modal, multi-penetration-rate dataset for V2X perception.
In total, our Multi-V2X dataset comprises 549k RGB frames, 146k LiDAR frames, and 4,219k annotated 3D bounding boxes.
The highest possible CAV penetration rate reaches 86.21%, with up to 31 agents in communication range.
arXiv Detail & Related papers (2024-09-08T05:22:00Z) - RoboSense: Large-scale Dataset and Benchmark for Multi-sensor Low-speed Autonomous Driving [62.5830455357187]
In this paper, we construct a multimodal data collection platform based on 3 main types of sensors (Camera, LiDAR and Fisheye)
A large-scale multi-sensor dataset is built, named RoboSense, to facilitate near-field scene understanding.
RoboSense contains more than 133K synchronized frames with 1.4M 3D bounding boxes and IDs in the full $360^\circ$ view, forming 216K trajectories across 7.6K temporal sequences.
arXiv Detail & Related papers (2024-08-28T03:17:40Z) - V2X-Real: a Large-Scale Dataset for Vehicle-to-Everything Cooperative Perception [22.3955949838171]
We propose a dataset that has a mixture of multiple vehicles and smart infrastructure simultaneously to facilitate the V2X cooperative perception development.
V2X-Real is collected using two connected automated vehicles and two smart infrastructures.
The whole dataset contains 33K LiDAR frames and 171K camera frames with over 1.2M annotated bounding boxes of 10 categories in very challenging urban scenarios.
arXiv Detail & Related papers (2024-03-24T06:30:02Z) - TUMTraf V2X Cooperative Perception Dataset [20.907021313266128]
We propose CoopDet3D, a cooperative multi-modal fusion model, and TUMTraf-V2X, a perception dataset.
Our dataset contains 2,000 labeled point clouds and 5,000 labeled images from five roadside and four onboard sensors.
We show that our CoopDet3D camera-LiDAR fusion model achieves an increase of +14.36 3D mAP compared to a vehicle camera-LiDAR fusion model.
arXiv Detail & Related papers (2024-03-02T21:29:04Z) - SSCBench: A Large-Scale 3D Semantic Scene Completion Benchmark for Autonomous Driving [87.8761593366609]
SSCBench is a benchmark that integrates scenes from widely used automotive datasets.
We benchmark models using monocular, trinocular, and point cloud input to assess the performance gap.
We have unified semantic labels across diverse datasets to simplify cross-domain generalization testing.
arXiv Detail & Related papers (2023-06-15T09:56:33Z) - V2V4Real: A Real-world Large-scale Dataset for Vehicle-to-Vehicle Cooperative Perception [49.7212681947463]
The Vehicle-to-Vehicle (V2V) cooperative perception system has great potential to revolutionize the autonomous driving industry.
We present V2V4Real, the first large-scale real-world multi-modal dataset for V2V perception.
Our dataset covers a driving area of 410 km, comprising 20K LiDAR frames, 40K RGB frames, 240K annotated 3D bounding boxes for 5 classes, and HDMaps.
arXiv Detail & Related papers (2023-03-14T02:49:20Z) - Argoverse 2: Next Generation Datasets for Self-Driving Perception and Forecasting [64.7364925689825]
Argoverse 2 (AV2) is a collection of three datasets for perception and forecasting research in the self-driving domain.
The Lidar dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose.
The Motion Forecasting dataset contains 250,000 scenarios mined for interesting and challenging interactions between the autonomous vehicle and other actors in each local scene.
arXiv Detail & Related papers (2023-01-02T00:36:22Z) - LiDAR-based 4D Panoptic Segmentation via Dynamic Shifting Network [56.71765153629892]
We propose the Dynamic Shifting Network (DS-Net), which serves as an effective panoptic segmentation framework in the point cloud realm.
Our proposed DS-Net achieves superior accuracies over current state-of-the-art methods in both tasks.
We extend DS-Net to 4D panoptic LiDAR segmentation by temporally unified instance clustering on aligned LiDAR frames.
arXiv Detail & Related papers (2022-03-14T15:25:42Z) - Robust 2D/3D Vehicle Parsing in CVIS [54.825777404511605]
We present a novel approach to robustly detect and perceive vehicles in different camera views as part of a cooperative vehicle-infrastructure system (CVIS).
Our formulation is designed for arbitrary camera views and makes no assumptions about intrinsic or extrinsic parameters.
In practice, our approach outperforms SOTA methods on 2D detection, instance segmentation, and 6-DoF pose estimation.
arXiv Detail & Related papers (2021-03-11T03:35:05Z)