HoloVIC: Large-scale Dataset and Benchmark for Multi-Sensor Holographic Intersection and Vehicle-Infrastructure Cooperative
- URL: http://arxiv.org/abs/2403.02640v3
- Date: Tue, 26 Mar 2024 17:14:14 GMT
- Title: HoloVIC: Large-scale Dataset and Benchmark for Multi-Sensor Holographic Intersection and Vehicle-Infrastructure Cooperative
- Authors: Cong Ma, Lei Qiao, Chengkai Zhu, Kai Liu, Zelong Kong, Qing Li, Xueqi Zhou, Yuheng Kan, Wei Wu
- Abstract summary: We constructed holographic intersections with various layouts to build a large-scale multi-sensor holographic vehicle-infrastructure cooperation dataset, called HoloVIC.
Our dataset includes 3 different types of sensors (Camera, Lidar, Fisheye) and employs 4 sensor-layouts based on the different intersections.
- Score: 23.293162454592544
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Vehicle-to-everything (V2X) has been a popular topic in the field of autonomous driving in recent years, and vehicle-infrastructure cooperation (VIC) has become one of the important research areas. The complexity of traffic conditions, such as blind spots and occlusion, greatly limits the perception capabilities of single-view roadside sensing systems. To further enhance the accuracy of roadside perception and provide better information to the vehicle side, in this paper we constructed holographic intersections with various layouts to build a large-scale multi-sensor holographic vehicle-infrastructure cooperation dataset, called HoloVIC. Our dataset includes 3 different types of sensors (Camera, Lidar, Fisheye) and employs 4 sensor-layouts based on the different intersections. Each intersection is equipped with 6-18 sensors to capture synchronous data, while autonomous vehicles pass through these intersections to collect VIC data. HoloVIC contains more than 100k synchronous frames from different sensors in total. Additionally, we annotated 3D bounding boxes based on Camera, Fisheye, and Lidar. We also associate the IDs of the same objects across different devices and consecutive frames in sequence. Based on HoloVIC, we formulated four tasks to facilitate the development of related research, and we provide benchmarks for these tasks.
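The abstract describes synchronized multi-sensor frames with per-sensor 3D boxes and object IDs that stay consistent across devices and consecutive frames. The sketch below is a minimal, hypothetical Python representation of such a record; the class names, fields, and sensor naming are assumptions for illustration and are not taken from the HoloVIC release.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical record layout for one synchronized HoloVIC capture.
# The actual file format and field names are not specified in the abstract.

@dataclass
class Box3D:
    track_id: int                    # same ID across sensors and frames for one object
    category: str                    # e.g. "car", "pedestrian" (categories assumed)
    center: Tuple[float, float, float]  # (x, y, z) in a common intersection frame (assumed)
    size: Tuple[float, float, float]    # (length, width, height)
    yaw: float                       # heading angle in radians

@dataclass
class SensorFrame:
    sensor_id: str                   # e.g. "cam_03", "fisheye_01", "lidar_02" (names assumed)
    sensor_type: str                 # "camera" | "fisheye" | "lidar"
    data_path: str                   # path to the image or point-cloud file
    boxes: List[Box3D] = field(default_factory=list)

@dataclass
class SyncFrame:
    intersection: str                # one of the 4 sensor-layout intersections
    timestamp: float                 # synchronization timestamp
    sensors: Dict[str, SensorFrame] = field(default_factory=dict)

def tracks_seen_by_multiple_sensors(frame: SyncFrame) -> Dict[int, int]:
    """Count how many sensors observed each track ID in one synchronized frame."""
    counts: Dict[int, int] = {}
    for sensor in frame.sensors.values():
        for box in sensor.boxes:
            counts[box.track_id] = counts.get(box.track_id, 0) + 1
    return counts
```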
Related papers
- V2X-Radar: A Multi-modal Dataset with 4D Radar for Cooperative Perception [45.001209388616736]
We present V2X-Radar, the first large real-world multi-modal dataset featuring 4D Radar.
The dataset comprises 20K LiDAR frames, 40K camera images, and 20K 4D Radar data, with 350K annotated bounding boxes across five categories.
To facilitate diverse research domains, we establish V2X-Radar-C for cooperative perception, V2X-Radar-I for roadside perception, and V2X-Radar-V for single-vehicle perception.
arXiv Detail & Related papers (2024-11-17T04:59:00Z)
- Multi-V2X: A Large Scale Multi-modal Multi-penetration-rate Dataset for Cooperative Perception [3.10770247120758]
We introduce Multi-V2X, a large-scale, multi-modal, multi-penetration-rate dataset for V2X perception.
In total, our Multi-V2X dataset comprises 549k RGB frames, 146k LiDAR frames, and 4,219k annotated 3D bounding boxes.
The highest possible CAV penetration rate reaches 86.21%, with up to 31 agents in communication range.
arXiv Detail & Related papers (2024-09-08T05:22:00Z)
- V2X-Real: a Large-Scale Dataset for Vehicle-to-Everything Cooperative Perception [22.3955949838171]
We propose a dataset that combines multiple vehicles and smart infrastructure simultaneously to facilitate V2X cooperative perception development.
V2X-Real is collected using two connected automated vehicles and two smart infrastructures.
The whole dataset contains 33K LiDAR frames and 171K camera data with over 1.2M annotated bounding boxes of 10 categories in very challenging urban scenarios.
arXiv Detail & Related papers (2024-03-24T06:30:02Z)
- TUMTraf V2X Cooperative Perception Dataset [20.907021313266128]
We propose CoopDet3D, a cooperative multi-modal fusion model, and TUMTraf-V2X, a perception dataset.
Our dataset contains 2,000 labeled point clouds and 5,000 labeled images from five roadside and four onboard sensors.
We show that our CoopDet3D camera-LiDAR fusion model achieves an increase of +14.36 3D mAP compared to a vehicle camera-LiDAR fusion model.
arXiv Detail & Related papers (2024-03-02T21:29:04Z)
- SSCBench: A Large-Scale 3D Semantic Scene Completion Benchmark for Autonomous Driving [87.8761593366609]
SSCBench is a benchmark that integrates scenes from widely used automotive datasets.
We benchmark models using monocular, trinocular, and point cloud input to assess the performance gap.
We have unified semantic labels across diverse datasets to simplify cross-domain generalization testing.
arXiv Detail & Related papers (2023-06-15T09:56:33Z)
- V2V4Real: A Real-world Large-scale Dataset for Vehicle-to-Vehicle Cooperative Perception [49.7212681947463]
Vehicle-to-Vehicle (V2V) cooperative perception systems have great potential to revolutionize the autonomous driving industry.
We present V2V4Real, the first large-scale real-world multi-modal dataset for V2V perception.
Our dataset covers a driving area of 410 km, comprising 20K LiDAR frames, 40K RGB frames, 240K annotated 3D bounding boxes for 5 classes, and HDMaps.
arXiv Detail & Related papers (2023-03-14T02:49:20Z)
- Argoverse 2: Next Generation Datasets for Self-Driving Perception and Forecasting [64.7364925689825]
Argoverse 2 (AV2) is a collection of three datasets for perception and forecasting research in the self-driving domain.
The Lidar dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose.
The Motion Forecasting dataset contains 250,000 scenarios mined for interesting and challenging interactions between the autonomous vehicle and other actors in each local scene.
arXiv Detail & Related papers (2023-01-02T00:36:22Z)
- LiDAR-based 4D Panoptic Segmentation via Dynamic Shifting Network [56.71765153629892]
We propose the Dynamic Shifting Network (DS-Net), which serves as an effective panoptic segmentation framework in the point cloud realm.
Our proposed DS-Net achieves superior accuracies over current state-of-the-art methods in both tasks.
We extend DS-Net to 4D panoptic LiDAR segmentation via temporally unified instance clustering on aligned LiDAR frames (a minimal sketch of this idea appears after the list below).
arXiv Detail & Related papers (2022-03-14T15:25:42Z)
- Robust 2D/3D Vehicle Parsing in CVIS [54.825777404511605]
We present a novel approach to robustly detect and perceive vehicles in different camera views as part of a cooperative vehicle-infrastructure system (CVIS).
Our formulation is designed for arbitrary camera views and makes no assumptions about intrinsic or extrinsic parameters.
In practice, our approach outperforms SOTA methods on 2D detection, instance segmentation, and 6-DoF pose estimation.
arXiv Detail & Related papers (2021-03-11T03:35:05Z)
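As referenced in the DS-Net entry above, the following is a minimal sketch of the general idea of temporally unified instance clustering: foreground points from consecutive, ego-motion-aligned LiDAR frames are clustered jointly, so an instance receives the same ID in every frame where it appears. It uses DBSCAN as a stand-in clustering step; it is not DS-Net's dynamic shifting module, and frame alignment is assumed to have been done beforehand.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def unified_instance_ids(aligned_frames, eps=0.5, min_samples=10):
    """Cluster 'thing' points from several aligned LiDAR frames jointly.

    aligned_frames: list of (N_i, 3) arrays of foreground points, already
    transformed into a common world frame (alignment assumed, not shown).
    Returns one (N_i,) array of instance IDs per frame; the same physical
    object keeps the same ID across frames because all points are clustered
    in a single pass. Noise points get ID -1.
    """
    points = np.concatenate(aligned_frames, axis=0)
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points)

    # Split the joint labels back into per-frame ID arrays.
    per_frame_ids, start = [], 0
    for frame in aligned_frames:
        end = start + len(frame)
        per_frame_ids.append(labels[start:end])
        start = end
    return per_frame_ids

# Tiny synthetic example: the same two objects observed in two frames.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    obj_a = rng.normal([0.0, 0.0, 0.0], 0.1, size=(50, 3))
    obj_b = rng.normal([5.0, 0.0, 0.0], 0.1, size=(50, 3))
    frame_t0 = np.vstack([obj_a, obj_b])
    frame_t1 = np.vstack([obj_a + 0.05, obj_b + 0.05])  # slight motion
    ids_t0, ids_t1 = unified_instance_ids([frame_t0, frame_t1])
    print(set(ids_t0.tolist()), set(ids_t1.tolist()))  # same cluster IDs in both frames
```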