A Benchmark for Vision-Centric HD Mapping by V2I Systems
- URL: http://arxiv.org/abs/2503.23963v1
- Date: Mon, 31 Mar 2025 11:24:53 GMT
- Title: A Benchmark for Vision-Centric HD Mapping by V2I Systems
- Authors: Miao Fan, Shanshan Yu, Shengtong Xu, Kun Jiang, Haoyi Xiong, Xiangzeng Liu
- Abstract summary: We release a real-world dataset which contains collaborative camera frames from both vehicles and roadside infrastructures. We present an end-to-end neural framework (i.e., V2I-HD) leveraging vision-centric V2I systems to construct vectorized maps.
- Score: 20.00132555655793
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Autonomous driving faces safety challenges due to the lack of a global perspective and of the semantic information provided by vectorized high-definition (HD) maps. Information from roadside cameras can greatly expand the map perception range through vehicle-to-infrastructure (V2I) communications. However, no real-world dataset is yet available for studying onboard map vectorization under vehicle-infrastructure cooperation. To advance research on online HD mapping for Vehicle-Infrastructure Cooperative Autonomous Driving (VICAD), we release a real-world dataset that contains collaborative camera frames from both vehicles and roadside infrastructure and provides human annotations of HD map elements. We also present an end-to-end neural framework (i.e., V2I-HD) leveraging vision-centric V2I systems to construct vectorized maps. To reduce computation costs and ease the deployment of V2I-HD on autonomous vehicles, we introduce a directionally decoupled self-attention mechanism to V2I-HD. Extensive experiments on our real-world dataset show that V2I-HD achieves superior real-time inference speed. Abundant qualitative results further demonstrate stable and robust map construction at low cost across complex and varied driving scenes. As a benchmark, both the source code and the dataset have been released on OneDrive for further study.
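The abstract does not spell out the directionally decoupled self-attention mechanism; the sketch below shows one plausible reading, assuming it factorizes full self-attention over the bird's-eye-view (BEV) feature grid into separate row-wise and column-wise attention (in the spirit of axial attention). All names, shapes, and hyperparameters here are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class DirectionallyDecoupledSelfAttention(nn.Module):
    """Illustrative sketch: attend along each BEV axis separately
    (row-wise, then column-wise) instead of over all H*W cells at once,
    which cuts the quadratic cost of full 2D self-attention."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.row_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.col_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, bev: torch.Tensor) -> torch.Tensor:
        # bev: (B, C, H, W) feature map in bird's-eye view
        b, c, h, w = bev.shape

        # Row-wise attention: each of the H rows attends over its W cells.
        x = bev.permute(0, 2, 3, 1).reshape(b * h, w, c)
        x, _ = self.row_attn(x, x, x)
        x = x.reshape(b, h, w, c)

        # Column-wise attention: each of the W columns attends over its H cells.
        x = x.permute(0, 2, 1, 3).reshape(b * w, h, c)
        x, _ = self.col_attn(x, x, x)
        x = x.reshape(b, w, h, c).permute(0, 3, 2, 1)  # back to (B, C, H, W)
        return x


# Usage example on a hypothetical 100 x 50 BEV grid with 256 channels.
attn = DirectionallyDecoupledSelfAttention(dim=256)
out = attn(torch.randn(2, 256, 100, 50))  # -> (2, 256, 100, 50)
```

Under this assumed factorization, the per-layer attention cost drops from O((HW)^2) to roughly O(HW·(H+W)), which is consistent with the paper's stated goal of reducing computation for onboard deployment.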
Related papers
- Leveraging V2X for Collaborative HD Maps Construction Using Scene Graph Generation [0.0]
HD maps play a crucial role in autonomous vehicle navigation, complementing onboard perception sensors for improved accuracy and safety. Traditional HD map generation relies on dedicated mapping vehicles, which are costly and fail to capture real-time infrastructure changes. This paper presents HDMapLaneNet, a novel framework leveraging V2X communication and Scene Graph Generation to collaboratively construct a localized geometric layer of HD maps.
arXiv Detail & Related papers (2025-02-14T12:56:10Z)
- TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior [70.84644266024571]
We propose to train a perception model to "see" standard definition maps (SDMaps).
We encode SDMap elements into neural spatial map representations and instance tokens, and then incorporate such complementary features as prior information.
Based on the lane segment representation framework, the model simultaneously predicts lanes, centrelines and their topology.
arXiv Detail & Related papers (2024-11-22T06:13:42Z) - Driving with Prior Maps: Unified Vector Prior Encoding for Autonomous Vehicle Mapping [18.97422977086127]
High-Definition Maps (HD maps) are essential for the precise navigation and decision-making of autonomous vehicles. Online construction of HD maps using on-board sensors has emerged as a promising solution. This paper proposes the PriorDrive framework, which harnesses the power of prior maps to address the limitations of purely online construction.
arXiv Detail & Related papers (2024-09-09T06:17:46Z) - RoboSense: Large-scale Dataset and Benchmark for Egocentric Robot Perception and Navigation in Crowded and Unstructured Environments [62.5830455357187]
We set up an egocentric multi-sensor data collection platform based on 3 main types of sensors (camera, LiDAR, and fisheye). A large-scale multimodal dataset, named RoboSense, is constructed to facilitate egocentric robot perception.
arXiv Detail & Related papers (2024-08-28T03:17:40Z) - V2X-Real: a Large-Scale Dataset for Vehicle-to-Everything Cooperative Perception [22.3955949838171]
We present V2X-Real, a large-scale dataset that includes a mixture of multiple vehicles and smart infrastructure.
Our dataset contains 33K LiDAR frames and 171K camera frames with over 1.2M annotated bounding boxes of 10 categories in very challenging urban scenarios.
arXiv Detail & Related papers (2024-03-24T06:30:02Z) - Leveraging Driver Field-of-View for Multimodal Ego-Trajectory Prediction [69.29802752614677]
RouteFormer is a novel ego-trajectory prediction network combining GPS data, environmental context, and the driver's field-of-view.
To tackle data scarcity and enhance diversity, we introduce GEM, a dataset of urban driving scenarios enriched with synchronized driver field-of-view and gaze data.
arXiv Detail & Related papers (2023-12-13T23:06:30Z) - Complementing Onboard Sensors with Satellite Map: A New Perspective for
HD Map Construction [31.0701760075554]
High-definition (HD) maps play a crucial role in autonomous driving systems.
Recent methods have attempted to construct HD maps in real-time using vehicle onboard sensors.
We explore a new perspective that boosts HD map construction through the use of satellite maps to complement onboard sensors.
arXiv Detail & Related papers (2023-08-29T16:33:16Z) - V2V4Real: A Real-world Large-scale Dataset for Vehicle-to-Vehicle
Cooperative Perception [49.7212681947463]
Vehicle-to-Vehicle (V2V) cooperative perception system has great potential to revolutionize the autonomous driving industry.
We present V2V4Real, the first large-scale real-world multi-modal dataset for V2V perception.
Our dataset covers a driving area of 410 km, comprising 20K LiDAR frames, 40K RGB frames, 240K annotated 3D bounding boxes for 5 classes, and HDMaps.
arXiv Detail & Related papers (2023-03-14T02:49:20Z) - Argoverse 2: Next Generation Datasets for Self-Driving Perception and
Forecasting [64.7364925689825]
Argoverse 2 (AV2) is a collection of three datasets for perception and forecasting research in the self-driving domain.
The Lidar dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose.
The Motion Forecasting dataset contains 250,000 scenarios mined for interesting and challenging interactions between the autonomous vehicle and other actors in each local scene.
arXiv Detail & Related papers (2023-01-02T00:36:22Z) - DAIR-V2X: A Large-Scale Dataset for Vehicle-Infrastructure Cooperative
3D Object Detection [8.681912341444901]
DAIR-V2X is the first large-scale, multi-modality, multi-view dataset from real scenarios for Vehicle-Infrastructure Cooperative Autonomous Driving.
DAIR-V2X comprises 71254 LiDAR frames and 71254 Camera frames, and all frames are captured from real scenes with 3D annotations.
arXiv Detail & Related papers (2022-04-12T07:13:33Z) - HDNET: Exploiting HD Maps for 3D Object Detection [99.49035895393934]
We show that High-Definition (HD) maps provide strong priors that can boost the performance and robustness of modern 3D object detectors.
We design a single stage detector that extracts geometric and semantic features from the HD maps.
As maps might not be available everywhere, we also propose a map prediction module that estimates the map on the fly from raw LiDAR data.
arXiv Detail & Related papers (2020-12-21T21:59:54Z) - Understanding Bird's-Eye View Semantic HD-Maps Using an Onboard
Monocular Camera [110.83289076967895]
We study scene understanding in the form of online estimation of semantic bird's-eye-view HD-maps using the video input from a single onboard camera.
In our experiments, we demonstrate that the considered aspects are complementary to each other for HD-map understanding.
arXiv Detail & Related papers (2020-12-05T14:39:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.