LiDAR-based End-to-end Temporal Perception for Vehicle-Infrastructure Cooperation
- URL: http://arxiv.org/abs/2411.14927v1
- Date: Fri, 22 Nov 2024 13:34:29 GMT
- Title: LiDAR-based End-to-end Temporal Perception for Vehicle-Infrastructure Cooperation
- Authors: Zhenwei Yang, Jilei Mao, Wenxian Yang, Yibo Ai, Yu Kong, Haibao Yu, Weidong Zhang,
- Abstract summary: We introduce LET-VIC, a LiDAR-based End-to-End Tracking framework for Vehicle-Temporal Cooperation (VIC)
LET-VIC leverages Vehicle-to-Everything (V2X) communication to enhance temporal perception by fusing spatial and temporal data from both vehicle and infrastructure sensors.
Experiments on the V2X-Seq-SPD dataset demonstrate that LET-VIC significantly outperforms baseline models, achieving at least a 13.7% improvement in mAP and a 13.1% improvement in AMOTA without considering communication delays.
- Score: 16.465037559349323
- License:
- Abstract: Temporal perception, the ability to detect and track objects over time, is critical in autonomous driving for maintaining a comprehensive understanding of dynamic environments. However, this task is hindered by significant challenges, including incomplete perception caused by occluded objects and observational blind spots, which are common in single-vehicle perception systems. To address these issues, we introduce LET-VIC, a LiDAR-based End-to-End Tracking framework for Vehicle-Infrastructure Cooperation (VIC). LET-VIC leverages Vehicle-to-Everything (V2X) communication to enhance temporal perception by fusing spatial and temporal data from both vehicle and infrastructure sensors. First, it spatially integrates Bird's Eye View (BEV) features from vehicle-side and infrastructure-side LiDAR data, creating a comprehensive view that mitigates occlusions and compensates for blind spots. Second, LET-VIC incorporates temporal context across frames, allowing the model to leverage historical data for enhanced tracking stability and accuracy. To further improve robustness, LET-VIC includes a Calibration Error Compensation (CEC) module to address sensor misalignments and ensure precise feature alignment. Experiments on the V2X-Seq-SPD dataset demonstrate that LET-VIC significantly outperforms baseline models, achieving at least a 13.7% improvement in mAP and a 13.1% improvement in AMOTA without considering communication delays. This work offers a practical solution and a new research direction for advancing temporal perception in autonomous driving through vehicle-infrastructure cooperation.
Related papers
- InScope: A New Real-world 3D Infrastructure-side Collaborative Perception Dataset for Open Traffic Scenarios [13.821143687548494]
This paper introduces a new 3D infrastructure-side collaborative perception dataset, abbreviated as inscope.
InScope encapsulates a 20-day capture duration with 303 tracking trajectories and 187,787 3D bounding boxes annotated by experts.
arXiv Detail & Related papers (2024-07-31T13:11:14Z) - V2I-Calib: A Novel Calibration Approach for Collaborative Vehicle and Infrastructure LiDAR Systems [19.919120489121987]
This paper introduces a novel approach to V2I calibration, leveraging spatial association information among perceived objects.
Central to this method is the innovative Overall Intersection over Union (oIoU) metric, which quantifies the correlation between targets identified by vehicle and infrastructure systems.
Our approach involves identifying common targets within the perception results of vehicle and infrastructure LiDAR systems through the construction of an affinity matrix.
arXiv Detail & Related papers (2024-07-14T13:34:00Z) - Unified End-to-End V2X Cooperative Autonomous Driving [21.631099800753795]
UniE2EV2X is a V2X-integrated end-to-end autonomous driving system that consolidates key driving modules within a unified network.
The framework employs a deformable attention-based data fusion strategy, effectively facilitating cooperation between vehicles and infrastructure.
We implement the UniE2EV2X framework on the challenging DeepAccident, a simulation dataset designed for V2X cooperative driving.
arXiv Detail & Related papers (2024-05-07T03:01:40Z) - OOSTraj: Out-of-Sight Trajectory Prediction With Vision-Positioning Denoising [49.86409475232849]
Trajectory prediction is fundamental in computer vision and autonomous driving.
Existing approaches in this field often assume precise and complete observational data.
We present a novel method for out-of-sight trajectory prediction that leverages a vision-positioning technique.
arXiv Detail & Related papers (2024-04-02T18:30:29Z) - MSight: An Edge-Cloud Infrastructure-based Perception System for
Connected Automated Vehicles [58.461077944514564]
This paper presents MSight, a cutting-edge roadside perception system specifically designed for automated vehicles.
MSight offers real-time vehicle detection, localization, tracking, and short-term trajectory prediction.
Evaluations underscore the system's capability to uphold lane-level accuracy with minimal latency.
arXiv Detail & Related papers (2023-10-08T21:32:30Z) - Unsupervised Domain Adaptation for Self-Driving from Past Traversal
Features [69.47588461101925]
We propose a method to adapt 3D object detectors to new driving environments.
Our approach enhances LiDAR-based detection models using spatial quantized historical features.
Experiments on real-world datasets demonstrate significant improvements.
arXiv Detail & Related papers (2023-09-21T15:00:31Z) - Camera-Radar Perception for Autonomous Vehicles and ADAS: Concepts,
Datasets and Metrics [77.34726150561087]
This work aims to carry out a study on the current scenario of camera and radar-based perception for ADAS and autonomous vehicles.
Concepts and characteristics related to both sensors, as well as to their fusion, are presented.
We give an overview of the Deep Learning-based detection and segmentation tasks, and the main datasets, metrics, challenges, and open questions in vehicle perception.
arXiv Detail & Related papers (2023-03-08T00:48:32Z) - DAIR-V2X: A Large-Scale Dataset for Vehicle-Infrastructure Cooperative
3D Object Detection [8.681912341444901]
DAIR-V2X is the first large-scale, multi-modality, multi-view dataset from real scenarios for Vehicle-Infrastructure Cooperative Autonomous Driving.
DAIR-V2X comprises 71254 LiDAR frames and 71254 Camera frames, and all frames are captured from real scenes with 3D annotations.
arXiv Detail & Related papers (2022-04-12T07:13:33Z) - Multi-Stream Attention Learning for Monocular Vehicle Velocity and
Inter-Vehicle Distance Estimation [25.103483428654375]
Vehicle velocity and inter-vehicle distance estimation are essential for ADAS (Advanced driver-assistance systems) and autonomous vehicles.
Recent studies focus on using a low-cost monocular camera to perceive the environment around the vehicle in a data-driven fashion.
MSANet is proposed to extract different aspects of features, e.g., spatial and contextual features, for joint vehicle velocity and inter-vehicle distance estimation.
arXiv Detail & Related papers (2021-10-22T06:14:12Z) - Efficient and Robust LiDAR-Based End-to-End Navigation [132.52661670308606]
We present an efficient and robust LiDAR-based end-to-end navigation framework.
We propose Fast-LiDARNet that is based on sparse convolution kernel optimization and hardware-aware model design.
We then propose Hybrid Evidential Fusion that directly estimates the uncertainty of the prediction from only a single forward pass.
arXiv Detail & Related papers (2021-05-20T17:52:37Z) - LIBRE: The Multiple 3D LiDAR Dataset [54.25307983677663]
We present LIBRE: LiDAR Benchmarking and Reference, a first-of-its-kind dataset featuring 10 different LiDAR sensors.
LIBRE will contribute to the research community to provide a means for a fair comparison of currently available LiDARs.
It will also facilitate the improvement of existing self-driving vehicles and robotics-related software.
arXiv Detail & Related papers (2020-03-13T06:17:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.