Keypoints-Based Deep Feature Fusion for Cooperative Vehicle Detection of
Autonomous Driving
- URL: http://arxiv.org/abs/2109.11615v1
- Date: Thu, 23 Sep 2021 19:41:02 GMT
- Title: Keypoints-Based Deep Feature Fusion for Cooperative Vehicle Detection of
Autonomous Driving
- Authors: Yunshuang Yuan, Hao Cheng, Monika Sester
- Abstract summary: We propose an efficient keypoints-based deep feature fusion framework, called FPV-RCNN, for collective perception.
Compared to bird's-eye view (BEV) keypoints feature fusion, FPV-RCNN improves detection accuracy by about 14%.
Our method also significantly decreases the CPM size to less than 0.3 KB, about 50 times smaller than the BEV feature map sharing used in previous works.
- Score: 2.6543018470131283
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Sharing collective perception messages (CPM) between vehicles has been
investigated as a way to reduce occlusions and thus improve the perception
accuracy and safety of autonomous driving. However, achieving highly accurate
data sharing with low communication overhead is a major challenge for
collective perception, especially when real-time communication is required
among connected and automated vehicles. In
this paper, we propose an efficient and effective keypoints-based deep feature
fusion framework, called FPV-RCNN, for collective perception, which is built on
top of the 3D object detector PV-RCNN. We introduce a bounding box proposal
matching module and a keypoints selection strategy to compress the CPM size and
solve the multi-vehicle data fusion problem. Compared to a bird's-eye view
(BEV) keypoints feature fusion, FPV-RCNN achieves improved detection accuracy
by about 14% at a high evaluation criterion (IoU 0.7) on a synthetic dataset
COMAP dedicated to collective perception. Also, its performance is comparable
to two raw data fusion baselines that have no data loss in sharing. Moreover,
our method significantly decreases the CPM size to less than 0.3 KB, which is
about 50 times smaller than the BEV feature map sharing used in previous
works. Even when the number of CPM feature channels is further reduced from
128 to 32, the detection performance drops by only about 1%. The code of our
method is available at https://github.com/YuanYunshuang/FPV_RCNN.
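As a back-of-the-envelope illustration of why a keypoints-based CPM stays small, the numpy sketch below selects the top-scoring keypoints and counts the payload; the function names, the plain top-k rule, and the byte accounting are invented for illustration and are not the released FPV-RCNN code.

    import numpy as np

    def select_keypoints(features, scores, k=4):
        # Keep the k highest-scoring keypoints; a stand-in for the
        # paper's proposal-guided keypoint selection strategy.
        order = np.argsort(scores)[::-1][:k]
        return features[order]

    def cpm_payload_bytes(features, bytes_per_value=2):
        # Rough CPM payload: float16 feature values only, ignoring
        # message headers and keypoint coordinates.
        return features.size * bytes_per_value

    rng = np.random.default_rng(0)
    feats = rng.standard_normal((100, 32)).astype(np.float16)  # 100 candidates, 32 channels
    scores = rng.random(100)

    kept = select_keypoints(feats, scores, k=4)
    print(cpm_payload_bytes(kept))  # 4 * 32 * 2 = 256 bytes, under 0.3 KB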
Related papers
- Channel-Aware Throughput Maximization for Cooperative Data Fusion in CAV [17.703608985129026]
Connected and autonomous vehicles (CAVs) have garnered significant attention due to their extended perception range and enhanced sensing coverage.
To address challenges such as blind spots and obstructions, CAVs employ vehicle-to-vehicle communications to aggregate data from surrounding vehicles.
We propose a channel-aware throughput approach to facilitate CAV data fusion, leveraging a self-supervised autoencoder for adaptive data compression.
arXiv Detail & Related papers (2024-10-06T00:43:46Z)
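A self-supervised autoencoder for feature compression fits in a few lines of PyTorch. The sketch below is generic and assumes a fixed 128-d feature with a 16-d bottleneck; it is not the paper's architecture, and a channel-aware variant would pick the bottleneck width from the measured link quality.

    import torch
    import torch.nn as nn

    class FeatureAutoencoder(nn.Module):
        # Self-supervised autoencoder for compressing shared CAV features;
        # layer sizes here are illustrative, not the paper's architecture.
        def __init__(self, dim=128, bottleneck=16):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, bottleneck))
            self.decoder = nn.Sequential(
                nn.Linear(bottleneck, 64), nn.ReLU(), nn.Linear(64, dim))

        def forward(self, x):
            z = self.encoder(x)          # compact code sent over the V2V link
            return self.decoder(z), z

    model = FeatureAutoencoder()
    x = torch.randn(32, 128)
    recon, code = model(x)
    loss = nn.functional.mse_loss(recon, x)  # reconstruction objective
    loss.backward()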
- PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection [59.355022416218624]
The integration of point and voxel representations is becoming more common in LiDAR-based 3D object detection.
We propose a novel two-stage 3D object detector, called the Point-Voxel Attention Fusion Network (PVAFN).
PVAFN uses a multi-pooling strategy to integrate both multi-scale and region-specific information effectively.
arXiv Detail & Related papers (2024-08-26T19:43:01Z)
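A multi-pooling strategy can be shown generically: pool the same set of point features in more than one way and concatenate the results. The torch sketch below (max plus average pooling over one proposal's points) is an invented simplification, not PVAFN's exact design.

    import torch

    def multi_pool(point_feats):
        # Concatenate max- and average-pooled summaries of the points in a
        # region; a generic multi-pooling sketch, not PVAFN's actual module.
        return torch.cat([point_feats.max(dim=0).values,
                          point_feats.mean(dim=0)], dim=-1)

    roi_points = torch.randn(50, 64)     # 50 points with 64-d features in one proposal
    print(multi_pool(roi_points).shape)  # torch.Size([128])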
- V2X-PC: Vehicle-to-everything Collaborative Perception via Point Cluster [58.79477191603844]
We introduce a new message unit, namely the point cluster, to represent the scene sparsely with a combination of low-level structural information and high-level semantic information.
The framework includes a Point Cluster Packing (PCP) module to preserve object features and manage bandwidth.
Experiments on two widely recognized collaborative perception benchmarks showcase the superior performance of our method compared to the previous state-of-the-art approaches.
arXiv Detail & Related papers (2024-03-25T11:24:02Z)
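To make the bandwidth angle concrete, the numpy sketch below packs the highest-scoring point clusters (centroid plus a small semantic vector) into a fixed byte budget; the greedy rule and the per-cluster byte accounting are invented for illustration and do not reproduce V2X-PC's PCP module.

    import numpy as np

    def pack_clusters(centroids, feats, scores, budget_bytes, cluster_bytes):
        # Greedily keep the highest-scoring point clusters that fit the
        # bandwidth budget; an invented heuristic, not V2X-PC's PCP module.
        keep, used = [], 0
        for i in np.argsort(scores)[::-1]:
            if used + cluster_bytes > budget_bytes:
                break
            keep.append(i)
            used += cluster_bytes
        return centroids[keep], feats[keep]

    rng = np.random.default_rng(1)
    centroids = rng.random((20, 3)).astype(np.float32)   # cluster centers
    feats = rng.random((20, 16)).astype(np.float16)      # per-cluster semantics
    scores = rng.random(20)
    # 3 coords * 4 B + 16 channels * 2 B = 44 B per cluster
    c, f = pack_clusters(centroids, feats, scores, budget_bytes=440, cluster_bytes=44)
    print(len(c))  # 10 clusters fit the budget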
- SmartCooper: Vehicular Collaborative Perception with Adaptive Fusion and Judger Mechanism [23.824400533836535]
We introduce SmartCooper, an adaptive collaborative perception framework that incorporates communication optimization and a judger mechanism.
Our results demonstrate a substantial 23.10% reduction in communication costs compared to the non-judger scheme.
arXiv Detail & Related papers (2024-02-01T04:15:39Z)
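A judger mechanism can be approximated by a simple quality gate: fuse only those collaborator features that clear a threshold. The numpy toy below is a stand-in for SmartCooper's learned judger, with all names and the mean-fusion rule invented for illustration.

    import numpy as np

    def judger_fuse(ego_feat, peer_feats, quality, threshold=0.5):
        # Fuse only collaborator features whose quality score clears the
        # threshold; a toy gate, not SmartCooper's learned judger.
        accepted = [f for f, q in zip(peer_feats, quality) if q >= threshold]
        return np.mean([ego_feat, *accepted], axis=0) if accepted else ego_feat

    ego = np.ones(8)
    peers = [np.zeros(8), np.full(8, 2.0)]
    print(judger_fuse(ego, peers, quality=[0.9, 0.2]))  # only the first peer is kept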
- V2X-AHD: Vehicle-to-Everything Cooperation Perception via Asymmetric Heterogenous Distillation Network [13.248981195106069]
We propose a multi-view vehicle-road cooperation perception system, vehicle-to-everything cooperative perception (V2X-AHD).
V2X-AHD effectively improves the accuracy of 3D object detection while reducing the number of network parameters.
arXiv Detail & Related papers (2023-10-10T13:12:03Z)
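Distillation-based cooperation generally means a student network mimicking a teacher's features. The PyTorch sketch below shows that generic pattern with illustrative shapes; it is not V2X-AHD's asymmetric heterogeneous design.

    import torch
    import torch.nn as nn

    # Generic feature distillation: a lightweight student mimics a frozen
    # teacher's intermediate features (shapes are illustrative, not V2X-AHD's).
    teacher = nn.Linear(32, 64)
    student = nn.Linear(32, 64)

    x = torch.randn(16, 32)
    with torch.no_grad():
        t_feat = teacher(x)                        # teacher branch, no gradients
    s_feat = student(x)
    loss = nn.functional.mse_loss(s_feat, t_feat)  # distillation objective
    loss.backward()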
- CoBEVFusion: Cooperative Perception with LiDAR-Camera Bird's-Eye View Fusion [0.0]
Recent approaches to cooperative perception share only single-sensor information, such as camera or LiDAR data.
We present a framework, called CoBEVFusion, that fuses LiDAR and camera data to create a Bird's-Eye View (BEV) representation.
Our framework was evaluated on the cooperative perception dataset OPV2V for two perception tasks: BEV semantic segmentation and 3D object detection.
arXiv Detail & Related papers (2023-10-09T17:52:26Z)
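Fusing two BEV representations can be sketched minimally: concatenate the camera and LiDAR BEV feature maps along the channel axis and mix them with a 1x1 convolution. The torch snippet below uses invented shapes and is not CoBEVFusion's actual fusion module.

    import torch
    import torch.nn as nn

    # Minimal BEV fusion: concatenate camera and LiDAR BEV feature maps and
    # mix them with a 1x1 convolution; not CoBEVFusion's actual fuser.
    cam_bev = torch.randn(1, 64, 200, 200)
    lidar_bev = torch.randn(1, 64, 200, 200)

    fuse = nn.Conv2d(128, 64, kernel_size=1)
    fused = fuse(torch.cat([cam_bev, lidar_bev], dim=1))
    print(fused.shape)  # torch.Size([1, 64, 200, 200])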
- Collaboration Helps Camera Overtake LiDAR in 3D Detection [49.58433319402405]
Camera-only 3D detection provides a simple solution for localizing objects in 3D space compared to LiDAR-based detection systems.
Our proposed collaborative camera-only 3D detection (CoCa3D) enables agents to share complementary information with each other through communication.
Results show that CoCa3D improves previous SOTA performance by 44.21% on DAIR-V2X, 30.60% on OPV2V+, and 12.59% on CoPerception-UAVs+ for AP@70.
arXiv Detail & Related papers (2023-03-23T03:50:41Z)
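As a loose stand-in for sharing complementary information (CoCa3D fuses intermediate estimates, not final maps), the numpy toy below merges two agents' BEV confidence maps by element-wise max, so an object occluded for one agent survives fusion.

    import numpy as np

    # Loose late-fusion stand-in for sharing complementary views: combine
    # per-agent BEV confidence maps by element-wise max, so an object
    # occluded for one agent is recovered from another agent's view.
    agent_a = np.zeros((4, 4)); agent_a[1, 1] = 0.9   # visible only to agent A
    agent_b = np.zeros((4, 4)); agent_b[2, 3] = 0.8   # visible only to agent B
    fused = np.maximum(agent_a, agent_b)
    print(fused[1, 1], fused[2, 3])  # 0.9 0.8: both objects are retained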
- OPV2V: An Open Benchmark Dataset and Fusion Pipeline for Perception with Vehicle-to-Vehicle Communication [13.633468133727]
We present the first large-scale open simulated dataset for Vehicle-to-Vehicle perception.
It contains over 70 interesting scenes, 11,464 frames, and 232,913 annotated 3D vehicle bounding boxes.
arXiv Detail & Related papers (2021-09-16T00:52:41Z)
- Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data Augmentation [77.60050239225086]
We propose an effective training data generation process by fitting a 3D car model with dynamic parts to vehicles in real images.
Our approach is fully automatic without any human interaction.
We present a multi-task network for VUS parsing and a multi-stream network for VHI parsing.
arXiv Detail & Related papers (2020-12-15T03:03:38Z)
- DecAug: Augmenting HOI Detection via Decomposition [54.65572599920679]
Current algorithms suffer from insufficient training samples and category imbalance within datasets.
We propose an efficient and effective data augmentation method called DecAug for HOI detection.
Experiments show that our method brings up to 3.3 mAP and 1.6 mAP improvements on the V-COCO and HICO-DET datasets, respectively.
arXiv Detail & Related papers (2020-10-02T13:59:05Z)
- Reinforcement Learning Based Vehicle-cell Association Algorithm for Highly Mobile Millimeter Wave Communication [53.47785498477648]
This paper investigates the problem of vehicle-cell association in millimeter wave (mmWave) communication networks.
We first formulate the vehicular user (VU) association problem as a discrete non-convex optimization problem.
The proposed solution achieves up to 15% gains in terms of sum rate and a 20% reduction in VUE outage compared to several baseline designs.
arXiv Detail & Related papers (2020-01-22T08:51:05Z)
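The learn-by-reward idea behind such an association algorithm can be shown with a stateless Q-learning (bandit) toy in numpy; the paper's actual formulation is a full reinforcement-learning problem, and every number below is invented.

    import numpy as np

    # Didactic stateless Q-learning (a bandit) for associating one vehicular
    # user with one of three mmWave cells; illustrates only the
    # learn-by-reward idea, not the paper's algorithm.
    rng = np.random.default_rng(0)
    true_rate = np.array([1.0, 3.0, 2.0])  # unknown mean rate of each cell
    q = np.zeros(3)
    alpha, eps = 0.1, 0.2

    for _ in range(2000):
        a = rng.integers(3) if rng.random() < eps else int(np.argmax(q))
        reward = true_rate[a] + rng.normal(0.0, 0.5)  # noisy observed rate
        q[a] += alpha * (reward - q[a])               # incremental Q-update

    print(int(np.argmax(q)))  # converges to cell 1, the highest-rate cell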
This list is automatically generated from the titles and abstracts of the papers on this site.