InfraDet3D: Multi-Modal 3D Object Detection based on Roadside
Infrastructure Camera and LiDAR Sensors
- URL: http://arxiv.org/abs/2305.00314v1
- Date: Sat, 29 Apr 2023 17:59:55 GMT
- Title: InfraDet3D: Multi-Modal 3D Object Detection based on Roadside
Infrastructure Camera and LiDAR Sensors
- Authors: Walter Zimmer, Joseph Birkner, Marcel Brucker, Huu Tung Nguyen, Stefan
Petrovski, Bohan Wang, Alois C. Knoll
- Abstract summary: We introduce InfraDet3D, a multi-modal 3D object detector for roadside infrastructure sensors.
We fuse two LiDARs using early fusion and further incorporate detections from monocular cameras to increase robustness and to detect small objects.
The perception framework is deployed on a real-world intersection that is part of the A9 Test Stretch in Munich, Germany.
- Score: 23.058209168505247
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current multi-modal object detection approaches focus on the vehicle
domain and are limited in perception range and processing capability.
Roadside sensor units (RSUs) introduce a new domain for perception systems and
leverage altitude to observe traffic. Cameras and LiDARs mounted on gantry
bridges increase the perception range and produce a full digital twin of the
traffic. In this work, we introduce InfraDet3D, a multi-modal 3D object
detector for roadside infrastructure sensors. We fuse two LiDARs using early
fusion and further incorporate detections from monocular cameras to increase
robustness and to detect small objects. Our monocular 3D detection module
uses HD maps to ground object yaw hypotheses, improving the final perception
results. The perception framework is deployed on a real-world intersection that
is part of the A9 Test Stretch in Munich, Germany. We perform several ablation
studies and experiments and show that fusing two LiDARs with two cameras leads
to an improvement of +1.90 mAP compared to a camera-only solution. We evaluate
our results on the A9 infrastructure dataset and achieve 68.48 mAP on the test
set. The dataset and code will be available at https://a9-dataset.com to allow
the research community to further improve the perception results and make
autonomous driving safer.
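The code itself is not reproduced in this listing, but the early-fusion step
described above amounts to registering both LiDAR scans in a common frame and
concatenating them before a single detector runs. A minimal sketch, assuming
known 4x4 extrinsics per sensor (function names are illustrative, not the
authors' API):

```python
import numpy as np

def transform_points(points: np.ndarray, T: np.ndarray) -> np.ndarray:
    """Apply a 4x4 rigid transform T to an (N, 3) array of XYZ points."""
    homo = np.hstack([points, np.ones((points.shape[0], 1))])  # (N, 4)
    return (homo @ T.T)[:, :3]

def early_fuse_lidars(cloud_a: np.ndarray, cloud_b: np.ndarray,
                      T_a: np.ndarray, T_b: np.ndarray) -> np.ndarray:
    # Register both scans into one shared roadside frame, then stack
    # them so the downstream 3D detector sees a single denser cloud.
    return np.vstack([transform_points(cloud_a, T_a),
                      transform_points(cloud_b, T_b)])
```

The monocular camera detections would then be merged at the box level, after
each branch has produced its own 3D hypotheses.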
Related papers
- RCBEVDet++: Toward High-accuracy Radar-Camera Fusion 3D Perception Network [34.45694077040797]
We present a radar-camera fusion 3D object detection framework called RCBEVDet++.
RadarBEVNet encodes sparse radar points into a dense bird's-eye-view feature.
Our method achieves state-of-the-art radar-camera fusion results in 3D object detection, BEV semantic segmentation, and 3D multi-object tracking tasks.
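As a rough intuition for the rasterization step (not RadarBEVNet itself, whose
BEV features are learned), sparse radar returns can be scattered onto a dense
grid roughly like this; ranges, resolution, and feature channels below are
assumptions:

```python
import numpy as np

def radar_to_bev(points_xy: np.ndarray, feats: np.ndarray,
                 x_range=(-51.2, 51.2), y_range=(-51.2, 51.2),
                 resolution=0.8) -> np.ndarray:
    """Scatter N radar returns (XY positions plus C per-point features,
    e.g. RCS and radial velocity) onto a dense (C, H, W) BEV grid by
    summing the features that land in the same cell."""
    nx = int((x_range[1] - x_range[0]) / resolution)
    ny = int((y_range[1] - y_range[0]) / resolution)
    ix = ((points_xy[:, 0] - x_range[0]) / resolution).astype(int)
    iy = ((points_xy[:, 1] - y_range[0]) / resolution).astype(int)
    keep = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    bev = np.zeros((feats.shape[1], ny, nx), dtype=np.float32)
    for c in range(feats.shape[1]):
        np.add.at(bev[c], (iy[keep], ix[keep]), feats[keep, c])
    return bev
```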
arXiv Detail & Related papers (2024-09-08T05:14:27Z)
- Sparse Points to Dense Clouds: Enhancing 3D Detection with Limited LiDAR Data [68.18735997052265]
We propose a balanced approach that combines the advantages of monocular and point cloud-based 3D detection.
Our method requires only a small number of 3D points, which can be obtained from a low-cost, low-resolution sensor.
The accuracy of 3D detection improves by 20% compared to the state-of-the-art monocular detection methods.
arXiv Detail & Related papers (2024-04-10T03:54:53Z)
- Joint object detection and re-identification for 3D obstacle multi-camera systems [47.87501281561605]
This research paper introduces a novel modification to an object detection network that uses camera and lidar information.
It incorporates an additional branch designed for the task of re-identifying objects across adjacent cameras within the same vehicle.
The results underscore the superiority of this method over traditional Non-Maximum Suppression (NMS) techniques.
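The paper's re-identification branch is learned end to end; purely as a
hypothetical illustration of replacing NMS with appearance matching across
adjacent cameras (embeddings, threshold, and greedy scheme are assumptions):

```python
import numpy as np

def cross_camera_matches(emb_a: np.ndarray, emb_b: np.ndarray,
                         sim_thresh: float = 0.8):
    """Greedily pair detections from two adjacent cameras whose
    L2-normalized embeddings have cosine similarity above a threshold,
    treating matched pairs as one physical object instead of letting
    NMS arbitrate the overlapping boxes."""
    if len(emb_a) == 0 or len(emb_b) == 0:
        return []
    sims = emb_a @ emb_b.T  # (Na, Nb) cosine similarity matrix
    pairs, used = [], set()
    for i in np.argsort(-sims.max(axis=1)):  # most confident rows first
        j = int(sims[i].argmax())
        if sims[i, j] >= sim_thresh and j not in used:
            pairs.append((int(i), j))
            used.add(j)
    return pairs
```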
arXiv Detail & Related papers (2023-10-09T15:16:35Z)
- Multi-Modal 3D Object Detection by Box Matching [109.43430123791684]
We propose a novel Fusion network by Box Matching (FBMNet) for multi-modal 3D detection.
With the learned assignments between 3D and 2D object proposals, the fusion for detection can be effectively performed by combining their ROI features.
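FBMNet learns these assignments end to end; a purely geometric stand-in that
conveys the idea is to project each 3D proposal into the image and match it to
the 2D box it overlaps most (a simplification, not FBMNet's learned matching):

```python
import numpy as np

def iou_2d(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def match_proposals(projected_3d, boxes_2d, iou_thresh=0.3):
    """Match each projected 3D proposal to its best-overlapping 2D box;
    the matched pairs' ROI features would then be fused for detection."""
    pairs = []
    for i, p in enumerate(projected_3d):
        ious = [iou_2d(p, b) for b in boxes_2d]
        if ious and max(ious) >= iou_thresh:
            pairs.append((i, int(np.argmax(ious))))
    return pairs
```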
arXiv Detail & Related papers (2023-05-12T18:08:51Z)
- Collaboration Helps Camera Overtake LiDAR in 3D Detection [49.58433319402405]
Camera-only 3D detection provides a simple solution for localizing objects in 3D space compared to LiDAR-based detection systems.
Our proposed collaborative camera-only 3D detection (CoCa3D) enables agents to share complementary information with each other through communication.
Results show that CoCa3D improves the previous SOTA performance by 44.21% on DAIR-V2X, 30.60% on OPV2V+, and 12.59% on CoPerception-UAVs+ for AP@70.
arXiv Detail & Related papers (2023-03-23T03:50:41Z)
- CramNet: Camera-Radar Fusion with Ray-Constrained Cross-Attention for Robust 3D Object Detection [12.557361522985898]
We propose a camera-radar matching network CramNet to fuse the sensor readings from camera and radar in a joint 3D space.
Our method supports training with sensor modality dropout, which leads to robust 3D object detection, even when a camera or radar sensor suddenly malfunctions on a vehicle.
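Modality dropout is simple to emulate in training code: with some probability,
zero one modality's features so the fusion head cannot over-rely on either
sensor. A sketch assuming NCHW feature maps (probabilities and shapes are
assumptions, not CramNet's exact recipe):

```python
import torch

def modality_dropout(cam_feat: torch.Tensor, radar_feat: torch.Tensor,
                     p_drop: float = 0.2, training: bool = True):
    """Per-sample sensor dropout: with probability p_drop zero the camera
    features, with probability p_drop zero the radar features (never
    both), so the detector stays usable when one sensor fails at test
    time."""
    if not training:
        return cam_feat, radar_feat
    b = cam_feat.shape[0]
    r = torch.rand(b, device=cam_feat.device)
    # Broadcastable (B, 1, 1, 1) masks over NCHW feature maps.
    cam_mask = (r >= p_drop).float().view(b, 1, 1, 1)
    radar_mask = ((r < p_drop) | (r >= 2 * p_drop)).float().view(b, 1, 1, 1)
    return cam_feat * cam_mask, radar_feat * radar_mask
```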
arXiv Detail & Related papers (2022-10-17T17:18:47Z)
- Real-Time And Robust 3D Object Detection with Roadside LiDARs [20.10416681832639]
We design a 3D object detection model that detects traffic participants in roadside LiDAR point clouds in real time.
Our model uses an existing 3D detector as a baseline and improves its accuracy.
We make a significant contribution with our LiDAR-based 3D detector that can be used for smart city applications.
arXiv Detail & Related papers (2022-07-11T21:33:42Z)
- Benchmarking the Robustness of LiDAR-Camera Fusion for 3D Object Detection [58.81316192862618]
Two critical sensors for 3D perception in autonomous driving are the camera and the LiDAR.
Fusing these two modalities can significantly boost the performance of 3D perception models.
We benchmark the state-of-the-art fusion methods for the first time.
arXiv Detail & Related papers (2022-05-30T09:35:37Z)
- Rope3D: The Roadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task [48.555440807415664]
We present Rope3D, the first high-diversity, challenging roadside-perception 3D dataset, captured from a novel viewpoint.
The dataset consists of 50k images and over 1.5M 3D objects in various scenes.
We propose to leverage geometry constraints to resolve the inherent ambiguities caused by varying sensors and viewpoints.
arXiv Detail & Related papers (2022-03-25T12:13:23Z)
- Multimodal Virtual Point 3D Detection [6.61319085872973]
Lidar-based sensing drives current autonomous vehicles.
Current Lidar sensors lag two decades behind traditional color cameras in terms of resolution and cost.
We present an approach to seamlessly fuse RGB sensors into Lidar-based 3D recognition.
arXiv Detail & Related papers (2021-11-12T18:58:01Z)
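The summary of Multimodal Virtual Point 3D Detection does not spell out the
mechanism; the virtual-point idea in this line of work can be sketched as
sampling pixels inside a 2D detection, borrowing depth from the nearest
projected LiDAR return, and unprojecting those pixels into 3D. The intrinsics
K and the nearest-neighbor depth transfer below are illustrative assumptions,
not the paper's exact method:

```python
import numpy as np

def unproject(pixels: np.ndarray, depths: np.ndarray, K: np.ndarray) -> np.ndarray:
    """Lift (N, 2) pixel coordinates with per-pixel depth into 3D camera
    coordinates using pinhole intrinsics K."""
    x = (pixels[:, 0] - K[0, 2]) / K[0, 0] * depths
    y = (pixels[:, 1] - K[1, 2]) / K[1, 1] * depths
    return np.stack([x, y, depths], axis=1)

def virtual_points(box_2d, lidar_uv, lidar_depth, K, n_samples=50, rng=None):
    """Sample pixels inside a 2D detection box, copy the depth of the
    nearest projected LiDAR return, and unproject them into dense
    'virtual' 3D points that augment the sparse LiDAR cloud."""
    if rng is None:
        rng = np.random.default_rng(0)
    x1, y1, x2, y2 = box_2d
    samples = np.stack([rng.uniform(x1, x2, n_samples),
                        rng.uniform(y1, y2, n_samples)], axis=1)
    # The nearest projected LiDAR point supplies each sample's depth.
    d2 = ((samples[:, None, :] - lidar_uv[None, :, :]) ** 2).sum(-1)
    depths = lidar_depth[np.argmin(d2, axis=1)]
    return unproject(samples, depths, K)
```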