CramNet: Camera-Radar Fusion with Ray-Constrained Cross-Attention for
Robust 3D Object Detection
- URL: http://arxiv.org/abs/2210.09267v2
- Date: Tue, 18 Oct 2022 01:46:28 GMT
- Title: CramNet: Camera-Radar Fusion with Ray-Constrained Cross-Attention for
Robust 3D Object Detection
- Authors: Jyh-Jing Hwang and Henrik Kretzschmar and Joshua Manela and Sean
Rafferty and Nicholas Armstrong-Crews and Tiffany Chen and Dragomir Anguelov
- Abstract summary: We propose a camera-radar matching network CramNet to fuse the sensor readings from camera and radar in a joint 3D space.
Our method supports training with sensor modality dropout, which leads to robust 3D object detection, even when a camera or radar sensor suddenly malfunctions on a vehicle.
- Score: 12.557361522985898
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robust 3D object detection is critical for safe autonomous driving. Camera
and radar sensors are synergistic as they capture complementary information and
work well under different environmental conditions. Fusing camera and radar
data is challenging, however, as each sensor lacks information along a
perpendicular axis: depth is unknown to the camera, and elevation is unknown
to the radar. We propose the camera-radar matching network CramNet, an
efficient approach to fuse the sensor readings from camera and radar in a joint
3D space. To leverage radar range measurements for better camera depth
predictions, we propose a novel ray-constrained cross-attention mechanism that
resolves the ambiguity in the geometric correspondences between camera features
and radar features. Our method supports training with sensor modality dropout,
which leads to robust 3D object detection, even when a camera or radar sensor
suddenly malfunctions on a vehicle. We demonstrate the effectiveness of our
fusion approach through extensive experiments on the RADIATE dataset, one of
the few large-scale datasets that provide radar radio frequency imagery. A
camera-only variant of our method achieves competitive performance in monocular
3D object detection on the Waymo Open Dataset.
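The key technical idea is the ray-constrained cross-attention: for each foreground camera pixel, the network attends only to radar features sampled along that pixel's 3D viewing ray, so that radar range measurements can resolve the camera's depth ambiguity. The sketch below illustrates what such a constrained attention step could look like; the tensor shapes, the single-head scaled dot-product formulation, and the function name are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a ray-constrained cross-attention step (PyTorch).
# Shapes, names, and the sampling scheme are illustrative assumptions,
# not CramNet's actual implementation.
import torch
import torch.nn.functional as F

def ray_constrained_cross_attention(cam_feat, radar_feats_along_ray):
    """Fuse one camera pixel's feature with radar features sampled at K
    candidate depths along that pixel's 3D viewing ray.

    cam_feat:              (B, C)    feature of a foreground camera pixel
    radar_feats_along_ray: (B, K, C) radar features at K depth samples on the ray
    Returns the fused feature (B, C) and soft depth weights (B, K).
    """
    q = cam_feat.unsqueeze(1)                                        # (B, 1, C) query
    scores = torch.matmul(q, radar_feats_along_ray.transpose(1, 2))  # (B, 1, K)
    scores = scores / cam_feat.shape[-1] ** 0.5                      # scaled dot product
    w = F.softmax(scores, dim=-1)                                    # depth distribution on the ray
    fused = torch.matmul(w, radar_feats_along_ray).squeeze(1)        # (B, C) fused feature
    return fused, w.squeeze(1)

# Usage: B=2 pixels, K=32 depth samples per ray, C=64 channels.
cam = torch.randn(2, 64)
radar = torch.randn(2, 32, 64)
fused, depth_weights = ray_constrained_cross_attention(cam, radar)
print(fused.shape, depth_weights.shape)  # torch.Size([2, 64]) torch.Size([2, 32])
```

Restricting attention to the K samples on a pixel's ray turns the dense 2D-to-3D correspondence search into a soft depth selection, which is how radar range can sharpen camera depth predictions.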
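The abstract also emphasizes training with sensor modality dropout so that detection degrades gracefully when a sensor fails at runtime. Below is a minimal sketch of such a training-time augmentation, assuming per-sample zeroing of one modality's features and an illustrative drop probability; the exact policy used by CramNet may differ.

```python
# Minimal sketch of sensor modality dropout at training time (PyTorch).
# The drop probability and the zeroing policy are assumptions for illustration.
import torch

def modality_dropout(cam_feat, radar_feat, p=0.2, training=True):
    """With probability p, zero a sample's camera features; with probability
    p, zero its radar features; never both, so the detector always sees at
    least one modality and learns to cope with a missing sensor."""
    if not training:
        return cam_feat, radar_feat
    B = cam_feat.shape[0]
    r = torch.rand(B, device=cam_feat.device)
    drop_cam = r < p                      # drop the camera on these samples
    drop_radar = (r >= p) & (r < 2 * p)   # drop the radar on these (disjoint set)
    cam_mask = (~drop_cam).float().view(B, *([1] * (cam_feat.dim() - 1)))
    radar_mask = (~drop_radar).float().view(B, *([1] * (radar_feat.dim() - 1)))
    return cam_feat * cam_mask, radar_feat * radar_mask

# Usage: per-sample dropout on (B, C, H, W) camera and radar feature maps.
cam = torch.randn(4, 64, 32, 32)
radar = torch.randn(4, 64, 32, 32)
cam_out, radar_out = modality_dropout(cam, radar)
```

Because some training samples see only one modality, the fusion head cannot rely on any single sensor being present, which is what makes the detector robust to a sudden camera or radar malfunction.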
Related papers
- RCBEVDet++: Toward High-accuracy Radar-Camera Fusion 3D Perception Network [34.45694077040797]
We present a radar-camera fusion 3D object detection framework called RCBEVDet++.
RadarBEVNet encodes sparse radar points into a dense bird's-eye-view feature.
Our method achieves state-of-the-art radar-camera fusion results in 3D object detection, BEV semantic segmentation, and 3D multi-object tracking tasks.
arXiv Detail & Related papers (2024-09-08T05:14:27Z)
- Radar Fields: Frequency-Space Neural Scene Representations for FMCW Radar [62.51065633674272]
We introduce Radar Fields - a neural scene reconstruction method designed for active radar imagers.
Our approach unites an explicit, physics-informed sensor model with an implicit neural geometry and reflectance model to directly synthesize raw radar measurements.
We validate the effectiveness of the method across diverse outdoor scenarios, including urban scenes with dense vehicles and infrastructure.
arXiv Detail & Related papers (2024-05-07T20:44:48Z)
- Sparse Points to Dense Clouds: Enhancing 3D Detection with Limited LiDAR Data [68.18735997052265]
We propose a balanced approach that combines the advantages of monocular and point cloud-based 3D detection.
Our method requires only a small number of 3D points, which can be obtained from a low-cost, low-resolution sensor.
The accuracy of 3D detection improves by 20% compared to state-of-the-art monocular detection methods.
arXiv Detail & Related papers (2024-04-10T03:54:53Z)
- RCBEVDet: Radar-camera Fusion in Bird's Eye View for 3D Object Detection [33.07575082922186]
Three-dimensional object detection is one of the key tasks in autonomous driving.
It is difficult to achieve highly accurate and robust 3D object detection by relying solely on cameras.
We present RCBEVDet, a radar-camera fusion 3D object detection method in the bird's eye view (BEV).
RadarBEVNet consists of a dual-stream radar backbone and a Radar Cross-Section (RCS) aware BEV encoder.
arXiv Detail & Related papers (2024-03-25T06:02:05Z)
- Vision meets mmWave Radar: 3D Object Perception Benchmark for Autonomous Driving [30.456314610767667]
We introduce the CRUW3D dataset, including 66K synchronized and well-calibrated camera, radar, and LiDAR frames.
This format enables machine learning models to produce more reliable perception results by fusing information or features from the camera and radar.
arXiv Detail & Related papers (2023-11-17T01:07:37Z)
- Echoes Beyond Points: Unleashing the Power of Raw Radar Data in Multi-modality Fusion [74.84019379368807]
We propose a novel method named EchoFusion to skip the existing radar signal processing pipeline.
Specifically, we first generate the Bird's Eye View (BEV) queries and then take corresponding spectrum features from radar to fuse with other sensors.
arXiv Detail & Related papers (2023-07-31T09:53:50Z)
- HVDetFusion: A Simple and Robust Camera-Radar Fusion Framework [10.931114142452895]
Current SOTA algorithms combine camera and LiDAR sensors, but are limited by the high price of LiDAR.
HVDetFusion is a multi-modal detection algorithm that supports pure camera data as input for detection, but can also fuse radar and camera data.
HVDetFusion achieves the new state-of-the-art 67.4% NDS on the challenging nuScenes test set among all camera-radar 3D object detectors.
arXiv Detail & Related papers (2023-07-21T03:08:28Z)
- Multi-Modal 3D Object Detection by Box Matching [109.43430123791684]
We propose a novel Fusion network by Box Matching (FBMNet) for multi-modal 3D detection.
With the learned assignments between 3D and 2D object proposals, the fusion for detection can be effectively performed by combining their ROI features.
arXiv Detail & Related papers (2023-05-12T18:08:51Z)
- Drone Detection and Tracking in Real-Time by Fusion of Different Sensing Modalities [66.4525391417921]
We design and evaluate a multi-sensor drone detection system.
Our solution also integrates a fish-eye camera to monitor a wider part of the sky and steer the other cameras towards objects of interest.
The thermal camera proves to be a feasible solution, performing as well as the video camera even though the camera employed here has a lower resolution.
arXiv Detail & Related papers (2022-07-05T10:00:58Z)
- RadarNet: Exploiting Radar for Robust Perception of Dynamic Objects [73.80316195652493]
We tackle the problem of exploiting Radar for perception in the context of self-driving cars.
We propose a new solution that exploits both LiDAR and Radar sensors for perception.
Our approach, dubbed RadarNet, features a voxel-based early fusion and an attention-based late fusion.
arXiv Detail & Related papers (2020-07-28T17:15:02Z)
- RODNet: Radar Object Detection Using Cross-Modal Supervision [34.33920572597379]
Radar is usually more robust than the camera in severe driving scenarios.
Unlike RGB images captured by a camera, semantic information from the radar signals is noticeably difficult to extract.
We propose a deep radar object detection network (RODNet) to effectively detect objects purely from the radar frequency data.
arXiv Detail & Related papers (2020-03-03T22:33:16Z)