CRN: Camera Radar Net for Accurate, Robust, Efficient 3D Perception
- URL: http://arxiv.org/abs/2304.00670v3
- Date: Sat, 23 Dec 2023 17:10:00 GMT
- Title: CRN: Camera Radar Net for Accurate, Robust, Efficient 3D Perception
- Authors: Youngseok Kim, Juyeb Shin, Sanmin Kim, In-Jae Lee, Jun Won Choi, Dongsuk Kum
- Abstract summary: We propose Camera Radar Net (CRN), a novel camera-radar fusion framework.
CRN generates a semantically rich and spatially accurate bird's-eye-view (BEV) feature map for various tasks.
CRN with real-time setting operates at 20 FPS while achieving comparable performance to LiDAR detectors on nuScenes.
- Score: 20.824179713013734
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autonomous driving requires an accurate and fast 3D perception system that
includes 3D object detection, tracking, and segmentation. Although recent
low-cost camera-based approaches have shown promising results, they are
susceptible to poor illumination or bad weather conditions and have a large
localization error. Hence, fusing camera with low-cost radar, which provides
precise long-range measurement and operates reliably in all environments, is
promising but has not yet been thoroughly investigated. In this paper, we
propose Camera Radar Net (CRN), a novel camera-radar fusion framework that
generates a semantically rich and spatially accurate bird's-eye-view (BEV)
feature map for various tasks. To overcome the lack of spatial information in
an image, we transform perspective view image features to BEV with the help of
sparse but accurate radar points. We further aggregate image and radar feature
maps in BEV using multi-modal deformable attention designed to tackle the
spatial misalignment between inputs. CRN with real-time setting operates at 20
FPS while achieving comparable performance to LiDAR detectors on nuScenes, and
even outperforms at a far distance on 100m setting. Moreover, CRN with offline
setting yields 62.4% NDS, 57.5% mAP on nuScenes test set and ranks first among
all camera and camera-radar 3D object detectors.
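The abstract's idea of transforming perspective-view image features to BEV "with the help of sparse but accurate radar points" can be illustrated with a toy, single-ray sketch. This is not the paper's implementation: the function names `fuse_depth_with_radar` and `lift_to_ray`, the `radar_weight` parameter, and the multiplicative reweighting rule are all assumptions chosen only to show how a radar hit might sharpen a camera's uncertain depth estimate before features are lifted along the ray.

```python
def fuse_depth_with_radar(depth_probs, radar_hits, radar_weight=4.0):
    """Reweight a camera depth distribution using radar hits on the same ray.

    depth_probs: per-depth-bin probabilities predicted from the image.
    radar_hits:  0/1 flags, 1 where a radar point falls into that depth bin.
    radar_weight: hypothetical boost factor (an assumption, not from the paper).
    """
    if len(depth_probs) != len(radar_hits):
        raise ValueError("depth_probs and radar_hits must have the same length")
    # Boost bins that contain a radar return, leave the others unchanged.
    boosted = [p * (1.0 + radar_weight * h) for p, h in zip(depth_probs, radar_hits)]
    total = sum(boosted)
    if total <= 0.0:
        raise ValueError("depth_probs must contain positive mass")
    # Renormalize so the weights form a distribution again.
    return [b / total for b in boosted]


def lift_to_ray(feature, depth_weights):
    """Place one pixel's feature vector into every depth bin, scaled by its weight."""
    return [[w * f for f in feature] for w in depth_weights]


# The camera alone cannot decide between bins 1 and 2; radar reports a hit in bin 2.
depth_probs = [0.1, 0.4, 0.4, 0.1]
radar_hits = [0, 0, 1, 0]
weights = fuse_depth_with_radar(depth_probs, radar_hits)
ray_features = lift_to_ray([1.0, 2.0], weights)
```

In this toy case the radar hit resolves the tie between the two equally likely depth bins, so most of the pixel's feature mass lands in the bin the radar confirms. A full pipeline would repeat this for every pixel and pool the lifted features into a BEV grid.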
Related papers
- A Resource Efficient Fusion Network for Object Detection in Bird's-Eye View using Camera and Raw Radar Data [7.2508100569856975]
We use the raw range-Doppler spectrum of radar data to process camera images.
We extract the corresponding features with our camera encoder-decoder architecture.
The resultant feature maps are fused with Range-Azimuth features, recovered from the RD spectrum input to perform object detection.
arXiv Detail & Related papers (2024-11-20T13:26:13Z) - RCBEVDet++: Toward High-accuracy Radar-Camera Fusion 3D Perception Network [34.45694077040797]
We present a radar-camera fusion 3D object detection framework called RCBEVDet.
RadarBEVNet encodes sparse radar points into a dense bird's-eye-view feature.
Our method achieves state-of-the-art radar-camera fusion results in 3D object detection, BEV semantic segmentation, and 3D multi-object tracking tasks.
arXiv Detail & Related papers (2024-09-08T05:14:27Z) - Radar Fields: Frequency-Space Neural Scene Representations for FMCW Radar [62.51065633674272]
We introduce Radar Fields - a neural scene reconstruction method designed for active radar imagers.
Our approach unites an explicit, physics-informed sensor model with an implicit neural geometry and reflectance model to directly synthesize raw radar measurements.
We validate the effectiveness of the method across diverse outdoor scenarios, including urban scenes with dense vehicles and infrastructure.
arXiv Detail & Related papers (2024-05-07T20:44:48Z) - Sparse Points to Dense Clouds: Enhancing 3D Detection with Limited LiDAR Data [68.18735997052265]
We propose a balanced approach that combines the advantages of monocular and point cloud-based 3D detection.
Our method requires only a small number of 3D points that can be obtained from a low-cost, low-resolution sensor.
The accuracy of 3D detection improves by 20% compared to the state-of-the-art monocular detection methods.
arXiv Detail & Related papers (2024-04-10T03:54:53Z) - Better Monocular 3D Detectors with LiDAR from the Past [64.6759926054061]
Camera-based 3D detectors often suffer inferior performance compared to LiDAR-based counterparts due to inherent depth ambiguities in images.
In this work, we seek to improve monocular 3D detectors by leveraging unlabeled historical LiDAR data.
We show consistent and significant performance gain across multiple state-of-the-art models and datasets with a negligible additional latency of 9.66 ms and a small storage cost.
arXiv Detail & Related papers (2024-04-08T01:38:43Z) - Vision meets mmWave Radar: 3D Object Perception Benchmark for Autonomous
Driving [30.456314610767667]
We introduce the CRUW3D dataset, including 66K synchronized and well-calibrated camera, radar, and LiDAR frames.
This kind of format can enable machine learning models to produce more reliable perception results after fusing the information or features between the camera and radar.
arXiv Detail & Related papers (2023-11-17T01:07:37Z) - Echoes Beyond Points: Unleashing the Power of Raw Radar Data in
Multi-modality Fusion [74.84019379368807]
We propose a novel method named EchoFusion to skip the existing radar signal processing pipeline.
Specifically, we first generate the Bird's Eye View (BEV) queries and then take corresponding spectrum features from radar to fuse with other sensors.
arXiv Detail & Related papers (2023-07-31T09:53:50Z) - RC-BEVFusion: A Plug-In Module for Radar-Camera Bird's Eye View Feature
Fusion [11.646949644683755]
We present RC-BEVFusion, a modular radar-camera fusion network on the BEV plane.
We show significant performance gains of up to 28% increase in the nuScenes detection score.
arXiv Detail & Related papers (2023-05-25T09:26:04Z) - CramNet: Camera-Radar Fusion with Ray-Constrained Cross-Attention for
Robust 3D Object Detection [12.557361522985898]
We propose a camera-radar matching network CramNet to fuse the sensor readings from camera and radar in a joint 3D space.
Our method supports training with sensor modality dropout, which leads to robust 3D object detection, even when a camera or radar sensor suddenly malfunctions on a vehicle.
arXiv Detail & Related papers (2022-10-17T17:18:47Z) - Fully Convolutional One-Stage 3D Object Detection on LiDAR Range Images [96.66271207089096]
FCOS-LiDAR is a fully convolutional one-stage 3D object detector for LiDAR point clouds of autonomous driving scenes.
We show that an RV-based 3D detector with standard 2D convolutions alone can achieve comparable performance to state-of-the-art BEV-based detectors.
arXiv Detail & Related papers (2022-05-27T05:42:16Z) - LiRaNet: End-to-End Trajectory Prediction using Spatio-Temporal Radar Fusion [52.59664614744447]
We present LiRaNet, a novel end-to-end trajectory prediction method which utilizes radar sensor information along with widely used lidar and high definition (HD) maps.
Automotive radar provides rich, complementary information, allowing for longer range vehicle detection as well as instantaneous velocity measurements.
arXiv Detail & Related papers (2020-10-02T00:13:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.