ClusterFusion: Leveraging Radar Spatial Features for Radar-Camera 3D
Object Detection in Autonomous Vehicles
- URL: http://arxiv.org/abs/2309.03734v2
- Date: Sun, 5 Nov 2023 11:24:46 GMT
- Authors: Irfan Tito Kurniawan and Bambang Riyanto Trilaksono
- Abstract summary: Deep learning radar-camera 3D object detection methods may reliably produce accurate detections even in low-visibility conditions.
Recent radar-camera methods commonly perform feature-level fusion which often involves projecting the radar points onto the same plane as the image features and fusing the extracted features from both modalities.
We propose ClusterFusion, an architecture that leverages the local spatial features of the radar point cloud by clustering the points and performing feature extraction directly on the clusters.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Thanks to the complementary nature of millimeter wave radar and camera, deep
learning-based radar-camera 3D object detection methods may reliably produce
accurate detections even in low-visibility conditions. This makes them
preferable for use in autonomous vehicles' perception systems, especially as
the combined cost of both sensors is lower than that of a lidar. Recent
radar-camera methods commonly perform feature-level fusion which often involves
projecting the radar points onto the same plane as the image features and
fusing the extracted features from both modalities. While performing fusion on
the image plane is generally simpler and faster, projecting radar points onto
the image plane flattens the depth dimension of the point cloud which might
lead to information loss and makes extracting the spatial features of the point
cloud harder. We propose ClusterFusion, an architecture that leverages the
local spatial features of the radar point cloud by clustering the points and
performing feature extraction directly on the clusters before projecting the
features onto the image plane. ClusterFusion achieved state-of-the-art
performance among radar-monocular camera methods on the test slice of the
nuScenes dataset with a 48.7% nuScenes detection score (NDS).
We also investigated the performance of different radar feature extraction
strategies on point cloud clusters: a handcrafted strategy, a learning-based
strategy, and a combination of both, and found that the handcrafted strategy
yielded the best performance. The main goal of this work is to explore the use
of radar's local spatial and point-wise features by extracting them directly
from radar point cloud clusters for a radar-monocular camera 3D object
detection method that performs cross-modal feature fusion on the image plane.
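To make the pipeline concrete, here is a minimal sketch (not the authors' code) of a ClusterFusion-style flow: cluster the radar point cloud, compute handcrafted local spatial features per cluster, and project each cluster's centroid onto the image plane for fusion. The clustering radius, the feature set, and the camera intrinsics `K` are all assumptions for illustration.

```python
# Illustrative sketch of a cluster-then-project radar pipeline.
# Clustering radius, feature choices, and camera intrinsics are assumed.
import numpy as np

def cluster_points(points, radius=2.0):
    """Greedy range-based clustering of radar points (N, 3) in 3D.
    Stand-in for the paper's clustering step; returns index arrays."""
    remaining = list(range(len(points)))
    clusters = []
    while remaining:
        members = [remaining.pop(0)]
        changed = True
        while changed:  # grow cluster with any point near a member
            changed = False
            for i in remaining[:]:
                if np.min(np.linalg.norm(points[i] - points[members],
                                         axis=1)) < radius:
                    members.append(i)
                    remaining.remove(i)
                    changed = True
        clusters.append(np.array(members))
    return clusters

def handcrafted_features(cluster_pts):
    """Simple local spatial statistics per cluster (assumed feature set):
    centroid, axis-aligned extent, and point count."""
    centroid = cluster_pts.mean(axis=0)
    extent = cluster_pts.max(axis=0) - cluster_pts.min(axis=0)
    return np.concatenate([centroid, extent, [len(cluster_pts)]])

def project_to_image(centroid, K):
    """Pinhole projection of a 3D centroid (camera frame, z forward)
    to pixel coordinates."""
    uvw = K @ centroid
    return uvw[:2] / uvw[2]

# Toy example: two well-separated groups of radar returns (camera frame).
pts = np.array([[1.0, 0.5, 10.0], [1.2, 0.5, 10.3],    # cluster A
                [-3.0, 0.4, 30.0], [-3.1, 0.5, 30.2]])  # cluster B
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])  # assumed intrinsics, 1280x720 image

clusters = cluster_points(pts)
feats = [handcrafted_features(pts[c]) for c in clusters]
pixels = [project_to_image(f[:3], K) for f in feats]
```

Keeping the clusters in 3D until after feature extraction is the key point: the spatial extent and density statistics are computed before the depth dimension is flattened by the projection.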
Related papers
- GET-UP: GEomeTric-aware Depth Estimation with Radar Points UPsampling [7.90238039959534]
Existing algorithms process radar data by projecting 3D points onto the image plane for pixel-level feature extraction.
We propose GET-UP, leveraging attention-enhanced Graph Neural Networks (GNN) to exchange and aggregate both 2D and 3D information from radar data.
We benchmark our proposed GET-UP on the nuScenes dataset, achieving state-of-the-art performance with a 15.3% and 14.7% improvement in MAE and RMSE over the previously best-performing model.
arXiv Detail & Related papers (2024-09-02T14:15:09Z)
- RadarPillars: Efficient Object Detection from 4D Radar Point Clouds [42.9356088038035]
We present RadarPillars, a pillar-based object detection network.
By decomposing radial velocity data, RadarPillars significantly outperforms state-of-the-art detectors on the View-of-Delft dataset.
This comes at a significantly reduced parameter count, surpassing existing methods in terms of efficiency and enabling real-time performance on edge devices.
arXiv Detail & Related papers (2024-08-09T12:13:38Z)
- Echoes Beyond Points: Unleashing the Power of Raw Radar Data in Multi-modality Fusion [74.84019379368807]
We propose a novel method named EchoFusion to skip the existing radar signal processing pipeline.
Specifically, we first generate the Bird's Eye View (BEV) queries and then take corresponding spectrum features from radar to fuse with other sensors.
arXiv Detail & Related papers (2023-07-31T09:53:50Z)
- Multi-Task Cross-Modality Attention-Fusion for 2D Object Detection [6.388430091498446]
We propose two new radar preprocessing techniques to better align radar and camera data.
We also introduce a Multi-Task Cross-Modality Attention-Fusion Network (MCAF-Net) for object detection.
Our approach outperforms current state-of-the-art radar-camera fusion-based object detectors on the nuScenes dataset.
arXiv Detail & Related papers (2023-07-17T09:26:13Z)
- ROFusion: Efficient Object Detection using Hybrid Point-wise Radar-Optical Fusion [14.419658061805507]
We propose a hybrid point-wise Radar-Optical fusion approach for object detection in autonomous driving scenarios.
The framework benefits from dense contextual information from both the range-doppler spectrum and images which are integrated to learn a multi-modal feature representation.
arXiv Detail & Related papers (2023-07-17T04:25:46Z)
- Bi-LRFusion: Bi-Directional LiDAR-Radar Fusion for 3D Dynamic Object Detection [78.59426158981108]
We introduce a bi-directional LiDAR-Radar fusion framework, termed Bi-LRFusion, to tackle the challenges and improve 3D detection for dynamic objects.
We conduct extensive experiments on nuScenes and ORR datasets, and show that our Bi-LRFusion achieves state-of-the-art performance for detecting dynamic objects.
arXiv Detail & Related papers (2023-06-02T10:57:41Z)
- Semantic Segmentation of Radar Detections using Convolutions on Point Clouds [59.45414406974091]
We introduce a deep-learning based method to convolve radar detections into point clouds.
We adapt this algorithm to radar-specific properties through distance-dependent clustering and pre-processing of input point clouds.
Our network outperforms state-of-the-art approaches that are based on PointNet++ on the task of semantic segmentation of radar point clouds.
arXiv Detail & Related papers (2023-05-22T07:09:35Z)
- MVFusion: Multi-View 3D Object Detection with Semantic-aligned Radar and Camera Fusion [6.639648061168067]
Multi-view radar-camera fused 3D object detection provides a farther detection range and more helpful features for autonomous driving.
Current radar-camera fusion methods offer a variety of designs for fusing radar information with camera data.
We present MVFusion, a novel Multi-View radar-camera Fusion method to achieve semantic-aligned radar features.
arXiv Detail & Related papers (2023-02-21T08:25:50Z)
- Fully Convolutional One-Stage 3D Object Detection on LiDAR Range Images [96.66271207089096]
FCOS-LiDAR is a fully convolutional one-stage 3D object detector for LiDAR point clouds of autonomous driving scenes.
We show that an RV-based 3D detector with standard 2D convolutions alone can achieve comparable performance to state-of-the-art BEV-based detectors.
arXiv Detail & Related papers (2022-05-27T05:42:16Z)
- Depth Estimation from Monocular Images and Sparse Radar Data [93.70524512061318]
In this paper, we explore the possibility of achieving a more accurate depth estimation by fusing monocular images and Radar points using a deep neural network.
We find that noise in Radar measurements is one of the main reasons existing fusion methods cannot be applied directly.
The experiments are conducted on the nuScenes dataset, one of the first datasets featuring Camera, Radar, and LiDAR recordings in diverse scenes and weather conditions.
arXiv Detail & Related papers (2020-09-30T19:01:33Z)
- RadarNet: Exploiting Radar for Robust Perception of Dynamic Objects [73.80316195652493]
We tackle the problem of exploiting Radar for perception in the context of self-driving cars.
We propose a new solution that exploits both LiDAR and Radar sensors for perception.
Our approach, dubbed RadarNet, features a voxel-based early fusion and an attention-based late fusion.
arXiv Detail & Related papers (2020-07-28T17:15:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.