GET-UP: GEomeTric-aware Depth Estimation with Radar Points UPsampling
- URL: http://arxiv.org/abs/2409.02720v2
- Date: Sun, 8 Sep 2024 18:43:06 GMT
- Title: GET-UP: GEomeTric-aware Depth Estimation with Radar Points UPsampling
- Authors: Huawei Sun, Zixu Wang, Hao Feng, Julius Ott, Lorenzo Servadei, Robert Wille,
- Abstract summary: Existing algorithms process radar data by projecting 3D points onto the image plane for pixel-level feature extraction.
We propose GET-UP, leveraging attention-enhanced Graph Neural Networks (GNN) to exchange and aggregate both 2D and 3D information from radar data.
We benchmark our proposed GET-UP on the nuScenes dataset, achieving state-of-the-art performance with a 15.3% and 14.7% improvement in MAE and RMSE over the previously best-performing model.
- Score: 7.90238039959534
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Depth estimation plays a pivotal role in autonomous driving, facilitating a comprehensive understanding of the vehicle's 3D surroundings. Radar, with its robustness to adverse weather conditions and capability to measure distances, has drawn significant interest for radar-camera depth estimation. However, existing algorithms process the inherently noisy and sparse radar data by projecting 3D points onto the image plane for pixel-level feature extraction, overlooking the valuable geometric information contained within the radar point cloud. To address this gap, we propose GET-UP, leveraging attention-enhanced Graph Neural Networks (GNN) to exchange and aggregate both 2D and 3D information from radar data. This approach effectively enriches the feature representation by incorporating spatial relationships compared to traditional methods that rely only on 2D feature extraction. Furthermore, we incorporate a point cloud upsampling task to densify the radar point cloud, rectify point positions, and derive additional 3D features under the guidance of lidar data. Finally, we fuse radar and camera features during the decoding phase for depth estimation. We benchmark our proposed GET-UP on the nuScenes dataset, achieving state-of-the-art performance with a 15.3% and 14.7% improvement in MAE and RMSE over the previously best-performing model. Code: https://github.com/harborsarah/GET-UP
Related papers
- RadarPillars: Efficient Object Detection from 4D Radar Point Clouds [42.9356088038035]
We present RadarPillars, a pillar-based object detection network.
By decomposing radial velocity data, RadarPillars significantly outperform state-of-the-art detection results on the View-of-Delft dataset.
This comes at a significantly reduced parameter count, surpassing existing methods in terms of efficiency and enabling real-time performance on edge devices.
arXiv Detail & Related papers (2024-08-09T12:13:38Z) - CaFNet: A Confidence-Driven Framework for Radar Camera Depth Estimation [6.9404362058736995]
This paper introduces a two-stage, end-to-end trainable Confidence-aware Fusion Net (CaFNet) for dense depth estimation.
The first stage addresses radar-specific challenges, such as ambiguous elevation and noisy measurements.
For the final depth estimation, we innovate a confidence-aware gated fusion mechanism to integrate radar and image features effectively.
arXiv Detail & Related papers (2024-06-30T13:39:29Z) - RadarOcc: Robust 3D Occupancy Prediction with 4D Imaging Radar [15.776076554141687]
3D occupancy-based perception pipeline has significantly advanced autonomous driving.
Current methods rely on LiDAR or camera inputs for 3D occupancy prediction.
We introduce a novel approach that utilizes 4D imaging radar sensors for 3D occupancy prediction.
arXiv Detail & Related papers (2024-05-22T21:48:17Z) - Enhanced Radar Perception via Multi-Task Learning: Towards Refined Data for Sensor Fusion Applications [6.237187007098249]
This work introduces a learning-based approach to infer the height of radar points associated with 3D objects.
The average radar absolute height error decreases from 1.69 to 0.25 meters compared to the state-of-the-art height extension method.
arXiv Detail & Related papers (2024-04-09T09:42:18Z) - CenterRadarNet: Joint 3D Object Detection and Tracking Framework using
4D FMCW Radar [28.640714690346353]
CenterRadarNet is designed to facilitate high-resolution representation learning from 4D (Doppler-range-azimuth-ele) radar data.
As a single-stage 3D object detector, CenterRadarNet infers the BEV object distribution confidence maps, corresponding 3D bounding box attributes, and appearance embedding for each pixel.
In diverse driving scenarios, CenterRadarNet shows consistent, robust performance, emphasizing its wide applicability.
arXiv Detail & Related papers (2023-11-02T17:36:40Z) - Semantic Segmentation of Radar Detections using Convolutions on Point
Clouds [59.45414406974091]
We introduce a deep-learning based method to convolve radar detections into point clouds.
We adapt this algorithm to radar-specific properties through distance-dependent clustering and pre-processing of input point clouds.
Our network outperforms state-of-the-art approaches that are based on PointNet++ on the task of semantic segmentation of radar point clouds.
arXiv Detail & Related papers (2023-05-22T07:09:35Z) - VPFNet: Improving 3D Object Detection with Virtual Point based LiDAR and
Stereo Data Fusion [62.24001258298076]
VPFNet is a new architecture that cleverly aligns and aggregates the point cloud and image data at the virtual' points.
Our VPFNet achieves 83.21% moderate 3D AP and 91.86% moderate BEV AP on the KITTI test set, ranking the 1st since May 21th, 2021.
arXiv Detail & Related papers (2021-11-29T08:51:20Z) - Shape Prior Non-Uniform Sampling Guided Real-time Stereo 3D Object
Detection [59.765645791588454]
Recently introduced RTS3D builds an efficient 4D Feature-Consistency Embedding space for the intermediate representation of object without depth supervision.
We propose a shape prior non-uniform sampling strategy that performs dense sampling in outer region and sparse sampling in inner region.
Our proposed method has 2.57% improvement on AP3d almost without extra network parameters.
arXiv Detail & Related papers (2021-06-18T09:14:55Z) - PC-DAN: Point Cloud based Deep Affinity Network for 3D Multi-Object
Tracking (Accepted as an extended abstract in JRDB-ACT Workshop at CVPR21) [68.12101204123422]
A point cloud is a dense compilation of spatial data in 3D coordinates.
We propose a PointNet-based approach for 3D Multi-Object Tracking (MOT)
arXiv Detail & Related papers (2021-06-03T05:36:39Z) - Cross-Modality 3D Object Detection [63.29935886648709]
We present a novel two-stage multi-modal fusion network for 3D object detection.
The whole architecture facilitates two-stage fusion.
Our experiments on the KITTI dataset show that the proposed multi-stage fusion helps the network to learn better representations.
arXiv Detail & Related papers (2020-08-16T11:01:20Z) - RadarNet: Exploiting Radar for Robust Perception of Dynamic Objects [73.80316195652493]
We tackle the problem of exploiting Radar for perception in the context of self-driving cars.
We propose a new solution that exploits both LiDAR and Radar sensors for perception.
Our approach, dubbed RadarNet, features a voxel-based early fusion and an attention-based late fusion.
arXiv Detail & Related papers (2020-07-28T17:15:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.