RVMDE: Radar Validated Monocular Depth Estimation for Robotics
- URL: http://arxiv.org/abs/2109.05265v1
- Date: Sat, 11 Sep 2021 12:02:29 GMT
- Title: RVMDE: Radar Validated Monocular Depth Estimation for Robotics
- Authors: Muhammad Ishfaq Hussain, Muhammad Aasim Rafique and Moongu Jeon
- Abstract summary: An innate rigid calibration of binocular vision sensors is crucial for accurate depth estimation.
Alternatively, a monocular camera alleviates the limitation at the expense of accuracy in estimating depth, and the challenge is exacerbated in harsh environmental conditions.
This work explores the utility of coarse signals from radar when fused with fine-grained data from a monocular camera for depth estimation in harsh environmental conditions.
- Score: 5.360594929347198
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stereoscopy provides a natural perception of distance in a scene, and its manifestation in 3D world understanding is an intuitive phenomenon. However, an innately rigid calibration of the binocular vision sensors is crucial for accurate depth estimation. Alternatively, a monocular camera alleviates this limitation at the expense of depth accuracy, and the challenge is exacerbated in harsh environmental conditions. Moreover, an optical sensor often fails to acquire vital signals in harsh environments, and radar is used instead, which gives coarse but more accurate signals. This work explores the utility of coarse radar signals fused with fine-grained monocular camera data for depth estimation in harsh environmental conditions. A variant of the feature pyramid network (FPN) operates on fine-grained image features at multiple scales with fewer parameters. The FPN feature maps are fused with sparse radar features extracted by a convolutional neural network, and the concatenated hierarchical features are used to predict depth with ordinal regression. In experiments on the nuScenes dataset, the proposed architecture leads the quantitative evaluations while using fewer parameters and running faster at inference. The results suggest that the proposed technique can serve as an alternative to stereo depth estimation in critical applications in robotics and self-driving cars. The source code will be available at: \url{https://github.com/MI-Hussain/RVMDE}.
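
To make the pipeline concrete, below is a minimal PyTorch sketch of the described fusion-and-ordinal-regression idea: multi-scale image features (a stand-in for the FPN variant) are concatenated with CNN-extracted sparse radar features, and an ordinal head classifies each pixel into discretized depth bins. All module names, channel widths, and the bin count are illustrative assumptions; the authors' actual implementation lives in the linked repository.

```python
# Hypothetical sketch of camera-radar fusion with ordinal depth regression.
# Names, channel sizes, and the number of depth bins are illustrative only;
# the authors' actual model is at https://github.com/MI-Hussain/RVMDE.
import torch
import torch.nn as nn

class RadarCameraOrdinalDepth(nn.Module):
    def __init__(self, img_ch=64, radar_ch=16, num_bins=80):
        super().__init__()
        # Stand-in for one FPN level operating on fine-grained image features.
        self.img_branch = nn.Sequential(
            nn.Conv2d(3, img_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(img_ch, img_ch, 3, padding=1), nn.ReLU(),
        )
        # Small CNN extracting features from the sparse radar depth channel.
        self.radar_branch = nn.Sequential(
            nn.Conv2d(1, radar_ch, 3, padding=1), nn.ReLU(),
        )
        # Ordinal head: 2 logits per depth bin, i.e. P(depth > bin_k) per pixel.
        self.ordinal_head = nn.Conv2d(img_ch + radar_ch, 2 * num_bins, 1)
        self.num_bins = num_bins

    def forward(self, image, radar_depth):
        fused = torch.cat([self.img_branch(image),
                           self.radar_branch(radar_depth)], dim=1)
        logits = self.ordinal_head(fused)
        b, _, h, w = logits.shape
        # DORN-style ordinal regression: each bin is a binary
        # "is the depth beyond bin k?" classifier.
        probs = torch.softmax(logits.view(b, 2, self.num_bins, h, w), dim=1)[:, 1]
        depth_bin = (probs > 0.5).sum(dim=1)   # per-pixel ordinal depth label
        return probs, depth_bin

model = RadarCameraOrdinalDepth()
probs, bins = model(torch.rand(1, 3, 128, 256), torch.rand(1, 1, 128, 256))
```

Framing depth as an ordered series of binary decisions is typically more robust to the long-tailed depth distribution than direct metric regression, which is presumably why the paper pairs fusion with ordinal regression.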
Related papers
- GET-UP: GEomeTric-aware Depth Estimation with Radar Points UPsampling [7.90238039959534]
Existing algorithms process radar data by projecting 3D points onto the image plane for pixel-level feature extraction.
We propose GET-UP, leveraging attention-enhanced Graph Neural Networks (GNN) to exchange and aggregate both 2D and 3D information from radar data.
We benchmark our proposed GET-UP on the nuScenes dataset, achieving state-of-the-art performance with a 15.3% and 14.7% improvement in MAE and RMSE over the previously best-performing model.
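
The pixel-level projection baseline that GET-UP contrasts itself with is easy to sketch: 3D radar points are pushed through the camera intrinsics into a sparse image-plane depth channel. The intrinsics and point cloud below are placeholder values.

```python
# Minimal sketch: project 3D radar points (camera frame) onto the image
# plane to build a sparse depth channel; intrinsics are placeholder values.
import numpy as np

def project_radar_to_image(points_cam, K, h, w):
    """points_cam: (N, 3) radar points in camera coordinates (x, y, z)."""
    depth_map = np.zeros((h, w), dtype=np.float32)
    z = points_cam[:, 2]
    valid = z > 0                      # keep points in front of the camera
    uv = (K @ points_cam[valid].T).T   # pinhole projection
    u = (uv[:, 0] / uv[:, 2]).astype(int)
    v = (uv[:, 1] / uv[:, 2]).astype(int)
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    depth_map[v[inside], u[inside]] = z[valid][inside]
    return depth_map

K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])  # assumed intrinsics
sparse = project_radar_to_image(np.random.rand(100, 3) * [10, 2, 50], K, 480, 640)
```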
arXiv Detail & Related papers (2024-09-02T14:15:09Z) - NeRF-Det++: Incorporating Semantic Cues and Perspective-aware Depth
Supervision for Indoor Multi-View 3D Detection [72.0098999512727]
NeRF-Det has achieved impressive performance in indoor multi-view 3D detection by utilizing NeRF to enhance representation learning.
We present three corresponding solutions, including semantic enhancement, perspective-aware sampling, and ordinal depth supervision.
The resulting algorithm, NeRF-Det++, exhibits appealing performance on the ScanNetV2 and ARKitScenes datasets.
arXiv Detail & Related papers (2024-02-22T11:48:06Z) - RIDERS: Radar-Infrared Depth Estimation for Robust Sensing [22.10378524682712]
Adverse weather conditions pose significant challenges to accurate dense depth estimation.
We present a novel approach for robust metric depth estimation by fusing a millimeter-wave Radar and a monocular infrared thermal camera.
Our method achieves exceptional visual quality and accurate metric estimation by addressing the challenges of ambiguity and misalignment.
arXiv Detail & Related papers (2024-02-03T07:14:43Z) - Echoes Beyond Points: Unleashing the Power of Raw Radar Data in
Multi-modality Fusion [74.84019379368807]
We propose a novel method named EchoFusion to skip the existing radar signal processing pipeline.
Specifically, we first generate the Bird's Eye View (BEV) queries and then take corresponding spectrum features from radar to fuse with other sensors.
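
A rough sketch of the query-based fusion pattern the summary describes, with a generic multi-head attention layer standing in for EchoFusion's actual mechanism; the BEV grid size and feature dimensions are assumptions.

```python
# Rough sketch of BEV queries attending to radar spectrum features.
# The attention layer and all dimensions are generic stand-ins, not
# EchoFusion's actual architecture.
import torch
import torch.nn as nn

d_model, n_queries = 128, 50 * 50        # assumed BEV grid of 50x50 cells
bev_queries = nn.Parameter(torch.randn(1, n_queries, d_model))
attn = nn.MultiheadAttention(d_model, num_heads=8, batch_first=True)

# Radar spectrum features flattened to a sequence (range x Doppler bins).
spectrum = torch.randn(1, 256 * 64, d_model)
bev_features, _ = attn(bev_queries, spectrum, spectrum)  # (1, 2500, 128)
```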
arXiv Detail & Related papers (2023-07-31T09:53:50Z) - iSDF: Real-Time Neural Signed Distance Fields for Robot Perception [64.80458128766254]
iSDF is a continual learning system for real-time signed distance field reconstruction.
It produces more accurate reconstructions and better approximations of collision costs and gradients.
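
How a signed distance field yields collision costs and gradients can be illustrated with a generic coordinate MLP (a stand-in, not iSDF's network): query a point for its distance, then differentiate a margin penalty back to the point.

```python
# Generic coordinate-MLP SDF sketch (not iSDF's actual network): query a
# point for its signed distance and obtain a collision gradient via autograd.
import torch
import torch.nn as nn

sdf = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                    nn.Linear(64, 64), nn.ReLU(),
                    nn.Linear(64, 1))

p = torch.tensor([[0.5, 0.2, 1.0]], requires_grad=True)
dist = sdf(p)                              # signed distance at the query point
collision_cost = torch.relu(0.3 - dist)    # penalty inside a 0.3 m safety margin
collision_cost.backward()                  # p.grad: direction that reduces cost
```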
arXiv Detail & Related papers (2022-04-05T15:48:39Z) - Progressive Coordinate Transforms for Monocular 3D Object Detection [52.00071336733109]
We propose a novel and lightweight approach, dubbed Progressive Coordinate Transforms (PCT), to facilitate learning coordinate representations.
arXiv Detail & Related papers (2021-08-12T15:22:33Z) - Probabilistic and Geometric Depth: Detecting Objects in Perspective [78.00922683083776]
3D object detection is an important capability needed in various practical applications such as driver assistance systems.
Monocular 3D detection, as an economical solution compared to conventional settings relying on binocular vision or LiDAR, has drawn increasing attention recently but still yields unsatisfactory results.
This paper first presents a systematic study on this problem and observes that the current monocular 3D detection problem can be simplified as an instance depth estimation problem.
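
One reason monocular detection reduces largely to instance depth estimation is plain pinhole geometry: given an object's physical height and its projected pixel height, depth follows directly. A toy numeric example with assumed values:

```python
# Toy pinhole-geometry depth estimate from an object's known height.
# All numbers are assumed for illustration.
focal_px = 1266.0        # focal length in pixels (nuScenes-like camera)
real_height_m = 1.5      # typical car height
pixel_height = 47.0      # height of the detected 2D box in pixels

depth_m = focal_px * real_height_m / pixel_height
print(f"estimated depth: {depth_m:.1f} m")   # ~40.4 m
```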
arXiv Detail & Related papers (2021-07-29T16:30:33Z) - VR3Dense: Voxel Representation Learning for 3D Object Detection and
Monocular Dense Depth Reconstruction [0.951828574518325]
We introduce a method for jointly training 3D object detection and monocular dense depth reconstruction neural networks.
During inference it takes a LiDAR point cloud and a single RGB image as inputs and produces object pose predictions as well as a densely reconstructed depth map.
While our object detection is trained in a supervised manner, the depth prediction network is trained with both self-supervised and supervised loss functions.
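
The mixed supervision mentioned above can be sketched as a weighted sum of a supervised term on pixels where sparse ground truth exists and a self-supervised prior elsewhere; the weights and the choice of a smoothness prior (rather than, say, a photometric loss) are assumptions, not VR3Dense's exact losses.

```python
# Sketch of mixing supervised and self-supervised depth losses.
# Weights and the smoothness prior are illustrative assumptions.
import torch

def depth_loss(pred, gt_sparse, w_sup=1.0, w_smooth=0.1):
    mask = gt_sparse > 0                      # pixels with ground truth
    sup = torch.abs(pred - gt_sparse)[mask].mean()
    # Self-supervised prior: penalize abrupt depth changes.
    smooth = (torch.abs(pred[:, :, :, 1:] - pred[:, :, :, :-1]).mean() +
              torch.abs(pred[:, :, 1:, :] - pred[:, :, :-1, :]).mean())
    return w_sup * sup + w_smooth * smooth

loss = depth_loss(torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64))
```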
arXiv Detail & Related papers (2021-04-13T04:25:54Z) - Multi-Modal Depth Estimation Using Convolutional Neural Networks [0.8701566919381223]
This paper addresses the problem of dense depth prediction from sparse distance-sensor data and a single camera image under challenging weather conditions.
It explores the significance of different sensor modalities, such as camera, radar, and LiDAR, for estimating depth by applying deep learning approaches.
arXiv Detail & Related papers (2020-12-17T15:31:49Z) - All-Weather Object Recognition Using Radar and Infrared Sensing [1.7513645771137178]
This thesis explores new sensing developments based on long-wave polarised infrared (IR) imagery and imaging radar to recognise objects.
First, we developed a methodology based on Stokes parameters using polarised infrared data to recognise vehicles using deep neural networks.
Second, we explored the potential of using only the power spectrum captured by low-THz radar sensors to perform object recognition in a controlled scenario.
Last, we created a new large-scale dataset in the "wild" with many different weather scenarios showing radar robustness to detect vehicles in adverse weather.
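
The Stokes-parameter representation used in the first contribution is standard polarimetry: from intensity images captured through four polariser orientations, the linear Stokes components and the degree of linear polarisation follow directly. A minimal sketch with placeholder inputs:

```python
# Standard linear Stokes parameters from four polarizer-angle images
# (0, 45, 90, 135 degrees); inputs here are random placeholders.
import numpy as np

i0, i45, i90, i135 = (np.random.rand(240, 320) for _ in range(4))

s0 = i0 + i90                      # total intensity
s1 = i0 - i90                      # horizontal vs vertical polarization
s2 = i45 - i135                    # diagonal polarization
dolp = np.sqrt(s1**2 + s2**2) / np.clip(s0, 1e-6, None)  # degree of linear pol.
aolp = 0.5 * np.arctan2(s2, s1)    # angle of linear polarization
```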
arXiv Detail & Related papers (2020-10-30T14:16:39Z) - Learning Camera Miscalibration Detection [83.38916296044394]
This paper focuses on a data-driven approach to learn the detection of miscalibration in vision sensors, specifically RGB cameras.
Our contributions include a proposed miscalibration metric for RGB cameras and a novel semi-synthetic dataset generation pipeline based on this metric.
By training a deep convolutional neural network, we demonstrate the effectiveness of our pipeline to identify whether a recalibration of the camera's intrinsic parameters is required or not.
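
One plausible reading of such a miscalibration metric (our assumption, not necessarily the paper's exact definition) is the mean pixel displacement that drifted intrinsics induce when reprojecting scene points:

```python
# Assumed illustration of a miscalibration metric: mean pixel displacement
# between projections under nominal and perturbed intrinsics.
import numpy as np

def project(points, K):
    uv = (K @ points.T).T
    return uv[:, :2] / uv[:, 2:3]

K_nominal = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
K_drifted = K_nominal.copy()
K_drifted[0, 0] *= 1.02            # 2% focal-length drift

pts = np.random.rand(500, 3) * [4, 3, 10] + [0, 0, 1]  # points in front of camera
error_px = np.linalg.norm(project(pts, K_nominal) - project(pts, K_drifted), axis=1).mean()
print(f"mean reprojection displacement: {error_px:.2f} px")
```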
arXiv Detail & Related papers (2020-05-24T10:32:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.