Related papers: DenseLiDAR: A Real-Time Pseudo Dense Depth Guided Depth Completion Network

DenseLiDAR: A Real-Time Pseudo Dense Depth Guided Depth Completion Network

URL: http://arxiv.org/abs/2108.12655v1
Date: Sat, 28 Aug 2021 14:18:29 GMT
Title: DenseLiDAR: A Real-Time Pseudo Dense Depth Guided Depth Completion Network
Authors: Jiaqi Gu, Zhiyu Xiang, Yuwen Ye, Lingxuan Wang
Abstract summary: We propose DenseLiDAR, a novel real-time pseudo-depth guided depth completion neural network. We exploit dense pseudo-depth map obtained from simple morphological operations to guide the network. Our model is able to achieve the state-of-the-art performance at the highest frame rate of 50Hz.
Score: 3.1447111126464997
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Depth Completion can produce a dense depth map from a sparse input and provide a more complete 3D description of the environment. Despite great progress made in depth completion, the sparsity of the input and low density of the ground truth still make this problem challenging. In this work, we propose DenseLiDAR, a novel real-time pseudo-depth guided depth completion neural network. We exploit dense pseudo-depth map obtained from simple morphological operations to guide the network in three aspects: (1) Constructing a residual structure for the output; (2) Rectifying the sparse input data; (3) Providing dense structural loss for training the network. Thanks to these novel designs, higher performance of the output could be achieved. In addition, two new metrics for better evaluating the quality of the predicted depth map are also presented. Extensive experiments on KITTI depth completion benchmark suggest that our model is able to achieve the state-of-the-art performance at the highest frame rate of 50Hz. The predicted dense depth is further evaluated by several downstream robotic perception or positioning tasks. For the task of 3D object detection, 3~5 percent performance gains on small objects categories are achieved on KITTI 3D object detection dataset. For RGB-D SLAM, higher accuracy on vehicle's trajectory is also obtained in KITTI Odometry dataset. These promising results not only verify the high quality of our depth prediction, but also demonstrate the potential of improving the related downstream tasks by using depth completion results.

Related papers

GAC-Net_Geometric and attention-based Network for Depth Completion [10.64600095082433]
This paper proposes a depth completion network combining channel attention mechanism and 3D global feature perception (CGA-Net) Experiments on the KITTI depth completion dataset show that CGA-Net can significantly improve the prediction accuracy of dense depth maps.
arXiv Detail & Related papers (2025-01-14T10:24:20Z)
DepthLab: From Partial to Complete [80.58276388743306]
Missing values remain a common challenge for depth data across its wide range of applications. This work bridges this gap with DepthLab, a foundation depth inpainting model powered by image diffusion priors. Our approach proves its worth in various downstream tasks, including 3D scene inpainting, text-to-3D scene generation, sparse-view reconstruction with DUST3R, and LiDAR depth completion.
arXiv Detail & Related papers (2024-12-24T04:16:38Z)
Prompting Depth Anything for 4K Resolution Accurate Metric Depth Estimation [108.04354143020886]
We introduce prompting into depth foundation models, creating a new paradigm for metric depth estimation termed Prompt Depth Anything. We use a low-cost LiDAR as the prompt to guide the Depth Anything model for accurate metric depth output, achieving up to 4K resolution.
arXiv Detail & Related papers (2024-12-18T16:32:12Z)
DepthSplat: Connecting Gaussian Splatting and Depth [90.06180236292866]
We present DepthSplat to connect Gaussian splatting and depth estimation. We first contribute a robust multi-view depth model by leveraging pre-trained monocular depth features. We also show that Gaussian splatting can serve as an unsupervised pre-training objective.
arXiv Detail & Related papers (2024-10-17T17:59:58Z)
Self-Supervised Depth Completion Guided by 3D Perception and Geometry Consistency [17.68427514090938]
This paper explores the utilization of 3D perceptual features and multi-view geometry consistency to devise a high-precision self-supervised depth completion method. Experiments on benchmark datasets of NYU-Depthv2 and VOID demonstrate that the proposed model achieves the state-of-the-art depth completion performance.
arXiv Detail & Related papers (2023-12-23T14:19:56Z)
Boosting Monocular 3D Object Detection with Object-Centric Auxiliary Depth Supervision [13.593246617391266]
We propose a method to boost the RGB image-based 3D detector by jointly training the detection network with a depth prediction loss analogous to the depth estimation task. Our novel object-centric depth prediction loss focuses on depth around foreground objects, which is important for 3D object detection. Our depth regression model is further trained to predict the uncertainty of depth to represent the 3D confidence of objects.
arXiv Detail & Related papers (2022-10-29T11:32:28Z)
Joint Learning of Salient Object Detection, Depth Estimation and Contour Extraction [91.43066633305662]
We propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD) Specifically, we unify three complementary tasks: depth estimation, salient object detection and contour estimation. The multi-task mechanism promotes the model to learn the task-aware features from the auxiliary tasks. Experiments show that it not only significantly surpasses the depth-based RGB-D SOD methods on multiple datasets, but also precisely predicts a high-quality depth map and salient contour at the same time.
arXiv Detail & Related papers (2022-03-09T17:20:18Z)
Sparse Depth Completion with Semantic Mesh Deformation Optimization [4.03103540543081]
We propose a neural network with post-optimization, which takes an RGB image and sparse depth samples as input and predicts the complete depth map. Our evaluation results outperform the existing work consistently on both indoor and outdoor datasets.
arXiv Detail & Related papers (2021-12-10T13:01:06Z)
3DVNet: Multi-View Depth Prediction and Volumetric Refinement [68.68537312256144]
3DVNet is a novel multi-view stereo (MVS) depth-prediction method. Our key idea is the use of a 3D scene-modeling network that iteratively updates a set of coarse depth predictions. We show that our method exceeds state-of-the-art accuracy in both depth prediction and 3D reconstruction metrics.
arXiv Detail & Related papers (2021-12-01T00:52:42Z)
VR3Dense: Voxel Representation Learning for 3D Object Detection and Monocular Dense Depth Reconstruction [0.951828574518325]
We introduce a method for jointly training 3D object detection and monocular dense depth reconstruction neural networks. It takes as inputs, a LiDAR point-cloud, and a single RGB image during inference and produces object pose predictions as well as a densely reconstructed depth map. While our object detection is trained in a supervised manner, the depth prediction network is trained with both self-supervised and supervised loss functions.
arXiv Detail & Related papers (2021-04-13T04:25:54Z)
Sparse Auxiliary Networks for Unified Monocular Depth Prediction and Completion [56.85837052421469]
Estimating scene geometry from data obtained with cost-effective sensors is key for robots and self-driving cars. In this paper, we study the problem of predicting dense depth from a single RGB image with optional sparse measurements from low-cost active depth sensors. We introduce Sparse Networks (SANs), a new module enabling monodepth networks to perform both the tasks of depth prediction and completion.
arXiv Detail & Related papers (2021-03-30T21:22:26Z)
PLADE-Net: Towards Pixel-Level Accuracy for Self-Supervised Single-View Depth Estimation with Neural Positional Encoding and Distilled Matting Loss [49.66736599668501]
We propose a self-supervised single-view pixel-level accurate depth estimation network, called PLADE-Net. Our method shows unprecedented accuracy levels, exceeding 95% in terms of the $delta1$ metric on the KITTI dataset.
arXiv Detail & Related papers (2021-03-12T15:54:46Z)
Virtual Normal: Enforcing Geometric Constraints for Accurate and Robust Depth Prediction [87.08227378010874]
We show the importance of the high-order 3D geometric constraints for depth prediction. By designing a loss term that enforces a simple geometric constraint, we significantly improve the accuracy and robustness of monocular depth estimation. We show state-of-the-art results of learning metric depth on NYU Depth-V2 and KITTI.
arXiv Detail & Related papers (2021-03-07T00:08:21Z)
DELTAS: Depth Estimation by Learning Triangulation And densification of Sparse points [14.254472131009653]
Multi-view stereo (MVS) is the golden mean between the accuracy of active depth sensing and the practicality of monocular depth estimation. Cost volume based approaches employing 3D convolutional neural networks (CNNs) have considerably improved the accuracy of MVS systems. We propose an efficient depth estimation approach by first (a) detecting and evaluating descriptors for interest points, then (b) learning to match and triangulate a small set of interest points, and finally (c) densifying this sparse set of 3D points using CNNs.
arXiv Detail & Related papers (2020-03-19T17:56:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.