Related papers: A Concise but High-performing Network for Image Guided Depth Completion in Autonomous Driving

URL: http://arxiv.org/abs/2401.15902v2
Date: Mon, 22 Apr 2024 05:10:57 GMT
Title: A Concise but High-performing Network for Image Guided Depth Completion in Autonomous Driving
Authors: Moyun Liu, Bing Chen, Youping Chen, Jingming Xie, Lei Yao, Yang Zhang, Joey Tianyi Zhou
Abstract summary: We propose a concise but effective network, named CENet, to achieve high-performance depth completion. Unlike other commonly used complicated guidance modules, our approach is intuitive and low-cost. Based on the simple structure of a dual encoder and a single decoder, our CENet achieves a superior balance between accuracy and efficiency.
Score: 31.916031298901988
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Depth completion is a crucial task in autonomous driving, aiming to convert a sparse depth map into a dense depth prediction. Because of its potentially rich semantic information, an RGB image is commonly fused with the depth input to improve the completion result. Image-guided depth completion involves three key challenges: 1) how to effectively fuse the two modalities; 2) how to better recover depth information; and 3) how to achieve real-time prediction for practical autonomous driving. To solve these problems, we propose a concise but effective network, named CENet, to achieve high-performance depth completion with a simple and elegant structure. First, we use a fast guidance module to fuse the two sensor features, utilizing abundant auxiliary features extracted from the color space. Unlike other commonly used complicated guidance modules, our approach is intuitive and low-cost. In addition, we identify and analyze the optimization inconsistency problem for observed and unobserved positions, and propose a decoupled depth prediction head to alleviate it. The proposed decoupled head can better output the depth of valid and invalid positions with very little extra inference time. Based on the simple structure of a dual encoder and a single decoder, our CENet achieves a superior balance between accuracy and efficiency. On the KITTI depth completion benchmark, our CENet attains competitive performance and inference speed compared with state-of-the-art methods. To validate the generalization of our method, we also evaluate on the indoor NYUv2 dataset, where our CENet still achieves impressive results. The code of this work will be available at https://github.com/lmomoy/CHNet.
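
The abstract distinguishes observed (valid) from unobserved (invalid) positions but does not say how the decoupled head combines its outputs. The following is a minimal sketch of one plausible reading, assuming a simple mask-based selection between two per-pixel predictions. It is not the authors' implementation; the function names and the convention that 0.0 marks a missing measurement are assumptions made for illustration.

```python
from typing import List

DepthMap = List[List[float]]


def valid_mask(sparse: DepthMap) -> List[List[bool]]:
    """Mark observed pixels. Assumption: 0.0 encodes 'no measurement'."""
    return [[d > 0.0 for d in row] for row in sparse]


def merge_decoupled(
    sparse: DepthMap,
    pred_observed: DepthMap,
    pred_unobserved: DepthMap,
) -> DepthMap:
    """Hypothetical merge step: take the 'observed' prediction where the
    sparse input has a measurement, and the 'unobserved' prediction elsewhere."""
    mask = valid_mask(sparse)
    return [
        [p_obs if m else p_unobs for m, p_obs, p_unobs in zip(m_row, o_row, u_row)]
        for m_row, o_row, u_row in zip(mask, pred_observed, pred_unobserved)
    ]


# Toy 2x2 example: only the top-right pixel has a sparse measurement.
sparse = [[0.0, 5.0], [0.0, 0.0]]
pred_observed = [[1.0, 5.1], [1.0, 1.0]]
pred_unobserved = [[7.0, 8.0], [9.0, 10.0]]
dense = merge_decoupled(sparse, pred_observed, pred_unobserved)
```

In this toy case the merged map keeps the observed-branch value only at the single measured pixel and falls back to the unobserved-branch values everywhere else.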

Related papers

Depth Estimation From Monocular Images With Enhanced Encoder-Decoder Architecture [0.0]
This paper introduces a novel deep learning-based approach using an encoder-decoder architecture. The Inception-ResNet-v2 model is utilized as the encoder. Experimental results on the NYU Depth V2 dataset show that our model achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-10-15T13:46:19Z)
Gated Cross-Attention Network for Depth Completion [11.154059839763738]
Fusion of color and depth features is the current critical challenge in the field of depth estimation. We design an efficient Gated Cross-Attention Network that propagates confidence via a gating mechanism. We use the Ray Tune mechanism with the AsyncHyperBandScheduler scheduler and the HyperOptSearch algorithm to automatically search for the optimal number of module iterations.
arXiv Detail & Related papers (2023-09-28T09:54:10Z)
Deep Neighbor Layer Aggregation for Lightweight Self-Supervised Monocular Depth Estimation [1.6775954077761863]
We present a fully convolutional depth estimation network using contextual feature fusion. Compared to UNet++ and HRNet, we use high-resolution and low-resolution features to preserve information on small targets and fast-moving objects. Our method reduces the parameters without sacrificing accuracy.
arXiv Detail & Related papers (2023-09-17T13:40:15Z)
VA-DepthNet: A Variational Approach to Single Image Depth Prediction [163.14849753700682]
VA-DepthNet is a simple, effective, and accurate deep neural network approach for the single-image depth prediction problem. The paper demonstrates the usefulness of the proposed approach via extensive evaluation and ablation analysis over several benchmark datasets.
arXiv Detail & Related papers (2023-02-13T17:55:58Z)
BridgeNet: A Joint Learning Network of Depth Map Super-Resolution and Monocular Depth Estimation [60.34562823470874]
We propose a joint learning network of depth map super-resolution (DSR) and monocular depth estimation (MDE) without introducing additional supervision labels. The first of its two bridges is the high-frequency attention bridge (HABdg), designed for the feature encoding process, which learns the high-frequency information of the MDE task to guide the DSR task. The other is the content guidance bridge (CGBdg), designed for the depth map reconstruction process, which provides the content guidance learned from the DSR task to the MDE task.
arXiv Detail & Related papers (2021-07-27T01:28:23Z)
Sparse Auxiliary Networks for Unified Monocular Depth Prediction and Completion [56.85837052421469]
Estimating scene geometry from data obtained with cost-effective sensors is key for robots and self-driving cars. In this paper, we study the problem of predicting dense depth from a single RGB image with optional sparse measurements from low-cost active depth sensors. We introduce Sparse Auxiliary Networks (SANs), a new module enabling monodepth networks to perform both the tasks of depth prediction and completion.
arXiv Detail & Related papers (2021-03-30T21:22:26Z)
PLADE-Net: Towards Pixel-Level Accuracy for Self-Supervised Single-View Depth Estimation with Neural Positional Encoding and Distilled Matting Loss [49.66736599668501]
We propose a self-supervised single-view pixel-level accurate depth estimation network, called PLADE-Net. Our method shows unprecedented accuracy levels, exceeding 95% in terms of the $\delta1$ metric on the KITTI dataset.
arXiv Detail & Related papers (2021-03-12T15:54:46Z)
HR-Depth: High Resolution Self-Supervised Monocular Depth Estimation [14.81943833870932]
We present an improved DepthNet, HR-Depth, with two effective strategies. Using ResNet-18 as the encoder, HR-Depth surpasses all previous state-of-the-art (SoTA) methods with the least parameters at both high and low resolution.
arXiv Detail & Related papers (2020-12-14T09:15:15Z)
Accurate RGB-D Salient Object Detection via Collaborative Learning [101.82654054191443]
RGB-D saliency detection shows impressive ability on some challenging scenarios. We propose a novel collaborative learning framework where edge, depth and saliency are leveraged in a more efficient way.
arXiv Detail & Related papers (2020-07-23T04:33:36Z)
A Single Stream Network for Robust and Real-time RGB-D Salient Object Detection [89.88222217065858]
We design a single stream network to use the depth map to guide early fusion and middle fusion between RGB and depth. This model is 55.5% lighter than the current lightest model and runs at a real-time speed of 32 FPS when processing a $384 \times 384$ image.
arXiv Detail & Related papers (2020-07-14T04:40:14Z)

This list is automatically generated from the titles and abstracts of the papers on this site.
