Gated Cross-Attention Network for Depth Completion
- URL: http://arxiv.org/abs/2309.16301v2
- Date: Sun, 21 Jan 2024 16:27:06 GMT
- Title: Gated Cross-Attention Network for Depth Completion
- Authors: Xiaogang Jia, Songlei Jian, Yusong Tan, Yonggang Che, Wei Chen and
Zhengfa Liang
- Abstract summary: Fusion of color and depth features is the current critical challenge in the field of depth estimation.
We design an efficient Gated Cross-Attention Network that propagates confidence via a gating mechanism.
We use the Ray Tune framework with the AsyncHyperBandScheduler and the HyperOptSearch algorithm to automatically search for the optimal number of module iterations.
- Score: 11.154059839763738
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Depth completion is a popular research direction in the field of depth
estimation. The fusion of color and depth features is the current critical
challenge in this task, mainly due to the asymmetry between the rich scene
details in color images and the sparse pixels in depth maps. To tackle this
issue, we design an efficient Gated Cross-Attention Network that propagates
confidence via a gating mechanism, simultaneously extracting and refining key
information in both color and depth branches to achieve local spatial feature
fusion. Additionally, we employ an attention network based on the Transformer
in low-dimensional space to effectively fuse global features and increase the
network's receptive field. With a simple yet efficient gating mechanism, our
proposed method achieves fast and accurate depth completion without the need
for additional branches or post-processing steps. At the same time, we use the
Ray Tune framework with the AsyncHyperBandScheduler and the HyperOptSearch
algorithm to automatically search for the optimal number of module iterations,
which also allows us to achieve performance comparable to state-of-the-art
methods. We conduct experiments on both indoor and outdoor
scene datasets. Our fast network achieves Pareto-optimal solutions in terms of
time and accuracy, and at the time of submission, our accurate network ranks
first among all published papers on the KITTI official website in terms of
accuracy.
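A minimal NumPy sketch of the confidence-gated fusion idea described in the abstract: a sigmoid gate, predicted from both branches, weights the depth features by their estimated confidence and backfills with color features where depth is sparse. All names and the single-pixel linear gate are hypothetical simplifications; the paper's actual module operates on convolutional feature maps inside the network.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(color_feat, depth_feat, w_gate, b_gate):
    """Fuse color and depth features with a confidence gate.

    The gate g in (0, 1) is predicted from the concatenated features;
    it weights the depth branch by its estimated confidence and lets
    the color branch fill in where depth is sparse or unreliable.
    (Hypothetical simplification of the paper's gating module.)
    """
    x = np.concatenate([color_feat, depth_feat], axis=-1)  # (..., 2C)
    g = sigmoid(x @ w_gate + b_gate)                       # (..., C)
    # per-channel convex combination of the two branches
    return g * depth_feat + (1.0 - g) * color_feat

# toy example: C = 4 feature channels at a single pixel
rng = np.random.default_rng(0)
C = 4
color = rng.normal(size=(C,))
depth = rng.normal(size=(C,))
w = rng.normal(size=(2 * C, C)) * 0.1
b = np.zeros(C)
fused = gated_fusion(color, depth, w, b)
```

Because the gate lies in (0, 1), each fused channel is a convex combination of the corresponding color and depth values, so the fusion can never produce a response outside the range spanned by the two branches.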
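The abstract's automated search for the number of module iterations uses Ray Tune's AsyncHyperBandScheduler. The toy, pure-Python sketch below illustrates the synchronous successive-halving principle behind that scheduler: every candidate gets a small budget, only the best fraction survives each round, and survivors receive a larger budget. The objective, candidate values, and budgets are all hypothetical; Ray Tune additionally promotes trials asynchronously and pairs the scheduler with HyperOptSearch for sampling new configurations.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=2, rounds=3):
    """Successive halving: score every candidate at a small budget, keep
    the best 1/eta fraction, and give survivors eta times more budget
    each round. AsyncHyperBandScheduler applies the same idea
    asynchronously, promoting trials without waiting for stragglers."""
    survivors = list(configs)
    budget = min_budget
    for _ in range(rounds):
        scored = sorted(survivors, key=lambda c: evaluate(c, budget))
        keep = max(1, len(scored) // eta)   # keep the best 1/eta fraction
        survivors = scored[:keep]
        budget *= eta                       # survivors get more budget
    return survivors[0]

# hypothetical objective: validation error as a function of the number
# of module iterations, shrinking as the training budget grows
def evaluate(n_iters, budget):
    return (n_iters - 3) ** 2 + 1.0 / budget

best = successive_halving(configs=[1, 2, 3, 4, 5, 6], evaluate=evaluate)
# here the search settles on n_iters = 3, the minimizer of the toy objective
```

The appeal for this paper's setting is that most candidate iteration counts are discarded after only a cheap partial training run, so the search cost stays far below training every candidate to convergence.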
Related papers
- A Concise but High-performing Network for Image Guided Depth Completion in Autonomous Driving [31.916031298901988]
We propose a concise but effective network, named CENet, to achieve high-performance depth completion.
Unlike other commonly used complicated guidance modules, our approach is intuitive and low-cost.
Based on the simple structure of a dual encoder and a single decoder, our CENet achieves a superior balance between accuracy and efficiency.
arXiv Detail & Related papers (2024-01-29T06:06:45Z)
- RGB-D based Stair Detection using Deep Learning for Autonomous Stair Climbing [6.362951673024623]
We propose a neural network architecture with inputs of both RGB map and depth map.
Specifically, we design the selective module which can make the network learn the complementary relationship between RGB map and depth map.
Experiments on our dataset show that our method can achieve better accuracy and recall compared with the previous state-of-the-art deep learning method.
arXiv Detail & Related papers (2022-12-02T11:22:52Z)
- Neural Architecture Search for Efficient Uncalibrated Deep Photometric Stereo [105.05232615226602]
We leverage differentiable neural architecture search (NAS) strategy to find uncalibrated PS architecture automatically.
Experiments on the DiLiGenT dataset show that the automatically searched neural architectures perform favorably compared with state-of-the-art uncalibrated PS methods.
arXiv Detail & Related papers (2021-10-11T21:22:17Z)
- Sparse Auxiliary Networks for Unified Monocular Depth Prediction and Completion [56.85837052421469]
Estimating scene geometry from data obtained with cost-effective sensors is key for robots and self-driving cars.
In this paper, we study the problem of predicting dense depth from a single RGB image with optional sparse measurements from low-cost active depth sensors.
We introduce Sparse Auxiliary Networks (SANs), a new module enabling monodepth networks to perform both depth prediction and completion.
arXiv Detail & Related papers (2021-03-30T21:22:26Z)
- Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation [59.94819184452694]
Depth information has proven to be a useful cue in the semantic segmentation of RGBD images for providing a geometric counterpart to the RGB representation.
Most existing works simply assume that depth measurements are accurate and well-aligned with the RGB pixels and model the problem as cross-modal feature fusion.
In this paper, we propose a unified and efficient Cross-modality Guided Encoder to not only effectively recalibrate RGB feature responses, but also to distill accurate depth information via multiple stages and aggregate the two recalibrated representations alternately.
arXiv Detail & Related papers (2020-07-17T18:35:24Z)
- A Single Stream Network for Robust and Real-time RGB-D Salient Object Detection [89.88222217065858]
We design a single stream network to use the depth map to guide early fusion and middle fusion between RGB and depth.
This model is 55.5% lighter than the current lightest model and runs at a real-time speed of 32 FPS when processing a $384 \times 384$ image.
arXiv Detail & Related papers (2020-07-14T04:40:14Z)
- Real-Time High-Performance Semantic Image Segmentation of Urban Street Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves 73.6% and 68.0% mean Intersection over Union (mIoU) at inference speeds of 51.0 fps and 39.3 fps, respectively.
arXiv Detail & Related papers (2020-03-11T08:45:53Z)
- Depthwise Non-local Module for Fast Salient Object Detection Using a Single Thread [136.2224792151324]
We propose a new deep learning algorithm for fast salient object detection.
The proposed algorithm achieves competitive accuracy and high inference efficiency simultaneously with a single CPU thread.
arXiv Detail & Related papers (2020-01-22T15:23:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.