Gated Cross-Attention Network for Depth Completion
- URL: http://arxiv.org/abs/2309.16301v2
- Date: Sun, 21 Jan 2024 16:27:06 GMT
- Title: Gated Cross-Attention Network for Depth Completion
- Authors: Xiaogang Jia, Songlei Jian, Yusong Tan, Yonggang Che, Wei Chen and
Zhengfa Liang
- Abstract summary: Fusion of color and depth features is the current critical challenge in the field of depth estimation.
We design an efficient Gated Cross-Attention Network that propagates confidence via a gating mechanism.
We use the Ray Tune framework with the AsyncHyperBandScheduler and the HyperOptSearch algorithm to automatically search for the optimal number of module iterations.
- Score: 11.154059839763738
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Depth completion is a popular research direction in the field of depth
estimation. The fusion of color and depth features is the current critical
challenge in this task, mainly due to the asymmetry between the rich scene
details in color images and the sparse pixels in depth maps. To tackle this
issue, we design an efficient Gated Cross-Attention Network that propagates
confidence via a gating mechanism, simultaneously extracting and refining key
information in both color and depth branches to achieve local spatial feature
fusion. Additionally, we employ an attention network based on the Transformer
in low-dimensional space to effectively fuse global features and increase the
network's receptive field. With a simple yet efficient gating mechanism, our
proposed method achieves fast and accurate depth completion without the need
for additional branches or post-processing steps. At the same time, we use the
Ray Tune framework with the AsyncHyperBandScheduler and the HyperOptSearch
algorithm to automatically search for the optimal number of module iterations,
which also allows us to achieve performance comparable to state-of-the-art
methods. We conduct experiments on both indoor and outdoor
scene datasets. Our fast network achieves Pareto-optimal solutions in terms of
time and accuracy, and at the time of submission, our accurate network ranks
first among all published papers on the KITTI official website in terms of
accuracy.
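A minimal NumPy sketch of the confidence-gated fusion idea described in the abstract: a sigmoid gate, predicted from both branches, weights the depth features by their estimated confidence and backfills with color features where depth is sparse. All names and the single-pixel linear gate are hypothetical simplifications; the paper's actual module operates on convolutional feature maps inside the network.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(color_feat, depth_feat, w_gate, b_gate):
    """Fuse color and depth features with a confidence gate.

    The gate g in (0, 1) is predicted from the concatenated features;
    it weights the depth branch by its estimated confidence and lets
    the color branch fill in where depth is sparse or unreliable.
    (Hypothetical simplification of the paper's gating module.)
    """
    x = np.concatenate([color_feat, depth_feat], axis=-1)  # (..., 2C)
    g = sigmoid(x @ w_gate + b_gate)                       # (..., C)
    # per-channel convex combination of the two branches
    return g * depth_feat + (1.0 - g) * color_feat

# toy example: C = 4 feature channels at a single pixel
rng = np.random.default_rng(0)
C = 4
color = rng.normal(size=(C,))
depth = rng.normal(size=(C,))
w = rng.normal(size=(2 * C, C)) * 0.1
b = np.zeros(C)
fused = gated_fusion(color, depth, w, b)
```

Because the gate lies in (0, 1), each fused channel is a convex combination of the corresponding color and depth values, so the fusion can never produce a response outside the range spanned by the two branches.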
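The abstract's automated search for the number of module iterations uses Ray Tune's AsyncHyperBandScheduler. The toy, pure-Python sketch below illustrates the synchronous successive-halving principle behind that scheduler: every candidate gets a small budget, only the best fraction survives each round, and survivors receive a larger budget. The objective, candidate values, and budgets are all hypothetical; Ray Tune additionally promotes trials asynchronously and pairs the scheduler with HyperOptSearch for sampling new configurations.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=2, rounds=3):
    """Successive halving: score every candidate at a small budget, keep
    the best 1/eta fraction, and give survivors eta times more budget
    each round. AsyncHyperBandScheduler applies the same idea
    asynchronously, promoting trials without waiting for stragglers."""
    survivors = list(configs)
    budget = min_budget
    for _ in range(rounds):
        scored = sorted(survivors, key=lambda c: evaluate(c, budget))
        keep = max(1, len(scored) // eta)   # keep the best 1/eta fraction
        survivors = scored[:keep]
        budget *= eta                       # survivors get more budget
    return survivors[0]

# hypothetical objective: validation error as a function of the number
# of module iterations, shrinking as the training budget grows
def evaluate(n_iters, budget):
    return (n_iters - 3) ** 2 + 1.0 / budget

best = successive_halving(configs=[1, 2, 3, 4, 5, 6], evaluate=evaluate)
# here the search settles on n_iters = 3, the minimizer of the toy objective
```

The appeal for this paper's setting is that most candidate iteration counts are discarded after only a cheap partial training run, so the search cost stays far below training every candidate to convergence.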
Related papers
- A Concise but High-performing Network for Image Guided Depth Completion in Autonomous Driving [31.916031298901988]
We propose a concise but effective network, named CENet, to achieve high-performance depth completion.
Unlike other commonly used complicated guidance modules, our approach is intuitive and low-cost.
Based on the simple structure of a dual encoder and a single decoder, our CENet achieves a superior balance between accuracy and efficiency.
arXiv Detail & Related papers (2024-01-29T06:06:45Z)
- RGB-D based Stair Detection using Deep Learning for Autonomous Stair Climbing [6.362951673024623]
We propose a neural network architecture with inputs of both RGB map and depth map.
Specifically, we design the selective module which can make the network learn the complementary relationship between RGB map and depth map.
Experiments on our dataset show that our method can achieve better accuracy and recall compared with the previous state-of-the-art deep learning method.
arXiv Detail & Related papers (2022-12-02T11:22:52Z)
- Neural Architecture Search for Efficient Uncalibrated Deep Photometric Stereo [105.05232615226602]
We leverage differentiable neural architecture search (NAS) strategy to find uncalibrated PS architecture automatically.
Experiments on the DiLiGenT dataset show that the automatically searched neural architectures perform favorably compared with state-of-the-art uncalibrated PS methods.
arXiv Detail & Related papers (2021-10-11T21:22:17Z)
- Sparse Auxiliary Networks for Unified Monocular Depth Prediction and Completion [56.85837052421469]
Estimating scene geometry from data obtained with cost-effective sensors is key for robots and self-driving cars.
In this paper, we study the problem of predicting dense depth from a single RGB image with optional sparse measurements from low-cost active depth sensors.
We introduce Sparse Auxiliary Networks (SANs), a new module enabling monodepth networks to perform both depth prediction and completion.
arXiv Detail & Related papers (2021-03-30T21:22:26Z)
- Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation [59.94819184452694]
Depth information has proven to be a useful cue in the semantic segmentation of RGBD images for providing a geometric counterpart to the RGB representation.
Most existing works simply assume that depth measurements are accurate and well-aligned with the RGB pixels and model the problem as cross-modal feature fusion.
In this paper, we propose a unified and efficient Cross-modality Guided Encoder to not only effectively recalibrate RGB feature responses, but also to distill accurate depth information via multiple stages and aggregate the two recalibrated representations alternately.
arXiv Detail & Related papers (2020-07-17T18:35:24Z)
- A Single Stream Network for Robust and Real-time RGB-D Salient Object Detection [89.88222217065858]
We design a single stream network to use the depth map to guide early fusion and middle fusion between RGB and depth.
This model is 55.5% lighter than the current lightest model and runs at a real-time speed of 32 FPS when processing a $384 \times 384$ image.
arXiv Detail & Related papers (2020-07-14T04:40:14Z)
- Real-Time High-Performance Semantic Image Segmentation of Urban Street Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves 73.6% and 68.0% mean Intersection over Union (mIoU) at inference speeds of 51.0 fps and 39.3 fps, respectively.
arXiv Detail & Related papers (2020-03-11T08:45:53Z)
- Depthwise Non-local Module for Fast Salient Object Detection Using a Single Thread [136.2224792151324]
We propose a new deep learning algorithm for fast salient object detection.
The proposed algorithm achieves competitive accuracy and high inference efficiency simultaneously with a single CPU thread.
arXiv Detail & Related papers (2020-01-22T15:23:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.