RGB-Depth Fusion GAN for Indoor Depth Completion
- URL: http://arxiv.org/abs/2203.10856v1
- Date: Mon, 21 Mar 2022 10:26:38 GMT
- Title: RGB-Depth Fusion GAN for Indoor Depth Completion
- Authors: Haowen Wang, Mingyuan Wang, Zhengping Che, Zhiyuan Xu, Xiuquan Qiao,
Mengshi Qi, Feifei Feng, Jian Tang
- Abstract summary: In this paper, we design a novel two-branch end-to-end fusion network, which takes a pair of RGB and incomplete depth images as input to predict a dense and completed depth map.
In one branch, we propose an RGB-depth fusion GAN to translate the RGB image into a fine-grained textured depth map.
In the other branch, we adopt adaptive fusion modules named W-AdaIN to propagate the features across the two branches.
- Score: 29.938869342958125
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The raw depth image captured by indoor depth sensors usually has an
extensive range of missing depth values due to inherent limitations such as the
inability to perceive transparent objects and a limited distance range. The
incomplete depth map burdens many downstream vision tasks, and a rising number
of depth completion methods have been proposed to alleviate this issue. While
most existing methods can generate accurate dense depth maps from sparse and
uniformly sampled depth maps, they are not suitable for completing the large
contiguous regions of missing depth values that are common and critical in practice. In
this paper, we design a novel two-branch end-to-end fusion network, which takes
a pair of RGB and incomplete depth images as input to predict a dense and
completed depth map. The first branch employs an encoder-decoder structure to
regress the local dense depth values from the raw depth map, with the help of
local guidance information extracted from the RGB image. In the other branch,
we propose an RGB-depth fusion GAN to translate the RGB image into a
fine-grained textured depth map. We adopt adaptive fusion modules named W-AdaIN
to propagate the features across the two branches, and we append a confidence
fusion head to fuse the two outputs of the branches for the final depth map.
Extensive experiments on NYU-Depth V2 and SUN RGB-D demonstrate that our
proposed method clearly improves the depth completion performance, especially
in a more realistic setting of indoor environments with the help of the pseudo
depth map.
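To make the fusion concrete, below is a minimal PyTorch sketch of the two ideas in the abstract: AdaIN-style feature propagation between the branches (a stand-in for W-AdaIN, whose exact formulation is not given here) and a confidence fusion head that combines the two branch depth maps with per-pixel softmax weights. All module and parameter names are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


def adain(content: torch.Tensor, style: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Plain AdaIN, used as a stand-in for W-AdaIN: re-normalize the content
    features with the per-channel statistics of the other branch's features."""
    c_mean = content.mean(dim=(2, 3), keepdim=True)
    c_std = content.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style.mean(dim=(2, 3), keepdim=True)
    s_std = style.std(dim=(2, 3), keepdim=True) + eps
    return s_std * (content - c_mean) / c_std + s_mean


class ConfidenceFusionHead(nn.Module):
    """Fuse the two branch depth maps with per-pixel confidence weights."""

    def __init__(self, feat_channels: int = 64):
        super().__init__()
        # One 1x1 conv per branch predicts a confidence logit map from its features.
        self.conf_local = nn.Conv2d(feat_channels, 1, kernel_size=1)
        self.conf_gan = nn.Conv2d(feat_channels, 1, kernel_size=1)

    def forward(self, depth_local, feat_local, depth_gan, feat_gan):
        # depth_*: (B, 1, H, W) branch depth predictions.
        # feat_*:  (B, C, H, W) branch features used to estimate confidence.
        logits = torch.cat([self.conf_local(feat_local),
                            self.conf_gan(feat_gan)], dim=1)   # (B, 2, H, W)
        weights = torch.softmax(logits, dim=1)                 # per-pixel weights sum to 1
        return weights[:, :1] * depth_local + weights[:, 1:] * depth_gan
```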
Related papers
- SteeredMarigold: Steering Diffusion Towards Depth Completion of Largely Incomplete Depth Maps [3.399289369740637]
SteeredMarigold is a training-free, zero-shot depth completion method.
It produces metric dense depth even for largely incomplete depth maps.
Our code will be publicly available.
arXiv Detail & Related papers (2024-09-16T11:52:13Z) - RDFC-GAN: RGB-Depth Fusion CycleGAN for Indoor Depth Completion [28.634851863097953]
We propose a novel two-branch end-to-end fusion network named RDFC-GAN.
It takes a pair of RGB and incomplete depth images as input to predict a dense and completed depth map.
The first branch employs an encoder-decoder structure that adheres to the Manhattan world assumption.
The other branch applies an RGB-depth fusion CycleGAN, adept at translating RGB imagery into detailed, textured depth maps.
arXiv Detail & Related papers (2023-06-06T11:03:05Z) - Symmetric Uncertainty-Aware Feature Transmission for Depth
Super-Resolution [52.582632746409665]
We propose a novel Symmetric Uncertainty-aware Feature Transmission (SUFT) for color-guided DSR.
Our method achieves superior performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-06-01T06:35:59Z) - SemAttNet: Towards Attention-based Semantic Aware Guided Depth
Completion [12.724769241831396]
We propose a novel three-branch backbone comprising color-guided, semantic-guided, and depth-guided branches.
The predicted dense depth map of the color-guided branch, along with the semantic image and the sparse depth map, is passed as input to the semantic-guided branch.
The depth-guided branch then takes the sparse, color, and semantic depths to generate the final dense depth map.
arXiv Detail & Related papers (2022-04-28T16:53:25Z) - Joint Learning of Salient Object Detection, Depth Estimation and Contour
Extraction [91.43066633305662]
We propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD).
Specifically, we unify three complementary tasks: depth estimation, salient object detection, and contour estimation. The multi-task mechanism encourages the model to learn task-aware features from the auxiliary tasks.
Experiments show that it not only significantly surpasses the depth-based RGB-D SOD methods on multiple datasets, but also precisely predicts a high-quality depth map and salient contour at the same time.
arXiv Detail & Related papers (2022-03-09T17:20:18Z) - BridgeNet: A Joint Learning Network of Depth Map Super-Resolution and
Monocular Depth Estimation [60.34562823470874]
We propose a joint learning network of depth map super-resolution (DSR) and monocular depth estimation (MDE) without introducing additional supervision labels; the two tasks are connected by two bridges.
One is the high-frequency attention bridge (HABdg), designed for the feature encoding process, which learns high-frequency information from the MDE task to guide the DSR task.
The other is the content guidance bridge (CGBdg), designed for the depth map reconstruction process, which provides the content guidance learned from the DSR task to the MDE task.
arXiv Detail & Related papers (2021-07-27T01:28:23Z) - SGTBN: Generating Dense Depth Maps from Single-Line LiDAR [13.58227120045849]
Current depth completion methods use extremely expensive 64-line LiDAR to obtain sparse depth maps.
Compared with the 64-line LiDAR, the single-line LiDAR is much less expensive and much more robust.
A single-line depth completion dataset is proposed based on the existing 64-line depth completion dataset.
arXiv Detail & Related papers (2021-06-24T13:08:35Z) - Towards Fast and Accurate Real-World Depth Super-Resolution: Benchmark
Dataset and Baseline [48.69396457721544]
We build a large-scale dataset named "RGB-D-D" to promote the study of depth map super-resolution (SR).
We provide a fast depth map super-resolution (FDSR) baseline, in which the high-frequency component adaptively decomposed from the RGB image guides the depth map SR.
For real-world LR depth maps, our algorithm produces more accurate HR depth maps with clearer boundaries and, to some extent, corrects the depth value errors.
arXiv Detail & Related papers (2021-04-13T13:27:26Z) - Sparse Auxiliary Networks for Unified Monocular Depth Prediction and
Completion [56.85837052421469]
Estimating scene geometry from data obtained with cost-effective sensors is key for robots and self-driving cars.
In this paper, we study the problem of predicting dense depth from a single RGB image with optional sparse measurements from low-cost active depth sensors.
We introduce Sparse Auxiliary Networks (SANs), a new module enabling monodepth networks to perform both depth prediction and completion.
arXiv Detail & Related papers (2021-03-30T21:22:26Z) - Adaptive Illumination based Depth Sensing using Deep Learning [18.72398843488572]
Various techniques have been proposed to estimate a dense depth map based on fusion of the sparse depth map measurement with the RGB image.
Recent advances in hardware enable adaptive depth measurements resulting in further improvement of the dense depth map estimation.
We show that such adaptive sampling masks can generalize well to many RGB and sparse depth fusion algorithms under a variety of sampling rates.
arXiv Detail & Related papers (2021-03-23T04:21:07Z) - Efficient Depth Completion Using Learned Bases [94.0808155168311]
We propose a new global geometry constraint for depth completion.
By assuming that depth maps often lie on low-dimensional subspaces, a dense depth map can be approximated by a weighted sum of full-resolution principal depth bases (see the sketch below).
arXiv Detail & Related papers (2020-12-02T11:57:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.