SemAttNet: Towards Attention-based Semantic Aware Guided Depth Completion
- URL: http://arxiv.org/abs/2204.13635v1
- Date: Thu, 28 Apr 2022 16:53:25 GMT
- Title: SemAttNet: Towards Attention-based Semantic Aware Guided Depth Completion
- Authors: Danish Nazir, Marcus Liwicki, Didier Stricker, Muhammad Zeshan Afzal
- Abstract summary: We propose a novel three-branch backbone comprising color-guided, semantic-guided, and depth-guided branches.
The predicted dense depth map of the color-guided branch, along with the semantic image and the sparse depth map, is passed as input to the semantic-guided branch.
The depth-guided branch takes sparse, color, and semantic depths to generate the dense depth map.
- Score: 12.724769241831396
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Depth completion involves recovering a dense depth map from a sparse map and
an RGB image. Recent approaches focus on utilizing color images as guidance
images to recover depth at invalid pixels. However, color images alone are not
enough to provide the necessary semantic understanding of the scene.
Consequently, the depth completion task suffers from sudden illumination
changes in RGB images (e.g., shadows). In this paper, we propose a novel
three-branch backbone comprising color-guided, semantic-guided, and
depth-guided branches. Specifically, the color-guided branch takes a sparse
depth map and an RGB image as input and generates a color depth map that
incorporates color cues (e.g., object boundaries) of the scene. The predicted
dense depth map of the color-guided branch, along with the semantic image and
the sparse depth map, is passed as input to the semantic-guided branch for
estimating semantic depth. The
depth-guided branch takes sparse, color, and semantic depths to generate the
dense depth map. The color depth, semantic depth, and guided depth are
adaptively fused to produce the output of our proposed three-branch backbone.
In addition, we also propose a semantic-aware multi-modal attention-based
fusion block (SAMMAFB) to fuse features across all three branches. We further
use CSPN++ with atrous convolutions to refine the dense depth map produced by
our three-branch backbone. Extensive experiments show that our model achieves
state-of-the-art performance on the KITTI depth completion benchmark at the
time of submission.
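To make the three-branch pipeline concrete, below is a minimal PyTorch sketch of the cascade and a confidence-weighted adaptive fusion. The tiny Branch module, the channel counts, and the softmax-over-confidence fusion rule are illustrative assumptions rather than the authors' architecture (which additionally exchanges features between branches via SAMMAFB).

```python
# Minimal sketch of the three-branch cascade; all module sizes and the
# confidence-based fusion rule are assumptions for illustration only.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1),
                         nn.ReLU(inplace=True))

class Branch(nn.Module):
    """Tiny stand-in for one encoder-decoder branch: predicts a depth
    map plus a per-pixel confidence from its stacked inputs."""
    def __init__(self, in_ch, feat=32):
        super().__init__()
        self.body = nn.Sequential(conv_block(in_ch, feat), conv_block(feat, feat))
        self.head = nn.Conv2d(feat, 2, 3, padding=1)  # [depth, confidence logit]

    def forward(self, x):
        out = self.head(self.body(x))
        return out[:, :1], out[:, 1:]

class ThreeBranchBackbone(nn.Module):
    def __init__(self):
        super().__init__()
        self.color_branch = Branch(3 + 1)         # RGB + sparse depth
        self.semantic_branch = Branch(3 + 1 + 1)  # semantic image + sparse + color depth
        self.depth_branch = Branch(1 + 1 + 1)     # sparse + color + semantic depths

    def forward(self, rgb, semantic, sparse):
        d_col, c_col = self.color_branch(torch.cat([rgb, sparse], 1))
        d_sem, c_sem = self.semantic_branch(torch.cat([semantic, sparse, d_col], 1))
        d_dep, c_dep = self.depth_branch(torch.cat([sparse, d_col, d_sem], 1))
        # Adaptive fusion: per-pixel softmax over the three confidences,
        # then a weighted sum of the three depth predictions.
        w = torch.softmax(torch.cat([c_col, c_sem, c_dep], 1), dim=1)
        d = torch.stack([d_col, d_sem, d_dep], 1).squeeze(2)
        return (w * d).sum(1, keepdim=True)

backbone = ThreeBranchBackbone()
dense = backbone(torch.rand(1, 3, 64, 64),   # RGB image
                 torch.rand(1, 3, 64, 64),   # semantic image (assumed 3-channel)
                 torch.rand(1, 1, 64, 64))   # sparse depth -> (1, 1, 64, 64) out
```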
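The refinement stage can be sketched in the same spirit: one schematic CSPN-style propagation step over a dilated (atrous) 3x3 neighbourhood. CSPN++ itself additionally learns kernel sizes and iteration counts per pixel; the fixed dilation and iteration count here are simplifying assumptions.

```python
# Schematic CSPN-style refinement with a dilated neighbourhood; a
# simplification of CSPN++, not the paper's exact module.
import torch
import torch.nn.functional as F

def cspn_step(depth, affinity, sparse, dilation=2):
    """Each pixel becomes an affinity-weighted mix of its 3x3 dilated
    neighbourhood, then the valid sparse measurements are re-imposed."""
    b, _, h, w = depth.shape
    a = torch.softmax(affinity, dim=1)  # (B, 9, H, W) normalized weights
    nbrs = F.unfold(depth, kernel_size=3, dilation=dilation, padding=dilation)
    nbrs = nbrs.view(b, 9, h, w)
    refined = (a * nbrs).sum(dim=1, keepdim=True)
    valid = (sparse > 0).float()
    return valid * sparse + (1 - valid) * refined  # keep measured depths fixed

depth = torch.rand(1, 1, 64, 64)      # coarse prediction from the backbone
affinity = torch.randn(1, 9, 64, 64)  # would be predicted by the network
sparse = torch.rand(1, 1, 64, 64) * (torch.rand(1, 1, 64, 64) > 0.9).float()
for _ in range(6):                    # a few propagation iterations
    depth = cspn_step(depth, affinity, sparse)
```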
Related papers
- Depth-guided Texture Diffusion for Image Semantic Segmentation [47.46257473475867]
We introduce a Depth-guided Texture Diffusion approach that effectively tackles the challenge of bridging the gap between depth maps and RGB images for semantic segmentation.
Our method extracts low-level features from edges and textures to create a texture image, which is then used to enrich the depth map.
By integrating this enriched depth map with the original RGB image into a joint feature embedding, our method effectively bridges the disparity between the depth map and the image.
arXiv Detail & Related papers (2024-08-17T04:55:03Z)
- RigNet++: Semantic Assisted Repetitive Image Guided Network for Depth Completion [31.70022495622075]
We explore a repetitive design in our image guided network to gradually and sufficiently recover depth values.
In the image branch, we design a dense repetitive hourglass network (DRHN) to extract discriminative image features of complex environments.
In the depth branch, we present a repetitive guidance (RG) module based on dynamic convolution, in which an efficient convolution factorization is proposed to reduce complexity.
In addition, we propose a region-aware spatial propagation network (RASPN) for further depth refinement based on the semantic prior constraint.
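As a rough, hypothetical illustration of guided dynamic convolution with a channel-wise factorization, the sketch below predicts per-pixel depthwise kernels from the guidance features and mixes channels with a cheap 1x1 convolution; the actual RG module's factorization may differ.

```python
# Hypothetical guided dynamic convolution with a depthwise-plus-pointwise
# factorization; an assumption for exposition, not RigNet++'s RG module.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GuidedDynamicConv(nn.Module):
    def __init__(self, ch, k=3):
        super().__init__()
        self.k = k
        self.kernel_gen = nn.Conv2d(ch, ch * k * k, 3, padding=1)  # per-pixel kernels
        self.pointwise = nn.Conv2d(ch, ch, 1)                      # cross-channel mix

    def forward(self, depth_feat, guide_feat):
        b, c, h, w = depth_feat.shape
        k2 = self.k * self.k
        kernels = torch.softmax(
            self.kernel_gen(guide_feat).view(b, c, k2, h, w), dim=2)
        patches = F.unfold(depth_feat, self.k, padding=self.k // 2)
        patches = patches.view(b, c, k2, h, w)
        out = (kernels * patches).sum(dim=2)  # spatially-varying depthwise conv
        return self.pointwise(out)

layer = GuidedDynamicConv(ch=16)
out = layer(torch.rand(1, 16, 32, 32), torch.rand(1, 16, 32, 32))
```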
arXiv Detail & Related papers (2023-09-01T09:11:20Z)
- RDFC-GAN: RGB-Depth Fusion CycleGAN for Indoor Depth Completion [28.634851863097953]
We propose a novel two-branch end-to-end fusion network named RDFC-GAN.
It takes a pair of RGB and incomplete depth images as input to predict a dense and completed depth map.
The first branch employs an encoder-decoder structure that adheres to the Manhattan world assumption.
The other branch applies an RGB-depth fusion CycleGAN, adept at translating RGB imagery into detailed, textured depth maps.
arXiv Detail & Related papers (2023-06-06T11:03:05Z)
- RGB-Depth Fusion GAN for Indoor Depth Completion [29.938869342958125]
In this paper, we design a novel two-branch end-to-end fusion network, which takes a pair of RGB and incomplete depth images as input to predict a dense and completed depth map.
In one branch, we propose an RGB-depth fusion GAN to transfer the RGB image to the fine-grained textured depth map.
In the other branch, we adopt adaptive fusion modules named W-AdaIN to propagate the features across the two branches.
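A minimal sketch of what AdaIN-style cross-branch fusion with a learned per-channel blend weight could look like follows; this AdaINFusion module is a guessed approximation of the flavour of W-AdaIN, whose exact formulation may differ.

```python
# AdaIN-style fusion with a learned per-channel blend weight; a guessed
# approximation of W-AdaIN, not the paper's exact module.
import torch
import torch.nn as nn

class AdaINFusion(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.alpha = nn.Parameter(torch.full((1, ch, 1, 1), 0.5))  # blend weight

    def forward(self, x, y, eps=1e-5):
        # Re-normalize branch x with the channel statistics of branch y.
        mu_x = x.mean(dim=(2, 3), keepdim=True)
        sd_x = x.std(dim=(2, 3), keepdim=True) + eps
        mu_y = y.mean(dim=(2, 3), keepdim=True)
        sd_y = y.std(dim=(2, 3), keepdim=True) + eps
        adain = (x - mu_x) / sd_x * sd_y + mu_y
        return self.alpha * adain + (1 - self.alpha) * x

fuse = AdaINFusion(ch=16)
out = fuse(torch.rand(2, 16, 32, 32), torch.rand(2, 16, 32, 32))
```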
arXiv Detail & Related papers (2022-03-21T10:26:38Z)
- Joint Learning of Salient Object Detection, Depth Estimation and Contour Extraction [91.43066633305662]
We propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD).
Specifically, we unify three complementary tasks: depth estimation, salient object detection, and contour estimation. The multi-task mechanism encourages the model to learn task-aware features from the auxiliary tasks.
Experiments show that it not only significantly surpasses the depth-based RGB-D SOD methods on multiple datasets, but also precisely predicts a high-quality depth map and salient contour at the same time.
arXiv Detail & Related papers (2022-03-09T17:20:18Z)
- DnD: Dense Depth Estimation in Crowded Dynamic Indoor Scenes [68.38952377590499]
We present a novel approach for estimating depth from a monocular camera as it moves through complex indoor environments.
Our approach predicts absolute-scale depth maps over the entire scene, consisting of a static background and multiple moving people.
arXiv Detail & Related papers (2021-08-12T09:12:39Z)
- Sparse Auxiliary Networks for Unified Monocular Depth Prediction and Completion [56.85837052421469]
Estimating scene geometry from data obtained with cost-effective sensors is key for robots and self-driving cars.
In this paper, we study the problem of predicting dense depth from a single RGB image with optional sparse measurements from low-cost active depth sensors.
We introduce Sparse Networks (SANs), a new module enabling monodepth networks to perform both the tasks of depth prediction and completion.
arXiv Detail & Related papers (2021-03-30T21:22:26Z)
- PENet: Towards Precise and Efficient Image Guided Depth Completion [11.162415111320625]
How to fuse the color and depth modalities plays an important role in achieving good performance.
This paper proposes a two-branch backbone that consists of a color-dominant branch and a depth-dominant branch.
The proposed full model ranks 1st in the KITTI depth completion online leaderboard at the time of submission.
arXiv Detail & Related papers (2021-03-01T06:09:23Z)
- Efficient Depth Completion Using Learned Bases [94.0808155168311]
We propose a new global geometry constraint for depth completion.
By assuming that depth maps often lie on low-dimensional subspaces, a dense depth map can be approximated by a weighted sum of full-resolution principal depth bases.
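The mechanics of this are easy to demonstrate: given a basis matrix, the per-image weights can be fit to the sparse measurements by least squares. In the sketch below the bases are random stand-ins (the paper learns them from data), so it only shows the reconstruction step.

```python
# Toy illustration: a dense depth map as a weighted sum of depth bases,
# with weights fit to sparse measurements; random bases stand in for the
# learned ones.
import numpy as np

h, w, k = 48, 64, 16
bases = np.random.randn(h * w, k)              # k full-resolution bases
dense_gt = bases @ np.random.randn(k)          # synthetic dense depth

mask = np.random.rand(h * w) < 0.05            # ~5% sparse measurements
coef, *_ = np.linalg.lstsq(bases[mask], dense_gt[mask], rcond=None)
dense_pred = (bases @ coef).reshape(h, w)      # approximated dense map
```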
arXiv Detail & Related papers (2020-12-02T11:57:37Z)
- Depth Completion Using a View-constrained Deep Prior [73.21559000917554]
Recent work has shown that the structure of convolutional neural networks (CNNs) induces a strong prior that favors natural images.
This prior, known as a deep image prior (DIP), is an effective regularizer in inverse problems such as image denoising and inpainting.
We extend the concept of the DIP to depth images. Given color images and noisy, incomplete target depth maps, we reconstruct a restored depth map by using the CNN structure itself as a prior.
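A toy deep-image-prior loop for depth completion looks like the following: a small CNN fed a fixed random input is optimized to match the observed depth only at valid pixels, so the network structure regularizes the unobserved regions. Network size, mask density, and iteration count are arbitrary assumptions.

```python
# Toy DIP loop for depth: fit observed pixels only; the CNN's structure
# (plus early stopping) acts as the prior for the missing regions.
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Conv2d(8, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)
z = torch.randn(1, 8, 64, 64)                    # fixed random code
target = torch.rand(1, 1, 64, 64)                # noisy, incomplete depth
mask = (torch.rand(1, 1, 64, 64) > 0.8).float()  # where depth was observed

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(200):                             # stop early; overfitting hurts
    opt.zero_grad()
    loss = ((net(z) - target) ** 2 * mask).mean()
    loss.backward()
    opt.step()
completed = net(z).detach()                      # dense restored depth
```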
arXiv Detail & Related papers (2020-01-21T21:56:01Z)