SemAttNet: Towards Attention-based Semantic Aware Guided Depth Completion
- URL: http://arxiv.org/abs/2204.13635v1
- Date: Thu, 28 Apr 2022 16:53:25 GMT
- Title: SemAttNet: Towards Attention-based Semantic Aware Guided Depth Completion
- Authors: Danish Nazir, Marcus Liwicki, Didier Stricker, Muhammad Zeshan Afzal
- Abstract summary: We propose a novel three-branch backbone comprising color-guided, semantic-guided, and depth-guided branches.
The predicted dense depth map of the color-guided branch, along with the semantic image and sparse depth map, is passed as input to the semantic-guided branch.
The depth-guided branch takes the sparse, color, and semantic depths to generate the dense depth map.
- Score: 12.724769241831396
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Depth completion involves recovering a dense depth map from a sparse map and
an RGB image. Recent approaches focus on utilizing color images as guidance
images to recover depth at invalid pixels. However, color images alone are not
enough to provide the necessary semantic understanding of the scene.
Consequently, the depth completion task suffers from sudden illumination
changes in RGB images (e.g., shadows). In this paper, we propose a novel
three-branch backbone comprising color-guided, semantic-guided, and
depth-guided branches. Specifically, the color-guided branch takes a sparse
depth map and an RGB image as input and generates a color depth map, which
includes color cues (e.g., object boundaries) of the scene. The predicted
dense depth map of the color-guided branch, along with the semantic image
and the sparse depth map, is passed as input to the semantic-guided branch
for estimating semantic depth. The depth-guided branch takes the sparse,
color, and semantic depths to generate the dense depth map. The color depth,
semantic depth, and guided depth are adaptively fused to produce the output
of our proposed three-branch backbone. In addition, we propose a
semantic-aware multi-modal attention-based fusion block (SAMMAFB) to fuse
features across all three branches. We further use CSPN++ with atrous
convolutions to refine the dense depth map produced by our three-branch
backbone. Extensive experiments show that our model achieves
state-of-the-art performance on the KITTI depth completion benchmark at the
time of submission.
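As a rough illustration of the data flow described above, a minimal PyTorch sketch might look as follows. The module names, channel widths, and the confidence-weighted fusion are illustrative assumptions rather than the authors' implementation, and both SAMMAFB and the CSPN++ refinement stage are omitted:

```python
import torch
import torch.nn as nn

class Branch(nn.Module):
    """Stand-in encoder-decoder; the paper's branches are far deeper."""
    def __init__(self, in_ch):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 2, 3, padding=1),  # ch 0: depth, ch 1: confidence
        )

    def forward(self, x):
        out = self.net(x)
        return out[:, :1], out[:, 1:]  # (depth map, confidence map)

class ThreeBranchBackbone(nn.Module):
    def __init__(self):
        super().__init__()
        self.color_branch = Branch(3 + 1)         # RGB + sparse depth
        self.semantic_branch = Branch(3 + 1 + 1)  # semantic + sparse + color depth
        self.depth_branch = Branch(1 + 1 + 1)     # sparse + color + semantic depths

    def forward(self, rgb, semantic, sparse):
        color_d, c_conf = self.color_branch(torch.cat([rgb, sparse], 1))
        sem_d, s_conf = self.semantic_branch(torch.cat([semantic, sparse, color_d], 1))
        dep_d, d_conf = self.depth_branch(torch.cat([sparse, color_d, sem_d], 1))
        # Adaptive fusion: softmax over the three per-pixel confidences.
        w = torch.softmax(torch.cat([c_conf, s_conf, d_conf], 1), dim=1)
        fused = (w * torch.cat([color_d, sem_d, dep_d], 1)).sum(1, keepdim=True)
        return fused

backbone = ThreeBranchBackbone()
out = backbone(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64),
               torch.rand(1, 1, 64, 64))
print(out.shape)  # torch.Size([1, 1, 64, 64])
```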
Related papers
- DepthLab: From Partial to Complete [80.58276388743306]
Missing values remain a common challenge for depth data across its wide range of applications.
This work addresses that challenge with DepthLab, a foundation depth inpainting model powered by image diffusion priors.
Our approach proves its worth in various downstream tasks, including 3D scene inpainting, text-to-3D scene generation, sparse-view reconstruction with DUST3R, and LiDAR depth completion.
arXiv Detail & Related papers (2024-12-24T04:16:38Z)
- Marigold-DC: Zero-Shot Monocular Depth Completion with Guided Diffusion [51.69876947593144]
Existing methods for depth completion operate in tightly constrained settings.
Inspired by advances in monocular depth estimation, we reframe depth completion as image-conditional depth map generation.
Marigold-DC builds on a pretrained latent diffusion model for monocular depth estimation and injects the depth observations as test-time guidance.
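A heavily simplified sketch of such test-time guidance is shown below; the denoiser is a toy placeholder for the pretrained latent diffusion model, and the update rule only illustrates the general idea of steering sampling with sparse observations:

```python
import torch

def denoise_step(x, t):
    # Toy placeholder: the real model predicts a clean depth estimate
    # from (noisy latent, timestep, conditioning image).
    return 0.9 * x

def guided_completion(sparse_depth, mask, steps=50, guidance_scale=0.5):
    """sparse_depth/mask: (H, W) tensors; mask is 1 where depth is observed."""
    x = torch.randn_like(sparse_depth)
    for t in range(steps, 0, -1):
        x = x.detach().requires_grad_(True)
        x0_pred = denoise_step(x, t)  # current clean-depth estimate
        # Guidance loss: disagreement with the sparse measurements.
        loss = ((x0_pred - sparse_depth)[mask.bool()] ** 2).mean()
        grad = torch.autograd.grad(loss, x)[0]
        # Nudge the sample toward the observations before the next step.
        x = x0_pred - guidance_scale * grad
    return x.detach()

sparse = torch.zeros(64, 64)
mask = torch.zeros(64, 64)
mask[::8, ::8] = 1.0
sparse[mask.bool()] = 5.0
print(guided_completion(sparse, mask).shape)  # torch.Size([64, 64])
```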
arXiv Detail & Related papers (2024-12-18T00:06:41Z)
- Depth-guided Texture Diffusion for Image Semantic Segmentation [47.46257473475867]
We introduce a Depth-guided Texture Diffusion approach that effectively tackles the outlined challenge.
Our method extracts low-level features from edges and textures to create a texture image, which is then used to enrich the depth map.
By integrating this enriched depth map with the original RGB image into a joint feature embedding, our method effectively bridges the disparity between the depth map and the image.
arXiv Detail & Related papers (2024-08-17T04:55:03Z)
- RigNet++: Semantic Assisted Repetitive Image Guided Network for Depth Completion [31.70022495622075]
We explore a repetitive design in our image guided network to gradually and sufficiently recover depth values.
In one branch, we design a dense repetitive hourglass network (DRHN) to extract discriminative image features of complex environments.
In the other branch, we present a repetitive guidance (RG) module based on dynamic convolution, in which an efficient convolution factorization is proposed to reduce complexity.
In addition, we propose a region-aware spatial propagation network (RASPN) for further depth refinement based on the semantic prior constraint.
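The "efficient convolution factorization" of the RG module is not spelled out in this summary; one common pattern is to predict only a depthwise k×k kernel per pixel from image features instead of a full cross-channel kernel. The sketch below shows that generic pattern, not RigNet++'s actual module:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FactorizedDynamicConv(nn.Module):
    """Guided depthwise dynamic convolution: image features predict a k*k
    spatial kernel per channel and pixel, avoiding full C*C*k*k kernels."""
    def __init__(self, channels, k=3):
        super().__init__()
        self.k = k
        self.kernel_head = nn.Conv2d(channels, channels * k * k, 1)

    def forward(self, depth_feat, image_feat):
        B, C, H, W = depth_feat.shape
        kernels = self.kernel_head(image_feat).view(B, C, self.k**2, H, W)
        kernels = torch.softmax(kernels, dim=2)       # normalized local weights
        patches = F.unfold(depth_feat, self.k, padding=self.k // 2)
        patches = patches.view(B, C, self.k**2, H, W)
        return (kernels * patches).sum(dim=2)         # per-pixel guided filtering

m = FactorizedDynamicConv(16)
print(m(torch.rand(1, 16, 32, 32), torch.rand(1, 16, 32, 32)).shape)
```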
arXiv Detail & Related papers (2023-09-01T09:11:20Z)
- RDFC-GAN: RGB-Depth Fusion CycleGAN for Indoor Depth Completion [28.634851863097953]
We propose a novel two-branch end-to-end fusion network named RDFC-GAN.
It takes a pair of RGB and incomplete depth images as input to predict a dense and completed depth map.
The first branch employs an encoder-decoder structure that adheres to the Manhattan world assumption.
The other branch applies an RGB-depth fusion CycleGAN, adept at translating RGB imagery into detailed, textured depth maps.
arXiv Detail & Related papers (2023-06-06T11:03:05Z)
- RGB-Depth Fusion GAN for Indoor Depth Completion [29.938869342958125]
In this paper, we design a novel two-branch end-to-end fusion network, which takes a pair of RGB and incomplete depth images as input to predict a dense and completed depth map.
In one branch, we propose an RGB-depth fusion GAN to translate the RGB image into a fine-grained textured depth map.
In the other branch, we adopt adaptive fusion modules named W-AdaIN to propagate the features across the two branches.
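The W-AdaIN design itself is not described in this summary; below is a generic AdaIN-style sketch of cross-branch feature propagation (one branch's features re-normalized with a scale and shift predicted from the other), which the actual module presumably elaborates on:

```python
import torch
import torch.nn as nn

class CrossBranchAdaIN(nn.Module):
    """AdaIN-style modulation: normalize one branch's features, then apply
    a scale and shift predicted from the other branch's features."""
    def __init__(self, channels):
        super().__init__()
        self.to_scale = nn.Conv2d(channels, channels, 1)
        self.to_shift = nn.Conv2d(channels, channels, 1)

    def forward(self, feat, guide):
        mu = feat.mean(dim=(2, 3), keepdim=True)
        std = feat.std(dim=(2, 3), keepdim=True) + 1e-5
        return (feat - mu) / std * self.to_scale(guide) + self.to_shift(guide)

adain = CrossBranchAdaIN(32)
print(adain(torch.rand(2, 32, 16, 16), torch.rand(2, 32, 16, 16)).shape)
```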
arXiv Detail & Related papers (2022-03-21T10:26:38Z)
- DnD: Dense Depth Estimation in Crowded Dynamic Indoor Scenes [68.38952377590499]
We present a novel approach for estimating depth from a monocular camera as it moves through complex indoor environments.
Our approach predicts absolute-scale depth maps over the entire scene, which consists of a static background and multiple moving people.
arXiv Detail & Related papers (2021-08-12T09:12:39Z)
- PENet: Towards Precise and Efficient Image Guided Depth Completion [11.162415111320625]
How to fuse the color and depth modalities plays an important role in achieving good performance.
This paper proposes a two-branch backbone that consists of a color-dominant branch and a depth-dominant branch.
The proposed full model ranks 1st in the KITTI depth completion online leaderboard at the time of submission.
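PENet's exact fusion is not reproduced here, but combining two branch-wise depth predictions by per-pixel confidence weighting is the standard pattern such two-branch models use; a minimal sketch:

```python
import torch

def fuse_two_branch(depth_cd, conf_cd, depth_dd, conf_dd):
    """Confidence-weighted fusion of color-dominant (CD) and depth-dominant
    (DD) predictions. Inputs are (B, 1, H, W); confidences are logits."""
    w = torch.softmax(torch.cat([conf_cd, conf_dd], dim=1), dim=1)
    return w[:, :1] * depth_cd + w[:, 1:] * depth_dd

d1, c1, d2, c2 = (torch.rand(1, 1, 8, 8) for _ in range(4))
print(fuse_two_branch(d1, c1, d2, c2).shape)  # torch.Size([1, 1, 8, 8])
```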
arXiv Detail & Related papers (2021-03-01T06:09:23Z)
- Efficient Depth Completion Using Learned Bases [94.0808155168311]
We propose a new global geometry constraint for depth completion.
By assuming depth maps often lie on low-dimensional subspaces, a dense depth map can be approximated by a weighted sum of full-resolution principal depth bases.
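The core idea can be illustrated with plain least squares: fit the basis weights on the sparse valid pixels, then evaluate the bases at full resolution. The bases below are random stand-ins for the learned ones:

```python
import numpy as np

H, W, K, N = 48, 64, 16, 300              # image size, #bases, #sparse points
rng = np.random.default_rng(0)
bases = rng.standard_normal((H * W, K))   # stand-in for learned depth bases
mean = np.full(H * W, 10.0)               # stand-in for the mean depth map

idx = rng.choice(H * W, size=N, replace=False)            # observed pixels
sparse = mean[idx] + bases[idx] @ rng.standard_normal(K)  # synthetic LiDAR

# Least-squares fit of K basis weights to the N sparse observations.
w, *_ = np.linalg.lstsq(bases[idx], sparse - mean[idx], rcond=None)
dense = (mean + bases @ w).reshape(H, W)  # full-resolution completed depth
print(dense.shape)  # (48, 64)
```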
arXiv Detail & Related papers (2020-12-02T11:57:37Z)
- Depth Completion Using a View-constrained Deep Prior [73.21559000917554]
Recent work has shown that the structure of convolutional neural networks (CNNs) induces a strong prior that favors natural images.
This prior, known as a deep image prior (DIP), is an effective regularizer in inverse problems such as image denoising and inpainting.
We extend the concept of the DIP to depth images. Given color images and noisy, incomplete target depth maps, we reconstruct a depth map by using the CNN structure itself as the prior.
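A compact sketch of DIP-style depth completion follows; the architecture, iteration count, and synthetic data are illustrative choices, not the paper's setup:

```python
import torch
import torch.nn as nn

# Fit a small CNN, fed a fixed noise tensor, to the observed depth pixels
# only; the network structure regularizes the prediction at the holes.
net = nn.Sequential(
    nn.Conv2d(8, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)
z = torch.randn(1, 8, 64, 64)                    # fixed random input
target = torch.rand(1, 1, 64, 64) * 10           # noisy/incomplete depth
mask = (torch.rand_like(target) < 0.1).float()   # 10% of pixels observed

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(500):   # early stopping acts as the main regularizer
    opt.zero_grad()
    loss = ((net(z) - target) ** 2 * mask).sum() / mask.sum()
    loss.backward()
    opt.step()
completed = net(z).detach()  # dense prediction everywhere, incl. holes
```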
arXiv Detail & Related papers (2020-01-21T21:56:01Z)