High-resolution Depth Maps Imaging via Attention-based Hierarchical
Multi-modal Fusion
- URL: http://arxiv.org/abs/2104.01530v1
- Date: Sun, 4 Apr 2021 03:28:33 GMT
- Title: High-resolution Depth Maps Imaging via Attention-based Hierarchical
Multi-modal Fusion
- Authors: Zhiwei Zhong, Xianming Liu, Junjun Jiang, Debin Zhao, Zhiwen Chen and
Xiangyang Ji
- Abstract summary: We propose a novel attention-based hierarchical multi-modal fusion network for guided DSR.
We show that our approach outperforms state-of-the-art methods in terms of reconstruction accuracy, running speed and memory efficiency.
- Score: 84.24973877109181
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: A depth map records the distance between the viewpoint and objects in
the scene, and plays a critical role in many real-world applications. However,
depth maps captured by consumer-grade RGB-D cameras suffer from low spatial
resolution. Guided depth map super-resolution (DSR) is a popular approach to
this problem: it attempts to restore a high-resolution (HR) depth map from the
input low-resolution (LR) depth map and its coupled HR RGB image, which serves
as the guidance.
The most challenging problems in guided DSR are how to correctly select and
propagate consistent structures, and how to properly handle inconsistent ones.
In this paper, we propose a novel attention-based hierarchical multi-modal
fusion (AHMF) network for guided DSR. Specifically, to effectively extract and
combine relevant information from the LR depth and the HR guidance, we propose
a multi-modal attention-based fusion (MMAF) strategy for hierarchical
convolutional layers, including a feature enhancement block to select valuable
features and a feature recalibration block to unify the similarity metrics of
modalities with different appearance characteristics. Furthermore, we propose a
bi-directional hierarchical feature collaboration (BHFC) module to fully
leverage low-level spatial information and high-level structure information
among multi-scale features. Experimental results show that our approach
outperforms state-of-the-art methods in terms of reconstruction accuracy,
running speed and memory efficiency.
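
The paper's implementation is not included in this summary, but the data flow of the two proposed modules can be illustrated. Below is a minimal PyTorch sketch of an attention-based fusion block and a bi-directional two-scale collaboration step in the spirit of MMAF and BHFC; all class names, layer choices, and channel sizes are illustrative assumptions, not the authors' code.

```python
# Hedged sketch of MMAF-style fusion and BHFC-style cross-scale exchange.
# Everything below (module internals, widths) is assumed for illustration.
import torch
import torch.nn as nn

class FeatureRecalibration(nn.Module):
    """Channel gating so depth and guidance features share comparable
    statistics before being compared (squeeze-and-excite style assumption)."""
    def __init__(self, channels):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1), nn.Sigmoid())

    def forward(self, x):
        return x * self.fc(self.pool(x))

class MMAFBlock(nn.Module):
    """Fuse depth and guidance features at one hierarchy level."""
    def __init__(self, channels):
        super().__init__()
        self.recal_d = FeatureRecalibration(channels)
        self.recal_g = FeatureRecalibration(channels)
        # "feature enhancement" step: a spatial mask that keeps guidance
        # structures consistent with the depth features
        self.attn = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 3, padding=1), nn.Sigmoid())
        self.fuse = nn.Conv2d(2 * channels, channels, 3, padding=1)

    def forward(self, feat_depth, feat_guide):
        d, g = self.recal_d(feat_depth), self.recal_g(feat_guide)
        mask = self.attn(torch.cat([d, g], dim=1))       # where to trust guidance
        return self.fuse(torch.cat([d, g * mask], dim=1))

class BHFCBlock(nn.Module):
    """Bi-directional exchange between two adjacent scales."""
    def __init__(self, channels):
        super().__init__()
        self.down = nn.Conv2d(channels, channels, 3, stride=2, padding=1)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.mix_fine = nn.Conv2d(2 * channels, channels, 1)
        self.mix_coarse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, fine, coarse):
        # low-level spatial detail flows down, high-level structure flows up
        fine_out = self.mix_fine(torch.cat([fine, self.up(coarse)], dim=1))
        coarse_out = self.mix_coarse(torch.cat([coarse, self.down(fine)], dim=1))
        return fine_out, coarse_out

# Toy usage with 64-channel features at two scales.
fused = MMAFBlock(64)(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
fine, coarse = BHFCBlock(64)(fused, torch.randn(1, 64, 16, 16))
print(fine.shape, coarse.shape)  # (1, 64, 32, 32) and (1, 64, 16, 16)
```

In the full network these blocks would presumably be applied at every encoder level; the sketch only fixes the overall data flow.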
Related papers
- PGNeXt: High-Resolution Salient Object Detection via Pyramid Grafting Network [24.54269823691119]
We present an advanced study on more challenging high-resolution salient object detection (HRSOD) from both dataset and network framework perspectives.
To compensate for the lack of HRSOD datasets, we carefully collect a large-scale high-resolution salient object detection dataset, called UHRSD.
All images are finely annotated at the pixel level, far exceeding previous low-resolution SOD datasets.
arXiv Detail & Related papers (2024-08-02T09:31:21Z)
- Learning Hierarchical Color Guidance for Depth Map Super-Resolution [168.1463802622881]
We propose a hierarchical color guidance network to achieve depth map super-resolution (DSR).
On the one hand, the low-level detail embedding module is designed to supplement high-frequency color information of depth features.
On the other hand, the high-level abstract guidance module is proposed to maintain semantic consistency in the reconstruction process.
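A hedged sketch of the two guidance roles described above, assuming a simple realization: the low-level branch injects a high-pass residual of the color features, and the high-level branch gates depth features with global color context. All names and shapes are illustrative, not the paper's code.

```python
# Illustrative low-level detail embedding and high-level abstract guidance.
import torch
import torch.nn as nn
import torch.nn.functional as F

def high_frequency(x, ksize=5):
    """High-pass residual: feature minus its local average."""
    return x - F.avg_pool2d(x, ksize, stride=1, padding=ksize // 2)

class LowLevelDetailEmbedding(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.proj = nn.Conv2d(c, c, 3, padding=1)

    def forward(self, depth_feat, color_feat):
        # supplement depth features with high-frequency color information
        return depth_feat + self.proj(high_frequency(color_feat))

class HighLevelAbstractGuidance(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(c, c, 1), nn.Sigmoid())

    def forward(self, depth_feat, color_feat):
        # global color context keeps the reconstruction semantically consistent
        return depth_feat * self.gate(color_feat)

d, c_feat = torch.randn(1, 32, 64, 64), torch.randn(1, 32, 64, 64)
out = HighLevelAbstractGuidance(32)(LowLevelDetailEmbedding(32)(d, c_feat), c_feat)
```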
arXiv Detail & Related papers (2024-03-12T03:44:46Z)
- DSR-Diff: Depth Map Super-Resolution with Diffusion Model [38.68563026759223]
We present a novel color-guided depth super-resolution (CDSR) paradigm that utilizes a diffusion model within the latent space to generate guidance for depth map super-resolution.
Our proposed method has shown superior performance in extensive experiments when compared to state-of-the-art methods.
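The summary above only states that guidance is generated by a diffusion model in a latent space; a very rough sketch of that idea follows, with a toy denoiser and a crude update rule standing in for a proper DDPM/DDIM sampler. Nothing below is DSR-Diff's actual model.

```python
# Toy latent-diffusion-style guidance generation (stand-in, not DSR-Diff).
import torch
import torch.nn as nn

class ToyDenoiser(nn.Module):
    def __init__(self, c=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2 * c + 1, c, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(c, c, 3, padding=1))

    def forward(self, z, cond, t):
        t_map = torch.full_like(z[:, :1], t)   # timestep injected as a channel
        return self.net(torch.cat([z, cond, t_map], dim=1))  # predicted noise

@torch.no_grad()
def sample_guidance(denoiser, cond, steps=10):
    z = torch.randn_like(cond)                 # start from pure noise
    for t in reversed(range(steps)):
        eps = denoiser(z, cond, t / steps)
        z = z - eps / steps                    # crude denoising update
    return z                                   # latent guidance for the SR net

cond = torch.randn(1, 64, 16, 16)  # assumed latent of the LR depth + RGB pair
guidance = sample_guidance(ToyDenoiser(), cond)
```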
arXiv Detail & Related papers (2023-11-16T14:18:10Z)
- Multi-Depth Branch Network for Efficient Image Super-Resolution [12.042706918188566]
A longstanding challenge in Super-Resolution (SR) is how to efficiently enhance high-frequency details in Low-Resolution (LR) images.
We propose an asymmetric SR architecture featuring a Multi-Depth Branch Module (MDBM).
MDBMs contain branches of different depths, designed to capture high- and low-frequency information simultaneously and efficiently.
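A minimal sketch of the branch idea, assuming one shallow branch for high-frequency detail and one deeper branch for low-frequency context, fused residually; depths and widths are illustrative, not the paper's configuration.

```python
# Illustrative multi-depth branch module: parallel branches of different depth.
import torch
import torch.nn as nn

def conv_stack(c, depth):
    layers = []
    for _ in range(depth):
        layers += [nn.Conv2d(c, c, 3, padding=1), nn.ReLU(inplace=True)]
    return nn.Sequential(*layers)

class MDBM(nn.Module):
    def __init__(self, c=64):
        super().__init__()
        self.shallow = conv_stack(c, depth=1)  # small receptive field: fine detail
        self.deep = conv_stack(c, depth=4)     # large receptive field: structure
        self.fuse = nn.Conv2d(2 * c, c, 1)

    def forward(self, x):
        return x + self.fuse(torch.cat([self.shallow(x), self.deep(x)], dim=1))

y = MDBM()(torch.randn(1, 64, 48, 48))  # same shape in, same shape out
```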
arXiv Detail & Related papers (2023-09-29T15:46:25Z)
- Symmetric Uncertainty-Aware Feature Transmission for Depth Super-Resolution [52.582632746409665]
We propose a novel Symmetric Uncertainty-aware Feature Transmission (SUFT) for color-guided DSR.
Our method achieves superior performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-06-01T06:35:59Z)
- Memory-augmented Deep Unfolding Network for Guided Image Super-resolution [67.83489239124557]
Guided image super-resolution (GISR) aims to obtain a high-resolution (HR) target image by enhancing the spatial resolution of a low-resolution (LR) target image under the guidance of an HR image.
Previous model-based methods mainly take the entire image as a whole and assume a prior distribution between the HR target image and the HR guidance image.
We propose a maximum a posteriori (MAP) estimation model for GISR with two types of prior on the HR target image.
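To make the MAP formulation concrete, here is a hedged sketch of one such objective and a single unfolded gradient step; the degradation operator and both prior terms are toy placeholders for the learned priors the summary mentions, and the guidance is assumed single-channel (e.g., a grayscale of the HR RGB image).

```python
# Toy MAP energy for guided SR: data term + smoothness prior + cross-modal prior.
import torch
import torch.nn.functional as F

def downsample(x, s=4):
    return F.avg_pool2d(x, s)  # stand-in for the degradation operator D

def map_energy(x, y_lr, guide, lam1=0.1, lam2=0.1):
    data = F.mse_loss(downsample(x), y_lr, reduction="sum")  # ||Dx - y||^2
    prior_smooth = x.diff(dim=-1).abs().sum() + x.diff(dim=-2).abs().sum()
    # cross-modal prior: HR target edges should follow guidance edges
    prior_guide = ((x.diff(dim=-1) - guide.diff(dim=-1)) ** 2).sum()
    return data + lam1 * prior_smooth + lam2 * prior_guide

# One unfolded iteration = one gradient step on the MAP energy; a deep
# unfolding network would learn the step size and the prior terms.
x = torch.zeros(1, 1, 64, 64, requires_grad=True)
y_lr, guide = torch.rand(1, 1, 16, 16), torch.rand(1, 1, 64, 64)
map_energy(x, y_lr, guide).backward()
with torch.no_grad():
    x -= 0.01 * x.grad
```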
arXiv Detail & Related papers (2022-02-12T15:37:13Z)
- BridgeNet: A Joint Learning Network of Depth Map Super-Resolution and Monocular Depth Estimation [60.34562823470874]
We propose a joint learning network of depth map super-resolution (DSR) and monocular depth estimation (MDE) without introducing additional supervision labels.
One is the high-frequency attention bridge (HABdg) designed for the feature encoding process, which learns the high-frequency information of the MDE task to guide the DSR task.
The other is the content guidance bridge (CGBdg) designed for the depth map reconstruction process, which provides the content guidance learned from the DSR task to the MDE task.
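An illustrative sketch of the two bridges, assuming the simplest possible realizations: an attention map computed from MDE features that re-weights DSR features, and a projected DSR feature added to the MDE branch. Internals are assumptions, not BridgeNet's code.

```python
# Toy bridges between a DSR branch and an MDE branch.
import torch
import torch.nn as nn

class HighFreqAttentionBridge(nn.Module):
    """MDE -> DSR: attention over high-frequency regions (HABdg-like)."""
    def __init__(self, c):
        super().__init__()
        self.attn = nn.Sequential(nn.Conv2d(c, 1, 3, padding=1), nn.Sigmoid())

    def forward(self, dsr_feat, mde_feat):
        return dsr_feat + dsr_feat * self.attn(mde_feat)

class ContentGuidanceBridge(nn.Module):
    """DSR -> MDE: reconstruction content guides depth estimation (CGBdg-like)."""
    def __init__(self, c):
        super().__init__()
        self.proj = nn.Conv2d(c, c, 3, padding=1)

    def forward(self, mde_feat, dsr_feat):
        return mde_feat + self.proj(dsr_feat)

dsr, mde = torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32)
dsr = HighFreqAttentionBridge(64)(dsr, mde)
mde = ContentGuidanceBridge(64)(mde, dsr)
```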
arXiv Detail & Related papers (2021-07-27T01:28:23Z)
- Progressive Multi-scale Fusion Network for RGB-D Salient Object Detection [9.099589602551575]
We discuss the advantages of the so-called progressive multi-scale fusion method and propose a mask-guided feature aggregation module.
The proposed framework can effectively combine the two features of different modalities and alleviate the impact of erroneous depth features.
We further introduce a mask-guided refinement module (MGRM) to complement the high-level semantic features and reduce the irrelevant features from multi-scale fusion.
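A small sketch of what mask-guided aggregation could look like under simple assumptions: a saliency-like mask predicted from the RGB features gates the depth features, so unreliable depth regions are down-weighted before fusion. Layer choices are illustrative.

```python
# Toy mask-guided aggregation of RGB and depth features.
import torch
import torch.nn as nn

class MaskGuidedAggregation(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.mask_head = nn.Sequential(nn.Conv2d(c, 1, 3, padding=1), nn.Sigmoid())
        self.fuse = nn.Conv2d(2 * c, c, 3, padding=1)

    def forward(self, rgb_feat, depth_feat):
        m = self.mask_head(rgb_feat)   # coarse saliency mask from RGB
        gated = depth_feat * m         # suppress erroneous depth responses
        return self.fuse(torch.cat([rgb_feat, gated], dim=1))

out = MaskGuidedAggregation(64)(torch.randn(1, 64, 40, 40), torch.randn(1, 64, 40, 40))
```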
arXiv Detail & Related papers (2021-06-07T20:02:39Z)
- Hierarchical Deep CNN Feature Set-Based Representation Learning for Robust Cross-Resolution Face Recognition [59.29808528182607]
Cross-resolution face recognition (CRFR) is important in intelligent surveillance and biometric forensics.
Existing shallow learning-based and deep learning-based methods focus on mapping the HR-LR face pairs into a joint feature space.
In this study, we aim to fully exploit the multi-level deep convolutional neural network (CNN) feature set for robust CRFR.
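A minimal sketch of assembling such a multi-level feature set, assuming a torchvision ResNet-18 backbone with one pooled descriptor per stage; the backbone and stage choice are illustrative, not the paper's setup.

```python
# Collect pooled activations from several CNN stages into one descriptor.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18

backbone = resnet18(weights=None)
stem = nn.Sequential(backbone.conv1, backbone.bn1, backbone.relu, backbone.maxpool)
stages = [backbone.layer1, backbone.layer2, backbone.layer3, backbone.layer4]

def multi_level_features(img):
    feats, x = [], stem(img)
    for stage in stages:
        x = stage(x)
        feats.append(torch.flatten(F.adaptive_avg_pool2d(x, 1), 1))
    return torch.cat(feats, dim=1)  # hierarchical descriptor for matching faces

desc = multi_level_features(torch.randn(1, 3, 112, 112))
print(desc.shape)  # torch.Size([1, 960]) = 64 + 128 + 256 + 512
```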
arXiv Detail & Related papers (2021-03-25T14:03:42Z)
- Multi-Scale Progressive Fusion Learning for Depth Map Super-Resolution [11.072332820377612]
The resolution of a depth map collected by a depth camera is often lower than that of the image from its associated RGB camera.
A major problem in depth map super-resolution is that it tends to produce obvious jagged edges and excessive loss of detail.
We propose a multi-scale progressive fusion network for depth map SR, which possesses a structure to integrate hierarchical features from different domains.
arXiv Detail & Related papers (2020-11-24T03:03:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.