The Devil is in the Details: Boosting Guided Depth Super-Resolution via
Rethinking Cross-Modal Alignment and Aggregation
- URL: http://arxiv.org/abs/2401.08123v1
- Date: Tue, 16 Jan 2024 05:37:08 GMT
- Title: The Devil is in the Details: Boosting Guided Depth Super-Resolution via
Rethinking Cross-Modal Alignment and Aggregation
- Authors: Xinni Jiang, Zengsheng Kuang, Chunle Guo, Ruixun Zhang, Lei Cai, Xiao
Fan, Chongyi Li
- Abstract summary: Guided depth super-resolution (GDSR) involves restoring missing depth details using the high-resolution RGB image of the same scene.
Previous approaches have struggled with the heterogeneity and complementarity of the multi-modal inputs, and neglected the issues of modal misalignment, geometrical misalignment, and feature selection.
- Score: 41.12790340577986
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Guided depth super-resolution (GDSR) involves restoring missing depth details
using the high-resolution RGB image of the same scene. Previous approaches have
struggled with the heterogeneity and complementarity of the multi-modal inputs,
and neglected the issues of modal misalignment, geometrical misalignment, and
feature selection. In this study, we rethink some essential components in GDSR
networks and propose a simple yet effective Dynamic Dual Alignment and
Aggregation network (D2A2). D2A2 mainly consists of 1) a dynamic dual alignment
module that adaptively alleviates the modal misalignment via a learnable domain
alignment block and geometrically aligns cross-modal features by learning
offsets; and 2) a mask-to-pixel feature aggregation module that uses a gated
mechanism and pixel attention to filter out irrelevant texture noise from RGB
features and combine the useful features with depth features. By combining the
strengths of RGB and depth features while minimizing disturbance introduced by
the RGB image, our method with simple reuse and redesign of basic components
achieves state-of-the-art performance on multiple benchmark datasets. The code
is available at https://github.com/JiangXinni/D2A2.
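To make the two components concrete, the following PyTorch-style sketch illustrates the ideas described above: RGB features are first pushed toward the depth domain and warped with learned offsets, then a gate and pixel attention select which RGB cues are injected into the depth branch. Module names, channel sizes, and the use of torchvision's deform_conv2d are assumptions made for illustration, not the authors' implementation; see the linked repository for the real one.

```python
# Minimal sketch (assumptions, not the authors' code) of the two ideas in D2A2:
# 1) align RGB features to the depth domain and geometry, 2) gate out RGB
# texture noise before fusing with depth features.
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d

class DualAlignment(nn.Module):
    def __init__(self, c):
        super().__init__()
        # "domain alignment": a learnable reprojection of RGB features
        self.domain = nn.Sequential(nn.Conv2d(c, c, 1), nn.ReLU(inplace=True),
                                    nn.Conv2d(c, c, 1))
        # "geometric alignment": predict per-pixel offsets from both modalities
        self.offset = nn.Conv2d(2 * c, 2 * 3 * 3, 3, padding=1)  # 2 coords per 3x3 tap
        self.weight = nn.Parameter(torch.randn(c, c, 3, 3) * 0.01)

    def forward(self, rgb_feat, depth_feat):
        rgb_feat = self.domain(rgb_feat)
        offsets = self.offset(torch.cat([rgb_feat, depth_feat], dim=1))
        # deformable sampling warps RGB features toward the depth geometry
        return deform_conv2d(rgb_feat, offsets, self.weight, padding=1)

class GatedAggregation(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(2 * c, c, 3, padding=1), nn.Sigmoid())
        self.pixel_attn = nn.Sequential(nn.Conv2d(c, c, 1), nn.Sigmoid())

    def forward(self, aligned_rgb, depth_feat):
        # the gate suppresses RGB texture that has no depth counterpart
        g = self.gate(torch.cat([aligned_rgb, depth_feat], dim=1))
        selected = self.pixel_attn(aligned_rgb * g) * (aligned_rgb * g)
        return depth_feat + selected  # inject only the useful RGB cues

if __name__ == "__main__":
    rgb, dep = torch.randn(1, 32, 64, 64), torch.randn(1, 32, 64, 64)
    fused = GatedAggregation(32)(DualAlignment(32)(rgb, dep), dep)
    print(fused.shape)  # torch.Size([1, 32, 64, 64])
```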
Related papers
- Symmetric Uncertainty-Aware Feature Transmission for Depth
Super-Resolution [52.582632746409665]
We propose a novel Symmetric Uncertainty-aware Feature Transmission (SUFT) for color-guided DSR.
Our method achieves superior performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-06-01T06:35:59Z)
- DCANet: Differential Convolution Attention Network for RGB-D Semantic
Segmentation [2.2032272277334375]
We propose a pixel differential convolution attention (DCA) module to consider geometric information and local-range correlations for depth data.
We extend DCA to ensemble differential convolution attention (EDCA) which propagates long-range contextual dependencies.
A two-branch network built with DCA and EDCA, called Differential Convolutional Network (DCANet), is proposed to fuse local and global information of two-modal data.
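As a rough illustration of the differential-convolution idea (a sketch under assumptions, not DCANet's released code), the snippet below aggregates neighbour-minus-centre differences so the response follows local depth geometry rather than absolute values, and turns that response into an attention map:

```python
# Hypothetical sketch of a differential convolution attention block: the
# centre-pixel contribution is subtracted out, leaving only local differences.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DiffConvAttention(nn.Module):
    def __init__(self, channels, kernel_size=3):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size,
                              padding=kernel_size // 2, bias=False)

    def forward(self, depth_feat):
        # standard convolution of the depth features
        out = self.conv(depth_feat)
        # subtract the centre pixel weighted by the kernel sum, so only
        # neighbour-minus-centre differences remain (central-difference style)
        kernel_sum = self.conv.weight.sum(dim=(2, 3), keepdim=True)
        center = F.conv2d(depth_feat, kernel_sum)
        diff = out - center
        # use the differential response as a per-pixel attention map
        return depth_feat * torch.sigmoid(diff)

x = torch.randn(1, 16, 32, 32)
print(DiffConvAttention(16)(x).shape)  # torch.Size([1, 16, 32, 32])
```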
arXiv Detail & Related papers (2022-10-13T05:17:34Z)
- Depth-Adapted CNNs for RGB-D Semantic Segmentation [2.341385717236931]
We propose a novel framework to incorporate the depth information in the RGB convolutional neural network (CNN)
Specifically, our Z-ACN generates a 2D depth-adapted offset which is fully constrained by low-level features to guide the feature extraction on RGB images.
With the generated offset, we introduce two intuitive and effective operations to replace basic CNN operators.
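A minimal sketch of this idea follows, with the caveat that it uses a small learned convolution as a stand-in for the offsets, whereas Z-ACN constrains them with low-level depth geometry; names and shapes are illustrative assumptions.

```python
# Hypothetical sketch: derive a 2D offset field from the depth map and use it to
# deform the sampling grid of a convolution applied to RGB features.
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d

class DepthAdaptedConv(nn.Module):
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.k = k
        # offsets come from depth only, so the RGB branch's sampling pattern
        # follows scene geometry instead of being learned freely
        self.offset_from_depth = nn.Conv2d(1, 2 * k * k, k, padding=k // 2)
        self.weight = nn.Parameter(torch.empty(out_ch, in_ch, k, k))
        nn.init.kaiming_uniform_(self.weight, a=5 ** 0.5)

    def forward(self, rgb_feat, depth):
        offsets = self.offset_from_depth(depth)
        return deform_conv2d(rgb_feat, offsets, self.weight, padding=self.k // 2)

rgb_feat, depth = torch.randn(1, 8, 48, 48), torch.randn(1, 1, 48, 48)
print(DepthAdaptedConv(8, 16)(rgb_feat, depth).shape)  # torch.Size([1, 16, 48, 48])
```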
arXiv Detail & Related papers (2022-06-08T14:59:40Z)
- Dual Swin-Transformer based Mutual Interactive Network for RGB-D Salient
Object Detection [67.33924278729903]
In this work, we propose a Dual Swin-Transformer based Mutual Interactive Network (DTMINet).
We adopt Swin-Transformer as the feature extractor for both RGB and depth modality to model the long-range dependencies in visual inputs.
Comprehensive experiments on five standard RGB-D SOD benchmark datasets demonstrate the superiority of the proposed DTMINet method.
arXiv Detail & Related papers (2022-06-07T08:35:41Z)
- Cross-modality Discrepant Interaction Network for RGB-D Salient Object
Detection [78.47767202232298]
We propose a novel Cross-modality Discrepant Interaction Network (CDINet) for RGB-D SOD.
Two components are designed to implement the effective cross-modality interaction.
Our network outperforms 15 state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2021-08-04T11:24:42Z)
- Discrete Cosine Transform Network for Guided Depth Map Super-Resolution [19.86463937632802]
The goal is to use high-resolution (HR) RGB images to provide extra information on edges and object contours, so that low-resolution depth maps can be upsampled to HR ones.
We propose an advanced Discrete Cosine Transform Network (DCTNet), which is composed of four components.
We show that our method can generate accurate and HR depth maps, surpassing state-of-the-art methods.
arXiv Detail & Related papers (2021-04-14T17:01:03Z)
- High-resolution Depth Maps Imaging via Attention-based Hierarchical
Multi-modal Fusion [84.24973877109181]
We propose a novel attention-based hierarchical multi-modal fusion network for guided DSR.
We show that our approach outperforms state-of-the-art methods in terms of reconstruction accuracy, running speed and memory efficiency.
arXiv Detail & Related papers (2021-04-04T03:28:33Z)
- Bi-directional Cross-Modality Feature Propagation with
Separation-and-Aggregation Gate for RGB-D Semantic Segmentation [59.94819184452694]
Depth information has proven to be a useful cue in the semantic segmentation of RGBD images for providing a geometric counterpart to the RGB representation.
Most existing works simply assume that depth measurements are accurate and well-aligned with the RGB pixels and model the problem as a cross-modal feature fusion.
In this paper, we propose a unified and efficient Cross-modality Guided Encoder to not only effectively recalibrate RGB feature responses, but also to distill accurate depth information via multiple stages and aggregate the two recalibrated representations alternately.
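A hedged sketch of such a gated recalibrate-then-aggregate step (names and shapes are illustrative assumptions, not the paper's SA-Gate implementation) might look like this:

```python
# Hypothetical sketch of a separation-and-aggregation style gate: recalibrate
# each modality with cues from the other, then mix them with per-pixel gates.
import torch
import torch.nn as nn

class SeparationAggregationGate(nn.Module):
    def __init__(self, c):
        super().__init__()
        # "separation": cross-modal recalibration of each stream
        self.recal_rgb = nn.Sequential(nn.Conv2d(2 * c, c, 1), nn.Sigmoid())
        self.recal_dep = nn.Sequential(nn.Conv2d(2 * c, c, 1), nn.Sigmoid())
        # "aggregation": predict two spatial gates that sum to one per pixel
        self.gate = nn.Conv2d(2 * c, 2, 3, padding=1)

    def forward(self, rgb, dep):
        x = torch.cat([rgb, dep], dim=1)
        rgb = rgb * self.recal_rgb(x)           # RGB responses recalibrated by depth
        dep = dep * self.recal_dep(x)           # noisy depth distilled using RGB
        g = torch.softmax(self.gate(torch.cat([rgb, dep], dim=1)), dim=1)
        return rgb * g[:, :1] + dep * g[:, 1:]  # per-pixel weighted aggregation

rgb, dep = torch.randn(1, 24, 30, 40), torch.randn(1, 24, 30, 40)
print(SeparationAggregationGate(24)(rgb, dep).shape)  # torch.Size([1, 24, 30, 40])
```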
arXiv Detail & Related papers (2020-07-17T18:35:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.