Spherical Space Feature Decomposition for Guided Depth Map
Super-Resolution
- URL: http://arxiv.org/abs/2303.08942v2
- Date: Wed, 23 Aug 2023 04:12:26 GMT
- Title: Spherical Space Feature Decomposition for Guided Depth Map
Super-Resolution
- Authors: Zixiang Zhao, Jiangshe Zhang, Xiang Gu, Chengli Tan, Shuang Xu, Yulun
Zhang, Radu Timofte, Luc Van Gool
- Abstract summary: Guided depth map super-resolution (GDSR) aims to upsample low-resolution (LR) depth maps with additional information involved in high-resolution (HR) RGB images from the same scene.
In this paper, we propose the Spherical Space feature Decomposition Network (SSDNet) to solve the above issues.
Our method can achieve state-of-the-art results on four test datasets, as well as successfully generalize to real-world scenes.
- Score: 123.04455334124188
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Guided depth map super-resolution (GDSR), as a hot topic in multi-modal image
processing, aims to upsample low-resolution (LR) depth maps with additional
information involved in high-resolution (HR) RGB images from the same scene.
The critical step of this task is to effectively extract domain-shared and
domain-private RGB/depth features. In addition, three detailed issues, namely
blurry edges, noisy surfaces, and over-transferred RGB texture, need to be
addressed. In this paper, we propose the Spherical Space feature Decomposition
Network (SSDNet) to solve the above issues. To better model cross-modality
features, Restormer block-based RGB/depth encoders are employed for extracting
local-global features. Then, the extracted features are mapped to the spherical
space to complete the separation of private features and the alignment of
shared features. Shared features of RGB are fused with the depth features to
complete the GDSR task. Subsequently, a spherical contrast refinement (SCR)
module is proposed to further address the detail issues. Patches that are
classified according to imperfect categories are input into the SCR module,
where the patch features are pulled closer to the ground truth and pushed away
from the corresponding imperfect samples in the spherical feature space via
contrastive learning. Extensive experiments demonstrate that our method can
achieve state-of-the-art results on four test datasets, as well as successfully
generalize to real-world scenes. The code is available at
https://github.com/Zhaozixiang1228/GDSR-SSDNet.
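The two mechanisms outlined in the abstract, mapping features onto a spherical space and refining patches with contrastive learning on that sphere, can be illustrated with a short, hedged example. The snippet below is only a plausible sketch, not SSDNet's released code: it assumes the spherical mapping is an L2 normalization onto the unit hypersphere, assumes a channel-wise shared/private split, and uses an InfoNCE-style loss for the pull-toward-ground-truth / push-away-from-imperfect-samples behavior. The names `to_sphere` and `spherical_contrastive_loss` and all hyperparameters are hypothetical; see the repository linked above for the actual implementation.

```python
# Illustrative sketch only; names and design choices are assumptions, not SSDNet's code.
import torch
import torch.nn.functional as F

def to_sphere(feat: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Map per-pixel feature vectors onto the unit hypersphere (L2 normalization)."""
    return feat / (feat.norm(dim=1, keepdim=True) + eps)

def spherical_contrastive_loss(anchor, positive, negatives, tau: float = 0.07):
    """InfoNCE-style loss on the sphere: pull patch features toward the ground-truth
    feature (positive) and push them away from imperfect samples (negatives)."""
    anchor = F.normalize(anchor, dim=-1)          # (B, C)
    positive = F.normalize(positive, dim=-1)      # (B, C)
    negatives = F.normalize(negatives, dim=-1)    # (B, K, C)
    pos_sim = (anchor * positive).sum(-1, keepdim=True) / tau        # (B, 1)
    neg_sim = torch.einsum('bc,bkc->bk', anchor, negatives) / tau    # (B, K)
    logits = torch.cat([pos_sim, neg_sim], dim=1)                    # (B, 1+K)
    labels = torch.zeros(anchor.size(0), dtype=torch.long, device=anchor.device)
    return F.cross_entropy(logits, labels)

# Toy shapes: RGB/depth encoder features (B, C, H, W) are split channel-wise into
# shared and private parts after being mapped to the sphere (split ratio assumed).
rgb_feat = torch.randn(2, 64, 32, 32)
dep_feat = torch.randn(2, 64, 32, 32)
rgb_shared, rgb_private = to_sphere(rgb_feat).chunk(2, dim=1)
dep_shared, dep_private = to_sphere(dep_feat).chunk(2, dim=1)

# Alignment of shared features: maximize cosine similarity across modalities.
align_loss = 1.0 - F.cosine_similarity(rgb_shared, dep_shared, dim=1).mean()
# Separation of private features: keep them (near-)orthogonal on the sphere.
sep_loss = F.cosine_similarity(rgb_private, dep_private, dim=1).abs().mean()
```

The cosine-alignment and orthogonality terms above are one simple way to realize "alignment of shared features" and "separation of private features" on a sphere; the paper's actual objectives may differ.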
Related papers
- IGAF: Incremental Guided Attention Fusion for Depth Super-Resolution [13.04760414998408]
We propose a novel sensor fusion methodology for guided depth super-resolution (GDSR).
GDSR combines LR depth maps with HR images to estimate detailed HR depth maps.
Our model achieves state-of-the-art results compared to all baseline models on the NYU v2 dataset.
arXiv Detail & Related papers (2025-01-03T09:27:51Z)
- Guided Real Image Dehazing using YCbCr Color Space [25.771316524011382]
We propose a novel Structure Guided Dehazing Network (SGDN) that leverages the superior structural properties of YCbCr features over RGB.
For effective supervised learning, we introduce a Real-World Well-Aligned Haze dataset.
Experimental results demonstrate that our method surpasses existing state-of-the-art methods across multiple real-world smoke/haze datasets.
arXiv Detail & Related papers (2024-12-23T11:53:06Z)
- Symmetric Uncertainty-Aware Feature Transmission for Depth Super-Resolution [52.582632746409665]
We propose a novel Symmetric Uncertainty-aware Feature Transmission (SUFT) for color-guided DSR.
Our method achieves superior performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-06-01T06:35:59Z)
- Pyramidal Attention for Saliency Detection [30.554118525502115]
This paper exploits only RGB images, estimates depth from RGB, and leverages the intermediate depth features.
We employ a pyramidal attention structure to extract multi-level convolutional-transformer features to process initial stage representations.
We report significantly improved performance against 21 and 40 state-of-the-art SOD methods on eight RGB and RGB-D datasets, respectively.
arXiv Detail & Related papers (2022-04-14T06:57:46Z)
- Cross-modality Discrepant Interaction Network for RGB-D Salient Object Detection [78.47767202232298]
We propose a novel Cross-modality Discrepant Interaction Network (CDINet) for RGB-D SOD.
Two components are designed to implement the effective cross-modality interaction.
Our network outperforms 15 state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2021-08-04T11:24:42Z)
- Discrete Cosine Transform Network for Guided Depth Map Super-Resolution [19.86463937632802]
The goal is to use high-resolution (HR) RGB images to provide extra information on edges and object contours, so that low-resolution depth maps can be upsampled to HR ones (a minimal input-setup sketch appears after this list).
We propose an advanced Discrete Cosine Transform Network (DCTNet), which is composed of four components.
We show that our method can generate accurate and HR depth maps, surpassing state-of-the-art methods.
arXiv Detail & Related papers (2021-04-14T17:01:03Z)
- High-resolution Depth Maps Imaging via Attention-based Hierarchical Multi-modal Fusion [84.24973877109181]
We propose a novel attention-based hierarchical multi-modal fusion network for guided DSR.
We show that our approach outperforms state-of-the-art methods in terms of reconstruction accuracy, running speed and memory efficiency.
arXiv Detail & Related papers (2021-04-04T03:28:33Z)
- Accurate RGB-D Salient Object Detection via Collaborative Learning [101.82654054191443]
RGB-D saliency detection shows impressive ability in some challenging scenarios.
We propose a novel collaborative learning framework where edge, depth and saliency are leveraged in a more efficient way.
arXiv Detail & Related papers (2020-07-23T04:33:36Z)
- Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation [59.94819184452694]
Depth information has proven to be a useful cue in the semantic segmentation of RGBD images for providing a geometric counterpart to the RGB representation.
Most existing works simply assume that depth measurements are accurate and well-aligned with the RGB pixels and model the problem as cross-modal feature fusion.
In this paper, we propose a unified and efficient Cross-modality Guided Encoder to not only effectively recalibrate RGB feature responses, but also to distill accurate depth information via multiple stages and aggregate the two recalibrated representations alternately.
arXiv Detail & Related papers (2020-07-17T18:35:24Z)
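Several entries above, including the DCTNet one that references this sketch, describe the same guided depth super-resolution setup: a low-resolution depth map is upsampled to the resolution of a high-resolution RGB image that guides the reconstruction. The snippet below is an assumed, minimal illustration of one common input preparation (bicubic upsampling plus channel concatenation) and is not taken from any of the listed methods.

```python
# Minimal GDSR input-setup sketch (assumed convention, not from any listed paper):
# upsample the LR depth map to the RGB resolution, then concatenate it with the
# HR RGB guidance as the input to a super-resolution network.
import torch
import torch.nn.functional as F

hr_rgb = torch.rand(1, 3, 256, 256)    # HR RGB guidance image
lr_depth = torch.rand(1, 1, 64, 64)    # LR depth map (4x smaller in this toy example)

up_depth = F.interpolate(lr_depth, size=hr_rgb.shape[-2:],
                         mode='bicubic', align_corners=False)
net_input = torch.cat([hr_rgb, up_depth], dim=1)   # shape (1, 4, 256, 256)
```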