Scene Prior Filtering for Depth Map Super-Resolution
- URL: http://arxiv.org/abs/2402.13876v2
- Date: Fri, 23 Feb 2024 08:31:27 GMT
- Title: Scene Prior Filtering for Depth Map Super-Resolution
- Authors: Zhengxue Wang, Zhiqiang Yan, Ming-Hsuan Yang, Jinshan Pan, Jian Yang, Ying Tai, and Guangwei Gao
- Abstract summary: We introduce a Scene Prior Filtering network, SPFNet, to mitigate texture interference and edge inaccuracy.
Our SPFNet has been extensively evaluated on both real and synthetic datasets, achieving state-of-the-art performance.
- Score: 102.18062150182644
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-modal fusion is vital to the success of super-resolution of depth maps.
However, commonly used fusion strategies, such as addition and concatenation,
fall short of effectively bridging the modal gap. As a result, guided image
filtering methods have been introduced to mitigate this issue. Nevertheless, it
is observed that their filter kernels usually encounter significant texture
interference and edge inaccuracy. To tackle these two challenges, we introduce
a Scene Prior Filtering network, SPFNet, which utilizes surface normal and
semantic priors extracted from large-scale models. Specifically, we design an
All-in-one Prior Propagation that computes the similarity between multi-modal
scene priors, i.e., RGB, normal, semantic, and depth, to reduce the texture
interference. In addition, we present a One-to-one Prior Embedding that
continuously embeds each single-modal prior into depth using Mutual Guided
Filtering, further alleviating the texture interference while enhancing edges.
Our SPFNet has been extensively evaluated on both real and synthetic datasets,
achieving state-of-the-art performance.
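As background for the guided-filtering family the abstract builds on, here is a minimal sketch of a classic guided filter (He et al.) applied to a 1D depth row with an intensity row as guidance. This is illustrative background only, not SPFNet's learned Mutual Guided Filtering; the 1D pure-Python setting and all names are assumptions for the sketch.

```python
def box(x, r):
    """Mean filter with radius r (window 2r+1), clamped at the borders."""
    n = len(x)
    out = []
    for i in range(n):
        lo, hi = max(0, i - r), min(n, i + r + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out

def guided_filter_1d(guide, depth, r=2, eps=1e-3):
    """Classic guided filter: fit q = a*I + b per local window, then average."""
    mean_i = box(guide, r)
    mean_p = box(depth, r)
    corr_ip = box([g * p for g, p in zip(guide, depth)], r)
    var_i = [gg - mi * mi for gg, mi in zip(box([g * g for g in guide], r), mean_i)]
    cov_ip = [c - mi * mp for c, mi, mp in zip(corr_ip, mean_i, mean_p)]
    a = [c / (v + eps) for c, v in zip(cov_ip, var_i)]
    b = [mp - ai * mi for mp, ai, mi in zip(mean_p, a, mean_i)]
    # Average the per-window coefficients, then form the filtered output.
    mean_a, mean_b = box(a, r), box(b, r)
    return [ma * g + mb for ma, g, mb in zip(mean_a, guide, mean_b)]
```

Edges in `guide` steer where `depth` is smoothed versus preserved, which is also why such kernels suffer the texture interference the abstract targets: texture present in the guidance but absent in depth leaks into the filtered result.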
Related papers
- The Devil is in the Edges: Monocular Depth Estimation with Edge-aware Consistency Fusion [30.03608191629917]
This paper presents a novel monocular depth estimation method, named ECFNet, for estimating high-quality monocular depth with clear edges and valid overall structure from a single RGB image.
We thoroughly investigate the key factors that affect edge depth estimation in MDE networks and conclude that edge information itself plays a critical role in predicting depth details.
arXiv Detail & Related papers (2024-03-30T13:58:19Z) - Bilateral Propagation Network for Depth Completion [41.163328523175466]
Depth completion aims to derive a dense depth map from sparse depth measurements with a synchronized color image.
Current state-of-the-art (SOTA) methods are predominantly propagation-based, which work as an iterative refinement on the initial estimated dense depth.
We present a Bilateral Propagation Network (BP-Net), that propagates depth at the earliest stage to avoid directly convolving on sparse data.
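To illustrate the general idea of color-guided depth propagation (not BP-Net's learned, multi-stage scheme), here is a hedged 1D sketch that densifies sparse depth by averaging known samples with bilateral weights, i.e., spatial closeness times color similarity. All names and parameters are illustrative assumptions.

```python
import math

def bilateral_fill_1d(color, sparse_depth, sigma_s=2.0, sigma_c=0.1):
    """Densify sparse depth: each pixel averages the known depth samples,
    weighted by spatial distance and color similarity to that sample."""
    known = [(j, d) for j, d in enumerate(sparse_depth) if d is not None]
    dense = []
    for i, ci in enumerate(color):
        wsum = vsum = 0.0
        for j, dj in known:
            w = math.exp(-((i - j) ** 2) / (2 * sigma_s ** 2)) \
              * math.exp(-((ci - color[j]) ** 2) / (2 * sigma_c ** 2))
            wsum += w
            vsum += w * dj
        dense.append(vsum / wsum if wsum > 0 else 0.0)
    return dense
```

Because the color term suppresses weights across intensity edges, depth does not bleed between regions of different color, which is the behaviour propagation-based completion methods exploit.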
arXiv Detail & Related papers (2024-03-17T16:48:46Z) - SRFNet: Monocular Depth Estimation with Fine-grained Structure via Spatial Reliability-oriented Fusion of Frames and Events [5.800516204046145]
Traditional frame-based methods suffer from performance drops due to limited dynamic range and motion blur.
Recent works leverage novel event cameras to complement or guide the frame modality via frame-event feature fusion.
SRFNet can estimate depth with fine-grained structure at both daytime and nighttime.
arXiv Detail & Related papers (2023-09-22T12:59:39Z) - Unpaired Overwater Image Defogging Using Prior Map Guided CycleGAN [60.257791714663725]
We propose a Prior map Guided CycleGAN (PG-CycleGAN) for defogging of images with overwater scenes.
The proposed method outperforms the state-of-the-art supervised, semi-supervised, and unsupervised defogging approaches.
arXiv Detail & Related papers (2022-12-23T03:00:28Z) - MISF: Multi-level Interactive Siamese Filtering for High-Fidelity Image Inpainting [35.79101039727397]
We study the advantages and challenges of image-level predictive filtering for image inpainting.
We propose a novel filtering technique, i.e., Multi-level Interactive Siamese Filtering (MISF), which contains two branches: a kernel prediction branch (KPB) and a semantic & image filtering branch (SIFB).
Our method outperforms state-of-the-art baselines on four metrics, i.e., L1, PSNR, SSIM, and LPIPS.
arXiv Detail & Related papers (2022-03-12T01:32:39Z) - Unsharp Mask Guided Filtering [53.14430987860308]
The goal of this paper is guided image filtering, which emphasizes the importance of structure transfer during filtering.
We propose a new and simplified formulation of the guided filter inspired by unsharp masking.
Our formulation enjoys a filtering prior from a low-pass filter and enables explicit structure transfer by estimating a single coefficient.
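A hedged sketch of the unsharp-masking view of guided filtering described above: low-pass the target, then add back the guidance's high-frequency residual scaled by one coefficient. The fixed scalar `beta` here is an illustrative stand-in for the coefficient that paper estimates; the 1D setting and names are assumptions.

```python
def box(x, r):
    """Mean filter with radius r, clamped at the borders."""
    return [sum(x[max(0, i - r):i + r + 1])
            / (min(len(x), i + r + 1) - max(0, i - r))
            for i in range(len(x))]

def unsharp_guided_filter(guide, target, beta, r=2):
    """q = lowpass(target) + beta * (guide - lowpass(guide)):
    explicit structure transfer via the guidance's high-frequency residual."""
    lp_t, lp_g = box(target, r), box(guide, r)
    return [t + beta * (g - lg) for t, g, lg in zip(lp_t, guide, lp_g)]
```

With `beta = 0` this reduces to plain smoothing; a larger `beta` transfers more of the guidance's edge structure into the output.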
arXiv Detail & Related papers (2021-06-02T19:15:34Z) - Boundary-induced and scene-aggregated network for monocular depth prediction [20.358133522462513]
We propose the Boundary-induced and Scene-aggregated network (BS-Net) to predict the dense depth of a single RGB image.
Experimental results on the NYUD v2 and iBims-1 datasets illustrate the state-of-the-art performance of the proposed approach.
arXiv Detail & Related papers (2021-02-26T01:43:17Z) - NeuralFusion: Online Depth Fusion in Latent Space [77.59420353185355]
We present a novel online depth map fusion approach that learns depth map aggregation in a latent feature space.
Our approach is real-time capable, handles high noise levels, and is particularly able to deal with gross outliers common for photometric stereo-based depth maps.
arXiv Detail & Related papers (2020-11-30T13:50:59Z) - A Single Stream Network for Robust and Real-time RGB-D Salient Object Detection [89.88222217065858]
We design a single stream network to use the depth map to guide early fusion and middle fusion between RGB and depth.
This model is 55.5% lighter than the current lightest model and runs at a real-time speed of 32 FPS when processing a $384 \times 384$ image.
arXiv Detail & Related papers (2020-07-14T04:40:14Z) - Depth Completion Using a View-constrained Deep Prior [73.21559000917554]
Recent work has shown that the structure of convolutional neural networks (CNNs) induces a strong prior that favors natural images.
This prior, known as a deep image prior (DIP), is an effective regularizer in inverse problems such as image denoising and inpainting.
We extend the concept of the DIP to depth images. Given color images and noisy and incomplete target depth maps, we reconstruct a depth map restored by virtue of using the CNN network structure as a prior.
arXiv Detail & Related papers (2020-01-21T21:56:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.