Related papers: Learning Deformable Hypothesis Sampling for Accurate PatchMatch Multi-View Stereo

Learning Deformable Hypothesis Sampling for Accurate PatchMatch Multi-View Stereo

URL: http://arxiv.org/abs/2312.15970v1
Date: Tue, 26 Dec 2023 09:36:21 GMT
Title: Learning Deformable Hypothesis Sampling for Accurate PatchMatch Multi-View Stereo
Authors: Hongjie Li, Yao Guo, Xianwei Zheng, Hanjiang Xiong
Abstract summary: This paper introduces a learnable Deformable Hypothesis Sampler (DeformSampler) to address the challenging issue of noisy depth estimation. We develop DeformSampler to learn distribution-sensitive sample spaces to propagate depths consistent with the scene's geometry. We integrate DeformSampler into a learnable PatchMatch MVS system to enhance depth estimation in challenging areas.
Score: 4.6332064055042865
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper introduces a learnable Deformable Hypothesis Sampler (DeformSampler) to address the challenging issue of noisy depth estimation for accurate PatchMatch Multi-View Stereo (MVS). We observe that the heuristic depth hypothesis sampling modes employed by PatchMatch MVS solvers are insensitive to (i) the piece-wise smooth distribution of depths across the object surface, and (ii) the implicit multi-modal distribution of depth prediction probabilities along the ray direction on the surface points. Accordingly, we develop DeformSampler to learn distribution-sensitive sample spaces to (i) propagate depths consistent with the scene's geometry across the object surface, and (ii) fit a Laplace Mixture model that approaches the point-wise probabilities distribution of the actual depths along the ray direction. We integrate DeformSampler into a learnable PatchMatch MVS system to enhance depth estimation in challenging areas, such as piece-wise discontinuous surface boundaries and weakly-textured regions. Experimental results on DTU and Tanks \& Temples datasets demonstrate its superior performance and generalization capabilities compared to state-of-the-art competitors. Code is available at https://github.com/Geo-Tell/DS-PMNet.

Related papers

DepthSplat: Connecting Gaussian Splatting and Depth [90.06180236292866]
We present DepthSplat to connect Gaussian splatting and depth estimation. We first contribute a robust multi-view depth model by leveraging pre-trained monocular depth features. We also show that Gaussian splatting can serve as an unsupervised pre-training objective.
arXiv Detail & Related papers (2024-10-17T17:59:58Z)
Depth-aware Volume Attention for Texture-less Stereo Matching [67.46404479356896]
We propose a lightweight volume refinement scheme to tackle the texture deterioration in practical outdoor scenarios. We introduce a depth volume supervised by the ground-truth depth map, capturing the relative hierarchy of image texture. Local fine structure and context are emphasized to mitigate ambiguity and redundancy during volume aggregation.
arXiv Detail & Related papers (2024-02-14T04:07:44Z)
V-FUSE: Volumetric Depth Map Fusion with Long-Range Constraints [6.7197802356130465]
We introduce a learning-based depth map fusion framework that accepts a set of depth and confidence maps generated by a Multi-View Stereo (MVS) algorithm as input and improves them. We also introduce a depth search window estimation sub-network trained jointly with the larger fusion sub-network to reduce the depth hypothesis search space along each ray. Our method learns to model depth consensus and violations of visibility constraints directly from the data.
arXiv Detail & Related papers (2023-08-17T00:39:56Z)
ProbVLM: Probabilistic Adapter for Frozen Vision-Language Models [69.50316788263433]
We propose ProbVLM, a probabilistic adapter that estimates probability distributions for the embeddings of pre-trained vision-language models. We quantify the calibration of embedding uncertainties in retrieval tasks and show that ProbVLM outperforms other methods. We present a novel technique for visualizing the embedding distributions using a large-scale pre-trained latent diffusion model.
arXiv Detail & Related papers (2023-07-01T18:16:06Z)
HSurf-Net: Normal Estimation for 3D Point Clouds by Learning Hyper Surfaces [54.77683371400133]
We propose a novel normal estimation method called HSurf-Net, which can accurately predict normals from point clouds with noise and density variations. Experimental results show that our HSurf-Net achieves the state-of-the-art performance on the synthetic shape dataset.
arXiv Detail & Related papers (2022-10-13T16:39:53Z)
DS-MVSNet: Unsupervised Multi-view Stereo via Depth Synthesis [11.346448410152844]
In this paper, we propose the DS-MVSNet, an end-to-end unsupervised MVS structure with the source depths synthesis. To mine the information in probability volume, we creatively synthesize the source depths by splattering the probability volume and depth hypotheses to source views. On the other hand, we utilize the source depths to render the reference images and propose depth consistency loss and depth smoothness loss.
arXiv Detail & Related papers (2022-08-13T15:25:51Z)
Non-parametric Depth Distribution Modelling based Depth Inference for Multi-view Stereo [43.415242967722804]
Recent cost volume pyramid based deep neural networks have unlocked the potential of efficiently leveraging high-resolution images for depth inference from multi-view stereo. In general, those approaches assume that the depth of each pixel follows a unimodal distribution. We propose constructing the cost volume by non-parametric depth distribution modeling to handle pixels with unimodal and multi-modal distributions.
arXiv Detail & Related papers (2022-05-08T05:13:04Z)
A Confidence-based Iterative Solver of Depths and Surface Normals for Deep Multi-view Stereo [41.527018997251744]
We introduce a deep multi-view stereo (MVS) system that jointly predicts depths, surface normals and per-view confidence maps. The key to our approach is a novel solver that iteratively solves for per-view depth map and normal map. Our proposed solver consistently improves the depth quality over both conventional and deep learning based MVS pipelines.
arXiv Detail & Related papers (2022-01-19T14:08:45Z)
SurfEmb: Dense and Continuous Correspondence Distributions for Object Pose Estimation with Learnt Surface Embeddings [2.534402217750793]
We present an approach to learn dense, continuous 2D-3D correspondence distributions over the surface of objects from data. We also present a new method for 6D pose estimation of rigid objects using the learnt distributions to sample, score and refine pose hypotheses.
arXiv Detail & Related papers (2021-11-26T13:39:38Z)
SMD-Nets: Stereo Mixture Density Networks [68.56947049719936]
We propose Stereo Mixture Density Networks (SMD-Nets), a simple yet effective learning framework compatible with a wide class of 2D and 3D architectures. Specifically, we exploit bimodal mixture densities as output representation and show that this allows for sharp and precise disparity estimates near discontinuities. We carry out comprehensive experiments on a new high-resolution and highly realistic synthetic stereo dataset, consisting of stereo pairs at 8Mpx resolution, as well as on real-world stereo datasets.
arXiv Detail & Related papers (2021-04-08T16:15:46Z)
PaDiM: a Patch Distribution Modeling Framework for Anomaly Detection and Localization [64.39761523935613]
We present a new framework for Patch Distribution Modeling, PaDiM, to concurrently detect and localize anomalies in images. PaDiM makes use of a pretrained convolutional neural network (CNN) for patch embedding. It also exploits correlations between the different semantic levels of CNN to better localize anomalies.
arXiv Detail & Related papers (2020-11-17T17:29:18Z)
OmniSLAM: Omnidirectional Localization and Dense Mapping for Wide-baseline Multi-camera Systems [88.41004332322788]
We present an omnidirectional localization and dense mapping system for a wide-baseline multiview stereo setup with ultra-wide field-of-view (FOV) fisheye cameras. For more practical and accurate reconstruction, we first introduce improved and light-weighted deep neural networks for the omnidirectional depth estimation. We integrate our omnidirectional depth estimates into the visual odometry (VO) and add a loop closing module for global consistency.
arXiv Detail & Related papers (2020-03-18T05:52:10Z)

This list is automatically generated from the titles and abstracts of the papers in this site.