Learning Deformable Hypothesis Sampling for Accurate PatchMatch
Multi-View Stereo
- URL: http://arxiv.org/abs/2312.15970v1
- Date: Tue, 26 Dec 2023 09:36:21 GMT
- Title: Learning Deformable Hypothesis Sampling for Accurate PatchMatch
Multi-View Stereo
- Authors: Hongjie Li, Yao Guo, Xianwei Zheng, Hanjiang Xiong
- Abstract summary: This paper introduces a learnable Deformable Hypothesis Sampler (DeformSampler) to address the challenging issue of noisy depth estimation.
We develop DeformSampler to learn distribution-sensitive sample spaces to propagate depths consistent with the scene's geometry.
We integrate DeformSampler into a learnable PatchMatch MVS system to enhance depth estimation in challenging areas.
- Score: 4.6332064055042865
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces a learnable Deformable Hypothesis Sampler
(DeformSampler) to address the challenging issue of noisy depth estimation for
accurate PatchMatch Multi-View Stereo (MVS). We observe that the heuristic
depth hypothesis sampling modes employed by PatchMatch MVS solvers are
insensitive to (i) the piece-wise smooth distribution of depths across the
object surface, and (ii) the implicit multi-modal distribution of depth
prediction probabilities along the ray direction on the surface points.
Accordingly, we develop DeformSampler to learn distribution-sensitive sample
spaces to (i) propagate depths consistent with the scene's geometry across the
object surface, and (ii) fit a Laplace Mixture model that approaches the
point-wise probability distribution of the actual depths along the ray
direction. We integrate DeformSampler into a learnable PatchMatch MVS system to
enhance depth estimation in challenging areas, such as piece-wise discontinuous
surface boundaries and weakly-textured regions. Experimental results on DTU and
Tanks & Temples datasets demonstrate its superior performance and
generalization capabilities compared to state-of-the-art competitors. Code is
available at https://github.com/Geo-Tell/DS-PMNet.
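To illustrate the Laplace Mixture idea mentioned in the abstract, the following Python sketch evaluates a mixture of Laplace densities over depth along a single ray and draws depth hypotheses from it. It is not the authors' implementation (see the linked repository for that). The function names, the two-component setup, and all numeric values are assumptions chosen purely for illustration.

```python
import math
import random


def laplace_pdf(d, mu, b):
    """Density of a Laplace distribution with location mu and scale b at depth d."""
    return math.exp(-abs(d - mu) / b) / (2.0 * b)


def laplace_mixture_pdf(d, weights, mus, scales):
    """Weighted sum of Laplace densities; weights are assumed to sum to 1."""
    return sum(w * laplace_pdf(d, m, b) for w, m, b in zip(weights, mus, scales))


def laplace_mixture_nll(d_true, weights, mus, scales, eps=1e-12):
    """Negative log-likelihood of a reference depth under the mixture."""
    return -math.log(laplace_mixture_pdf(d_true, weights, mus, scales) + eps)


def sample_hypotheses(weights, mus, scales, n, rng):
    """Draw n depth hypotheses: pick a component, then invert the Laplace CDF."""
    samples = []
    for _ in range(n):
        k = rng.choices(range(len(weights)), weights=weights)[0]
        u = rng.random() - 0.5  # uniform in [-0.5, 0.5)
        # Guard the logarithm against the edge case u == -0.5.
        tail = max(1.0 - 2.0 * abs(u), 1e-12)
        samples.append(mus[k] - scales[k] * math.copysign(1.0, u) * math.log(tail))
    return samples


# Hypothetical bimodal ray: a sharp foreground mode and a broader background mode.
weights, mus, scales = [0.7, 0.3], [2.0, 5.0], [0.1, 0.5]
hypotheses = sample_hypotheses(weights, mus, scales, 8, random.Random(0))
nll_near_mode = laplace_mixture_nll(2.0, weights, mus, scales)
nll_between_modes = laplace_mixture_nll(3.5, weights, mus, scales)
```

In this toy setup, a depth close to a mixture mode receives a lower negative log-likelihood than a depth between the two modes, which is the behaviour one would want near a depth discontinuity where a single-mode model tends to average foreground and background.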
Related papers
- DepthSplat: Connecting Gaussian Splatting and Depth [90.06180236292866]
In this paper, we present DepthSplat to connect Gaussian splatting and depth estimation.
We first contribute a robust multi-view depth model by leveraging pre-trained monocular depth features.
We also show that Gaussian splatting can serve as an unsupervised pre-training objective.
arXiv Detail & Related papers (2024-10-17T17:59:58Z)
- Depth-aware Volume Attention for Texture-less Stereo Matching [67.46404479356896]
We propose a lightweight volume refinement scheme to tackle the texture deterioration in practical outdoor scenarios.
We introduce a depth volume supervised by the ground-truth depth map, capturing the relative hierarchy of image texture.
Local fine structure and context are emphasized to mitigate ambiguity and redundancy during volume aggregation.
arXiv Detail & Related papers (2024-02-14T04:07:44Z)
- V-FUSE: Volumetric Depth Map Fusion with Long-Range Constraints [6.7197802356130465]
We introduce a learning-based depth map fusion framework that accepts a set of depth and confidence maps generated by a Multi-View Stereo (MVS) algorithm as input and improves them.
We also introduce a depth search window estimation sub-network trained jointly with the larger fusion sub-network to reduce the depth hypothesis search space along each ray.
Our method learns to model depth consensus and violations of visibility constraints directly from the data.
arXiv Detail & Related papers (2023-08-17T00:39:56Z)
- ProbVLM: Probabilistic Adapter for Frozen Vision-Language Models [69.50316788263433]
We propose ProbVLM, a probabilistic adapter that estimates probability distributions for the embeddings of pre-trained vision-language models.
We quantify the calibration of embedding uncertainties in retrieval tasks and show that ProbVLM outperforms other methods.
We present a novel technique for visualizing the embedding distributions using a large-scale pre-trained latent diffusion model.
arXiv Detail & Related papers (2023-07-01T18:16:06Z)
- HSurf-Net: Normal Estimation for 3D Point Clouds by Learning Hyper Surfaces [54.77683371400133]
We propose a novel normal estimation method called HSurf-Net, which can accurately predict normals from point clouds with noise and density variations.
Experimental results show that our HSurf-Net achieves the state-of-the-art performance on the synthetic shape dataset.
arXiv Detail & Related papers (2022-10-13T16:39:53Z)
- DS-MVSNet: Unsupervised Multi-view Stereo via Depth Synthesis [11.346448410152844]
In this paper, we propose the DS-MVSNet, an end-to-end unsupervised MVS structure with the source depths synthesis.
To mine the information in probability volume, we creatively synthesize the source depths by splattering the probability volume and depth hypotheses to source views.
On the other hand, we utilize the source depths to render the reference images and propose depth consistency loss and depth smoothness loss.
arXiv Detail & Related papers (2022-08-13T15:25:51Z)
- Non-parametric Depth Distribution Modelling based Depth Inference for Multi-view Stereo [43.415242967722804]
Recent cost volume pyramid based deep neural networks have unlocked the potential of efficiently leveraging high-resolution images for depth inference from multi-view stereo.
In general, those approaches assume that the depth of each pixel follows a unimodal distribution.
We propose constructing the cost volume by non-parametric depth distribution modeling to handle pixels with unimodal and multi-modal distributions.
arXiv Detail & Related papers (2022-05-08T05:13:04Z)
- A Confidence-based Iterative Solver of Depths and Surface Normals for Deep Multi-view Stereo [41.527018997251744]
We introduce a deep multi-view stereo (MVS) system that jointly predicts depths, surface normals and per-view confidence maps.
The key to our approach is a novel solver that iteratively solves for per-view depth map and normal map.
Our proposed solver consistently improves the depth quality over both conventional and deep learning based MVS pipelines.
arXiv Detail & Related papers (2022-01-19T14:08:45Z)
- SurfEmb: Dense and Continuous Correspondence Distributions for Object Pose Estimation with Learnt Surface Embeddings [2.534402217750793]
We present an approach to learn dense, continuous 2D-3D correspondence distributions over the surface of objects from data.
We also present a new method for 6D pose estimation of rigid objects using the learnt distributions to sample, score and refine pose hypotheses.
arXiv Detail & Related papers (2021-11-26T13:39:38Z)
- SMD-Nets: Stereo Mixture Density Networks [68.56947049719936]
We propose Stereo Mixture Density Networks (SMD-Nets), a simple yet effective learning framework compatible with a wide class of 2D and 3D architectures.
Specifically, we exploit bimodal mixture densities as output representation and show that this allows for sharp and precise disparity estimates near discontinuities.
We carry out comprehensive experiments on a new high-resolution and highly realistic synthetic stereo dataset, consisting of stereo pairs at 8Mpx resolution, as well as on real-world stereo datasets.
arXiv Detail & Related papers (2021-04-08T16:15:46Z)
- PaDiM: a Patch Distribution Modeling Framework for Anomaly Detection and Localization [64.39761523935613]
We present a new framework for Patch Distribution Modeling, PaDiM, to concurrently detect and localize anomalies in images.
PaDiM makes use of a pretrained convolutional neural network (CNN) for patch embedding.
It also exploits correlations between the different semantic levels of CNN to better localize anomalies.
arXiv Detail & Related papers (2020-11-17T17:29:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.