PVSNet: Pixelwise Visibility-Aware Multi-View Stereo Network
- URL: http://arxiv.org/abs/2007.07714v1
- Date: Wed, 15 Jul 2020 14:39:49 GMT
- Title: PVSNet: Pixelwise Visibility-Aware Multi-View Stereo Network
- Authors: Qingshan Xu and Wenbing Tao
- Abstract summary: A Pixelwise Visibility-aware multi-view Stereo Network (PVSNet) is proposed for robust dense 3D reconstruction.
PVSNet is the first deep learning framework that is able to capture the visibility information of different neighboring views.
Experiments show that PVSNet achieves the state-of-the-art performance on different datasets.
- Score: 32.41293572426403
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, learning-based multi-view stereo methods have achieved promising
results. However, they all overlook the visibility difference among different
views, which leads to an indiscriminate multi-view similarity definition and
greatly limits their performance on datasets with strong viewpoint variations.
In this paper, a Pixelwise Visibility-aware multi-view Stereo Network (PVSNet)
is proposed for robust dense 3D reconstruction. We present a pixelwise
visibility network to learn the visibility information for different
neighboring images before computing the multi-view similarity, and then
construct an adaptive weighted cost volume with the visibility information.
Moreover, we present an anti-noise training strategy that introduces disturbing
views during model training to make the pixelwise visibility network more
discriminative against unrelated views, unlike existing learning methods
that use only the two best neighboring views for training. To the
best of our knowledge, PVSNet is the first deep learning framework that is able
to capture the visibility information of different neighboring views. In this
way, our method generalizes well to different types of datasets,
especially the ETH3D high-res benchmark with strong viewpoint variations.
Extensive experiments show that PVSNet achieves the state-of-the-art
performance on different datasets.
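The adaptive weighted cost volume described in the abstract can be sketched as follows. This is an illustrative NumPy implementation under assumed tensor shapes, not the authors' code: per-view pixelwise visibility weights modulate each two-view cost volume before the volumes are fused, so that occluded or unrelated source views contribute less at each reference pixel.

```python
import numpy as np

def weighted_cost_volume(two_view_costs, visibility):
    """Fuse per-view cost volumes with pixelwise visibility weights.

    two_view_costs: (N, D, H, W) matching cost between the reference
        view and each of N source views, over D depth hypotheses.
    visibility: (N, H, W) predicted visibility weight of each source
        view at every reference pixel (higher = more visible).
    Returns a single fused (D, H, W) cost volume.
    """
    # Normalize the weights across views so they sum to 1 per pixel;
    # the epsilon guards pixels where every view is predicted occluded.
    w = visibility / (visibility.sum(axis=0, keepdims=True) + 1e-8)
    # Broadcast weights over the depth dimension and sum over views.
    return (w[:, None, :, :] * two_view_costs).sum(axis=0)
```

With uniform visibility this reduces to the plain per-view average, which is exactly the indiscriminate multi-view similarity the paper argues against; the learned weights are what make the fusion viewpoint-robust.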
Related papers
- Learning-based Multi-View Stereo: A Survey [55.3096230732874]
Multi-View Stereo (MVS) algorithms synthesize a comprehensive 3D representation, enabling precise reconstruction in complex environments.
With the success of deep learning, many learning-based MVS methods have been proposed, achieving impressive performance against traditional methods.
arXiv Detail & Related papers (2024-08-27T17:53:18Z)
- Visibility-Aware Pixelwise View Selection for Multi-View Stereo Matching [9.915386906818485]
We propose a novel visibility-guided pixelwise view selection scheme.
It progressively refines the set of source views to be used for each pixel in the reference view.
In addition, the Artificial Multi-Bee Colony algorithm is employed to search for optimal solutions for different pixels in parallel.
arXiv Detail & Related papers (2023-02-14T16:50:03Z)
- MVTN: Learning Multi-View Transformations for 3D Understanding [60.15214023270087]
We introduce the Multi-View Transformation Network (MVTN), which uses differentiable rendering to determine optimal view-points for 3D shape recognition.
MVTN can be trained end-to-end with any multi-view network for 3D shape recognition.
Our approach demonstrates state-of-the-art performance in 3D classification and shape retrieval on several benchmarks.
arXiv Detail & Related papers (2022-12-27T12:09:16Z)
- Peripheral Vision Transformer [52.55309200601883]
We take a biologically inspired approach and explore modeling peripheral vision in deep neural networks for visual recognition.
We propose to incorporate peripheral position encoding into the multi-head self-attention layers to let the network learn to partition the visual field into diverse peripheral regions given training data.
We evaluate the proposed network, dubbed PerViT, on the large-scale ImageNet dataset and systematically investigate the inner workings of the model for machine perception.
arXiv Detail & Related papers (2022-06-14T12:47:47Z)
- Voint Cloud: Multi-View Point Cloud Representation for 3D Understanding [80.04281842702294]
We introduce the concept of the multi-view point cloud (Voint cloud) representing each 3D point as a set of features extracted from several view-points.
This novel 3D Voint cloud representation combines the compactness of 3D point cloud representation with the natural view-awareness of multi-view representation.
We deploy a Voint neural network (VointNet) with a theoretically established functional form to learn representations in the Voint space.
arXiv Detail & Related papers (2021-11-30T13:08:19Z)
- Vis2Mesh: Efficient Mesh Reconstruction from Unstructured Point Clouds of Large Scenes with Learned Virtual View Visibility [17.929307870456416]
We present a novel framework for mesh reconstruction from unstructured point clouds.
We take advantage of the learned visibility of the 3D points in the virtual views and traditional graph-cut based mesh generation.
arXiv Detail & Related papers (2021-08-18T20:28:16Z)
- Weak Multi-View Supervision for Surface Mapping Estimation [0.9367260794056769]
We propose a weakly-supervised multi-view learning approach to learn category-specific surface mapping without dense annotations.
We learn the underlying surface geometry of common categories, such as human faces, cars, and airplanes, given instances from those categories.
arXiv Detail & Related papers (2021-05-04T09:46:26Z)
- Contrastive Spatial Reasoning on Multi-View Line Drawings [11.102238863932255]
State-of-the-art supervised deep networks show puzzlingly low performance on the SPARE3D dataset.
We propose a simple contrastive learning approach along with other network modifications to improve the baseline performance.
Our approach uses a self-supervised binary classification network to compare the line drawing differences between various views of any two similar 3D objects.
arXiv Detail & Related papers (2021-04-27T19:05:27Z)
- MVTN: Multi-View Transformation Network for 3D Shape Recognition [80.34385402179852]
We introduce the Multi-View Transformation Network (MVTN) that regresses optimal view-points for 3D shape recognition.
MVTN can be trained end-to-end along with any multi-view network for 3D shape classification.
MVTN exhibits clear performance gains in the tasks of 3D shape classification and 3D shape retrieval without the need for extra training supervision.
arXiv Detail & Related papers (2020-11-26T11:33:53Z)
- Embedded Deep Bilinear Interactive Information and Selective Fusion for Multi-view Learning [70.67092105994598]
We propose a novel multi-view learning framework for improved multi-view classification.
In particular, we train different deep neural networks to learn various intra-view representations.
Experiments on six publicly available datasets demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2020-07-13T01:13:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.