Deep Multi-View Stereo gone wild
- URL: http://arxiv.org/abs/2104.15119v1
- Date: Fri, 30 Apr 2021 17:07:17 GMT
- Title: Deep Multi-View Stereo gone wild
- Authors: François Darmon, Bénédicte Bascle, Jean-Clément Devaux, Pascal Monasse and Mathieu Aubry
- Abstract summary: Deep multi-view stereo (deep MVS) methods have been developed and extensively compared on simple datasets.
In this paper, we ask whether the conclusions reached in controlled scenarios are still valid when working with Internet photo collections.
We propose a methodology for evaluation and explore the influence of three aspects of deep MVS methods: network architecture, training data, and supervision.
- Score: 12.106051690920266
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Deep multi-view stereo (deep MVS) methods have been developed and extensively
compared on simple datasets, where they now outperform classical approaches. In
this paper, we ask whether the conclusions reached in controlled scenarios are
still valid when working with Internet photo collections. We propose a
methodology for evaluation and explore the influence of three aspects of deep
MVS methods: network architecture, training data, and supervision. We make
several key observations, which we extensively validate quantitatively and
qualitatively, both for depth prediction and complete 3D reconstructions.
First, we outline the promises of unsupervised techniques by introducing a
simple approach which provides more complete reconstructions than supervised
options when using a simple network architecture. Second, we emphasize that not
all multiscale architectures generalize to the unconstrained scenario,
especially without supervision. Finally, we show the efficiency of noisy
supervision from large-scale 3D reconstructions which can even lead to networks
that outperform classical methods in scenarios where very few images are
available.
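The unsupervised techniques highlighted in the abstract typically replace depth supervision with a multi-view photometric consistency loss: the predicted depth is used to warp one view into another, and the discrepancy is penalized. Below is a minimal, hypothetical PyTorch sketch of such a loss; all names are illustrative, and a real formulation additionally has to deal with occlusions and untextured areas, which this sketch ignores.

```python
import torch
import torch.nn.functional as F

def photometric_loss(ref_img, src_img, depth, K, T_src_ref):
    """Hypothetical unsupervised MVS loss: warp the source view into the
    reference view through the predicted depth, compare photometrically.

    ref_img, src_img: (B, 3, H, W) images
    depth:            (B, 1, H, W) depth predicted for the reference view
    K:                (B, 3, 3) shared intrinsics
    T_src_ref:        (B, 4, 4) pose mapping reference coords to source coords
    """
    B, _, H, W = ref_img.shape
    device = ref_img.device

    # Pixel grid in homogeneous coordinates: (B, 3, H*W)
    ys, xs = torch.meshgrid(
        torch.arange(H, device=device, dtype=torch.float32),
        torch.arange(W, device=device, dtype=torch.float32),
        indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).view(1, 3, -1).expand(B, -1, -1)

    # Back-project to reference camera space, then move to the source camera.
    cam = torch.linalg.inv(K) @ pix * depth.view(B, 1, -1)          # (B, 3, H*W)
    cam_h = torch.cat([cam, torch.ones(B, 1, H * W, device=device)], dim=1)
    src_cam = (T_src_ref @ cam_h)[:, :3]                            # (B, 3, H*W)

    # Project into the source image and normalize to [-1, 1] for grid_sample.
    proj = K @ src_cam
    uv = proj[:, :2] / proj[:, 2:].clamp(min=1e-6)
    u = 2.0 * uv[:, 0] / (W - 1) - 1.0
    v = 2.0 * uv[:, 1] / (H - 1) - 1.0
    grid = torch.stack([u, v], dim=-1).view(B, H, W, 2)
    warped = F.grid_sample(src_img, grid, align_corners=True)

    # Ignore pixels that fall outside the source view.
    valid = ((u.abs() <= 1) & (v.abs() <= 1)).view(B, 1, H, W).float()
    return ((warped - ref_img).abs() * valid).sum() / valid.sum().clamp(min=1.0)
```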
Related papers
- Learning-based Multi-View Stereo: A Survey [55.3096230732874]
Multi-View Stereo (MVS) algorithms synthesize a comprehensive 3D representation, enabling precise reconstruction in complex environments.
With the success of deep learning, many learning-based MVS methods have been proposed, achieving impressive performance against traditional methods.
arXiv Detail & Related papers (2024-08-27T17:53:18Z)
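A building block shared by most of the learning-based MVS networks such surveys cover is depth regression from a matching-cost volume via a differentiable soft argmin, as popularized by MVSNet. A minimal sketch, assuming a cost volume where lower values mean better matches:

```python
import torch

def soft_argmin_depth(cost, depths):
    """Differentiable depth regression used by many learning-based MVS nets:
    convert a matching-cost volume into per-pixel expected depth.

    cost:   (B, D, H, W) matching cost, lower = better match
    depths: (D,) tensor of depth hypotheses used to build the volume
    """
    prob = torch.softmax(-cost, dim=1)                    # costs -> probabilities
    return (prob * depths.view(1, -1, 1, 1)).sum(dim=1)  # (B, H, W) expected depth
```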
- OccNeRF: Advancing 3D Occupancy Prediction in LiDAR-Free Environments [77.0399450848749]
We propose an OccNeRF method for training occupancy networks without 3D supervision.
We parameterize the reconstructed occupancy fields and reorganize the sampling strategy to align with the cameras' infinite perceptive range.
For semantic occupancy prediction, we design several strategies to polish the prompts and filter the outputs of a pretrained open-vocabulary 2D segmentation model.
arXiv Detail & Related papers (2023-12-14T18:58:52Z)
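To cover an unbounded outdoor scene with a finite occupancy grid, a common parameterization contracts far-away points into a bounded shell. The sketch below follows the generic scene contraction of mip-NeRF 360; OccNeRF's exact parameterization and sampling strategy may differ.

```python
import torch

def contract(x, radius=1.0):
    """Map unbounded 3D points into a bounded ball so a finite-resolution
    occupancy grid can cover the cameras' unbounded viewing range.
    Points inside `radius` are kept; the outside is squashed into a shell.
    Generic contraction in the spirit of mip-NeRF 360, not OccNeRF's code.
    """
    norm = x.norm(dim=-1, keepdim=True).clamp(min=1e-6)
    squashed = (2.0 - radius / norm) * (x / norm) * radius
    return torch.where(norm <= radius, x, squashed)
```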
- Multi-View Guided Multi-View Stereo [39.116228971420874]
This paper introduces a novel deep framework for dense 3D reconstruction from multiple image frames.
Given a deep multi-view stereo network, our framework uses sparse depth hints to guide the neural network.
We evaluate our Multi-View Guided framework within a variety of state-of-the-art deep multi-view stereo networks.
arXiv Detail & Related papers (2022-10-20T17:59:18Z)
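One common way to inject sparse depth hints into a deep MVS network, in the spirit of guided stereo matching, is to modulate the cost volume with a Gaussian centered on the hinted depth. A hypothetical sketch; the modulation form and parameters are illustrative, not the paper's code:

```python
import torch

def guide_cost_volume(cost, hints, mask, depths, k=10.0, sigma=1.0):
    """Boost cost-volume hypotheses near a hinted depth wherever a hint exists.

    cost:   (B, D, H, W) matching scores (higher = better here)
    hints:  (B, 1, H, W) sparse depth values
    mask:   (B, 1, H, W) binary map marking pixels that carry a hint
    depths: (D,) depth hypotheses
    """
    d = depths.view(1, -1, 1, 1)
    gauss = 1.0 + k * torch.exp(-((d - hints) ** 2) / (2.0 * sigma ** 2))
    # Apply the Gaussian peak only where a hint is available.
    return cost * torch.where(mask.bool(), gauss, torch.ones_like(gauss))
```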
- End-to-End Multi-View Structure-from-Motion with Hypercorrelation Volumes [7.99536002595393]
Deep learning techniques have been proposed to tackle the structure-from-motion problem.
We improve on the state-of-the-art two-view structure-from-motion (SfM) approach.
We extend it to the general multi-view case and evaluate it on the complex benchmark dataset DTU.
arXiv Detail & Related papers (2022-09-14T20:58:44Z)
- 3DVNet: Multi-View Depth Prediction and Volumetric Refinement [68.68537312256144]
3DVNet is a novel multi-view stereo (MVS) depth-prediction method.
Our key idea is the use of a 3D scene-modeling network that iteratively updates a set of coarse depth predictions.
We show that our method exceeds state-of-the-art accuracy in both depth prediction and 3D reconstruction metrics.
arXiv Detail & Related papers (2021-12-01T00:52:42Z)
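The iterative update of coarse depth predictions can be pictured as a refinement loop in which a scene-level network predicts residual corrections. The sketch below is purely illustrative: `scene_net` stands in for the paper's 3D scene-modeling network and is reduced to a tiny 2D CNN so the example runs end to end.

```python
import torch
import torch.nn as nn

class IterativeDepthRefiner(nn.Module):
    """Hypothetical 3DVNet-style refinement: repeatedly predict residual
    corrections to a set of coarse per-view depth maps."""
    def __init__(self, iters=3):
        super().__init__()
        self.iters = iters
        self.scene_net = nn.Sequential(          # stand-in for a 3D scene net
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )

    def forward(self, coarse_depths):
        # coarse_depths: (B, V, H, W), one coarse map per view.
        B, V, H, W = coarse_depths.shape
        depths = coarse_depths
        for _ in range(self.iters):
            residual = self.scene_net(depths.view(B * V, 1, H, W))
            depths = depths + residual.view(B, V, H, W)  # iterative update
        return depths
```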
- VolumeFusion: Deep Depth Fusion for 3D Scene Reconstruction [71.83308989022635]
In this paper, we advocate that replicating the traditional two-stage framework with deep neural networks improves both the interpretability and the accuracy of the results.
Our network operates in two steps: 1) local computation of depth maps with a deep MVS technique, and 2) fusion of the depth maps and image features into a single TSDF volume.
In order to improve the matching performance between images acquired from very different viewpoints, we introduce a rotation-invariant 3D convolution kernel called PosedConv.
arXiv Detail & Related papers (2021-08-19T11:33:58Z)
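The second stage the VolumeFusion abstract describes, fusing depth maps into a single TSDF volume, follows the classical volumetric integration recipe. A minimal NumPy sketch of plain TSDF fusion; the paper's learned fusion also aggregates image features, which this version omits:

```python
import numpy as np

def fuse_tsdf(depth_maps, Ks, poses, grid_size=128, extent=2.0, trunc=0.05):
    """Integrate posed depth maps into one truncated signed distance volume.

    depth_maps: list of (H, W) arrays
    Ks:         list of (3, 3) intrinsics
    poses:      list of (4, 4) world-to-camera transforms
    """
    lin = np.linspace(-extent / 2, extent / 2, grid_size)
    xs, ys, zs = np.meshgrid(lin, lin, lin, indexing="ij")
    pts = np.stack([xs, ys, zs, np.ones_like(xs)], -1).reshape(-1, 4)

    tsdf = np.ones(pts.shape[0])
    weight = np.zeros(pts.shape[0])
    for depth, K, T in zip(depth_maps, Ks, poses):
        H, W = depth.shape
        cam = (T @ pts.T).T[:, :3]                  # voxel centers in camera frame
        z = cam[:, 2]
        uv = (K @ cam.T).T
        u = np.round(uv[:, 0] / np.maximum(z, 1e-6)).astype(int)
        v = np.round(uv[:, 1] / np.maximum(z, 1e-6)).astype(int)
        ok = (z > 0) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
        sdf = depth[v[ok], u[ok]] - z[ok]           # signed distance along the ray
        near = sdf > -trunc                         # skip voxels far behind surface
        idx = np.flatnonzero(ok)[near]
        d = np.clip(sdf[near] / trunc, -1.0, 1.0)
        # Running weighted average across views.
        tsdf[idx] = (tsdf[idx] * weight[idx] + d) / (weight[idx] + 1)
        weight[idx] += 1
    return tsdf.reshape(grid_size, grid_size, grid_size)
```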
- Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks [87.50632573601283]
We present a novel method for multi-view depth estimation from a single video.
Our method achieves temporally coherent depth estimation results by using a novel Epipolar Spatio-Temporal (EST) transformer.
To reduce the computational cost, inspired by recent Mixture-of-Experts models, we design a compact hybrid network.
arXiv Detail & Related papers (2020-11-26T04:04:21Z)
- Reversing the cycle: self-supervised deep stereo through enhanced monocular distillation [51.714092199995044]
In many fields, self-supervised learning solutions are rapidly evolving and filling the gap with supervised approaches.
We propose a novel self-supervised paradigm that reverses the usual link between stereo and monocular depth estimation.
In order to train deep stereo networks, we distill knowledge through a monocular completion network.
arXiv Detail & Related papers (2020-08-17T07:40:22Z)
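The reversed supervision described above boils down to training the stereo network on proxy labels produced by the monocular completion network, keeping only pixels deemed reliable. A hedged sketch; `confidence` and the threshold `tau` are illustrative stand-ins for the paper's filtering:

```python
import torch

def distillation_loss(stereo_disp, mono_disp, confidence, tau=0.5):
    """Train a stereo network on proxy labels from a monocular network.

    stereo_disp, mono_disp, confidence: (B, 1, H, W)
    """
    keep = (confidence > tau).float()                 # trust only confident pixels
    err = (stereo_disp - mono_disp.detach()).abs()    # proxy labels are not optimized
    return (err * keep).sum() / keep.sum().clamp(min=1.0)
```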
- Towards Better Generalization: Joint Depth-Pose Learning without PoseNet [36.414471128890284]
We tackle the essential problem of scale inconsistency for self-supervised joint depth-pose learning.
Most existing methods assume that a consistent scale of depth and pose can be learned across all input samples.
We propose a novel system that explicitly disentangles scale from the network estimation.
arXiv Detail & Related papers (2020-04-03T00:28:09Z)
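Explicitly disentangling scale can be done by rescaling the predicted depth to agree with depth triangulated from two-view geometry before computing losses. A minimal sketch; the median-based alignment is an illustrative choice, not necessarily the paper's exact mechanism:

```python
import torch

def align_scale(pred_depth, triangulated_depth, valid):
    """Rescale network depth so its median matches triangulated depth,
    instead of hoping one consistent scale is learned across all samples.

    pred_depth, triangulated_depth: (H, W)
    valid: (H, W) boolean mask of pixels with reliable triangulation
    """
    s = triangulated_depth[valid].median() / pred_depth[valid].median().clamp(min=1e-6)
    return pred_depth * s
```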