Revisiting PatchMatch Multi-View Stereo for Urban 3D Reconstruction
- URL: http://arxiv.org/abs/2207.08439v1
- Date: Mon, 18 Jul 2022 08:45:54 GMT
- Title: Revisiting PatchMatch Multi-View Stereo for Urban 3D Reconstruction
- Authors: Marco Orsingher, Paolo Zani, Paolo Medici, Massimo Bertozzi
- Abstract summary: A complete pipeline for image-based 3D reconstruction of urban scenarios is proposed, based on PatchMatch Multi-View Stereo (MVS).
The proposed approach is carefully evaluated against both classical MVS algorithms and monocular depth networks on the KITTI dataset.
- Score: 1.1011268090482573
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, a complete pipeline for image-based 3D reconstruction of urban
scenarios is proposed, based on PatchMatch Multi-View Stereo (MVS). Input
images are first fed into an off-the-shelf visual SLAM system to extract
camera poses and sparse keypoints, which are used to initialize PatchMatch
optimization. Then, pixelwise depths and normals are iteratively computed in a
multi-scale framework with a novel depth-normal consistency loss term and a
global refinement algorithm to balance the inherently local nature of
PatchMatch. Finally, a large-scale point cloud is generated by back-projecting
multi-view consistent estimates in 3D. The proposed approach is carefully
evaluated against both classical MVS algorithms and monocular depth networks on
the KITTI dataset, showing state-of-the-art performance.
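The final step of the pipeline described in the abstract, back-projecting depth estimates into 3D, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a pinhole camera with intrinsics K, depth measured along the camera z-axis, integer pixel coordinates, and a camera-to-world pose such as one exported by a SLAM system. The function name `backproject_depth` and the `valid` mask (standing in for the multi-view consistency check) are hypothetical.

```python
import numpy as np


def backproject_depth(depth, K, cam_to_world, valid=None):
    """Lift a depth map to world-space 3D points (pinhole model, assumed).

    depth:        (H, W) array, depth along the camera z-axis
    K:            (3, 3) intrinsic matrix
    cam_to_world: (4, 4) camera-to-world pose
    valid:        optional (H, W) boolean mask of pixels to keep;
                  defaults to pixels with positive depth
    Returns an (N, 3) array of points in world coordinates.
    """
    if valid is None:
        valid = depth > 0
    # Row (v) and column (u) indices of the pixels to keep, in row-major order.
    v, u = np.nonzero(valid)
    z = depth[v, u]
    # Homogeneous pixel coordinates, shape (3, N).
    pixels = np.stack([u, v, np.ones_like(u)], axis=0).astype(np.float64)
    # Rays with unit z-component in the camera frame, scaled by depth.
    pts_cam = (np.linalg.inv(K) @ pixels) * z
    # Rigid transform into the world frame.
    R, t = cam_to_world[:3, :3], cam_to_world[:3, 3]
    return (R @ pts_cam).T + t
```

Concatenating the output over all frames would give a single point cloud; in this sketch, any filtering of inconsistent estimates is assumed to happen beforehand and to be passed in through `valid`.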
Related papers
- GEOcc: Geometrically Enhanced 3D Occupancy Network with Implicit-Explicit Depth Fusion and Contextual Self-Supervision [49.839374549646884]
This paper presents GEOcc, a Geometric-Enhanced Occupancy network tailored for vision-only surround-view perception.
Our approach achieves state-of-the-art performance on the Occ3D-nuScenes dataset with the lowest required image resolution and the most lightweight image backbone.
arXiv Detail & Related papers (2024-05-17T07:31:20Z)
- SimpleMapping: Real-Time Visual-Inertial Dense Mapping with Deep Multi-View Stereo [13.535871843518953]
We present a real-time visual-inertial dense mapping method with high quality using only monocular images and IMU readings.
We propose a sparse point aided stereo neural network (SPA-MVSNet) that can effectively leverage the informative but noisy sparse points from the VIO system.
Our proposed dense mapping system achieves a 39.7% improvement in F-score over existing systems when evaluated on the challenging scenarios of the EuRoC dataset.
arXiv Detail & Related papers (2023-06-14T17:28:45Z)
- TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view Stereo [55.30992853477754]
We present TANDEM, a real-time monocular tracking and dense mapping framework.
For pose estimation, TANDEM performs photometric bundle adjustment based on a sliding window of alignments.
TANDEM shows state-of-the-art real-time 3D reconstruction performance.
arXiv Detail & Related papers (2021-11-14T19:01:02Z)
- VolumeFusion: Deep Depth Fusion for 3D Scene Reconstruction [71.83308989022635]
In this paper, we advocate that replicating the traditional two-stage framework with deep neural networks improves both the interpretability and the accuracy of the results.
Our network operates in two steps: 1) the local computation of depth maps with a deep MVS technique, and 2) the fusion of the depth maps and image features to build a single TSDF volume.
In order to improve the matching performance between images acquired from very different viewpoints, we introduce a rotation-invariant 3D convolution kernel called PosedConv.
arXiv Detail & Related papers (2021-08-19T11:33:58Z)
- Pixel-Perfect Structure-from-Motion with Featuremetric Refinement [96.73365545609191]
We refine two key steps of structure-from-motion by a direct alignment of low-level image information from multiple views.
This significantly improves the accuracy of camera poses and scene geometry for a wide range of keypoint detectors.
Our system easily scales to large image collections, enabling pixel-perfect crowd-sourced localization at scale.
arXiv Detail & Related papers (2021-08-18T17:58:55Z)
- Multi-View Multi-Person 3D Pose Estimation with Plane Sweep Stereo [71.59494156155309]
Existing approaches for multi-view 3D pose estimation explicitly establish cross-view correspondences to group 2D pose detections from multiple camera views.
We present our multi-view 3D pose estimation approach based on plane sweep stereo to jointly address the cross-view fusion and 3D pose reconstruction in a single shot.
arXiv Detail & Related papers (2021-04-06T03:49:35Z)
- 3D Surface Reconstruction From Multi-Date Satellite Images [11.84274417463238]
We propose an extension of a Structure from Motion (SfM) based pipeline that allows us to reconstruct point clouds from multiple satellite images.
We provide a detailed description of several steps that are mandatory to exploit state-of-the-art mesh reconstruction algorithms in the context of satellite imagery.
We show that the proposed pipeline combined with current meshing algorithms outperforms state-of-the-art point cloud reconstruction algorithms in terms of completeness and median error.
arXiv Detail & Related papers (2021-02-04T09:23:21Z)
- Ladybird: Quasi-Monte Carlo Sampling for Deep Implicit Field Based 3D Reconstruction with Symmetry [12.511526058118143]
We propose a sampling scheme that theoretically encourages generalization and results in fast convergence for SGD-based optimization algorithms.
Based on the reflective symmetry of an object, we propose a feature fusion method that alleviates issues due to self-occlusions.
Our proposed system Ladybird is able to create high quality 3D object reconstructions from a single input image.
arXiv Detail & Related papers (2020-07-27T09:17:00Z)
- Multi view stereo with semantic priors [3.756550107432323]
We aim to support the standard dense 3D reconstruction of scenes as implemented in the open source library OpenMVS by using semantic priors.
We impose extra semantic constraints in order to remove possible errors and selectively obtain segmented point clouds per label.
arXiv Detail & Related papers (2020-07-05T11:30:29Z)
- Deep 3D Capture: Geometry and Reflectance from Sparse Multi-View Images [59.906948203578544]
We introduce a novel learning-based method to reconstruct the high-quality geometry and complex, spatially-varying BRDF of an arbitrary object.
We first estimate per-view depth maps using a deep multi-view stereo network.
These depth maps are used to coarsely align the different views.
We propose a novel multi-view reflectance estimation network architecture.
arXiv Detail & Related papers (2020-03-27T21:28:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.