MeshMVS: Multi-View Stereo Guided Mesh Reconstruction
- URL: http://arxiv.org/abs/2010.08682v3
- Date: Sun, 11 Apr 2021 16:14:31 GMT
- Title: MeshMVS: Multi-View Stereo Guided Mesh Reconstruction
- Authors: Rakesh Shrestha, Zhiwen Fan, Qingkun Su, Zuozhuo Dai, Siyu Zhu, Ping
Tan
- Abstract summary: Deep learning based 3D shape generation methods generally utilize latent features extracted from color images to encode the semantics of objects.
We propose a multi-view mesh generation method which incorporates geometry information explicitly by using the features from intermediate depth representations of multi-view stereo.
We outperform state-of-the-art multi-view shape generation methods, with a 34% decrease in Chamfer distance to ground truth and a 14% increase in F1-score on the ShapeNet dataset.
- Score: 35.763452474239955
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning based 3D shape generation methods generally utilize latent
features extracted from color images to encode the semantics of objects and
guide the shape generation process. These color image semantics only implicitly
encode 3D information, potentially limiting the accuracy of the generated
shapes. In this paper we propose a multi-view mesh generation method which
incorporates geometry information explicitly by using the features from
intermediate depth representations of multi-view stereo and regularizing the 3D
shapes against these depth images. First, our system predicts a coarse 3D
volume from the color images by probabilistically merging voxel occupancy grids
predicted from the individual views. Then the depth images from multi-view
stereo, along with the rendered depth images of the coarse shape, are used as a
contrastive input whose features guide the refinement of the coarse shape
through a series of graph convolution networks. Notably, we achieve superior
results compared to state-of-the-art multi-view shape generation methods, with a
34% decrease in Chamfer distance to ground truth and a 14% increase in F1-score
on the ShapeNet dataset. Our source code is available at https://git.io/Jmalg
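To make the coarse stage of the two-stage pipeline concrete, here is a minimal sketch of merging per-view voxel occupancy predictions into a single probabilistic volume. The log-odds averaging rule, tensor shapes, and the 0.5 threshold are assumptions for illustration, not the authors' exact formulation.

```python
import torch

def fuse_occupancy(per_view_logits: torch.Tensor) -> torch.Tensor:
    """Probabilistically merge per-view voxel occupancy predictions.

    per_view_logits: (V, D, H, W) occupancy logits, one grid per input view,
    all expressed in a shared canonical frame. Averaging the log-odds is one
    simple probabilistic merge rule (a geometric-mean-style pooling of the
    per-view occupancy probabilities); the paper's exact rule may differ.
    """
    return torch.sigmoid(per_view_logits.mean(dim=0))  # (D, H, W), values in [0, 1]

# A coarse mesh can then be extracted from the fused volume, e.g. by
# thresholding the occupancy at 0.5 and running marching cubes.
```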
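The refinement stage pairs the MVS depth with the rendered depth of the coarse shape and feeds the resulting features to graph convolutions over the mesh. The sketch below shows only the plumbing for a single view and a single refinement step; the layer sizes, the `project` callable, and the specific graph-convolution form are illustrative assumptions rather than the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DepthGuidedRefiner(nn.Module):
    """Sketch: extract features from a 2-channel 'contrastive' depth input
    (MVS depth stacked with the rendered depth of the coarse shape), sample
    them per vertex, and apply one graph-convolution update to the mesh."""

    def __init__(self, feat_dim: int = 32):
        super().__init__()
        # 2 input channels: MVS-predicted depth and rendered coarse-shape depth
        self.depth_cnn = nn.Sequential(
            nn.Conv2d(2, feat_dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_dim, feat_dim, 3, padding=1), nn.ReLU(),
        )
        # Separate weights for a vertex and for its aggregated neighbours
        self.w_self = nn.Linear(feat_dim + 3, 3)
        self.w_neigh = nn.Linear(feat_dim + 3, 3)

    def forward(self, verts, adj, mvs_depth, rendered_depth, project):
        """verts: (V, 3) coarse mesh vertices
        adj: (V, V) row-normalised mesh adjacency matrix
        mvs_depth, rendered_depth: (1, 1, H, W) depth maps for one view
        project: callable mapping (V, 3) vertices to (V, 2) image coordinates
                 in [-1, 1] (a hypothetical camera-projection helper)
        """
        contrastive = torch.cat([mvs_depth, rendered_depth], dim=1)  # (1, 2, H, W)
        feats = self.depth_cnn(contrastive)                          # (1, C, H, W)
        grid = project(verts).view(1, 1, -1, 2)                      # (1, 1, V, 2)
        vert_feats = F.grid_sample(feats, grid, align_corners=True)  # (1, C, 1, V)
        vert_feats = vert_feats.squeeze(0).squeeze(1).t()            # (V, C)
        x = torch.cat([vert_feats, verts], dim=1)                    # (V, C + 3)
        offset = self.w_self(x) + self.w_neigh(adj @ x)              # one graph conv step
        return verts + offset                                        # refined vertices
```

In the paper this idea is applied across multiple views and through a series of graph convolution networks, with the predicted shapes also regularized against the depth images; the sketch omits both for brevity.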
Related papers
- MVD-Fusion: Single-view 3D via Depth-consistent Multi-view Generation [54.27399121779011]
We present MVD-Fusion: a method for single-view 3D inference via generative modeling of multi-view-consistent RGB-D images.
We show that our approach can yield more accurate synthesis compared to recent state-of-the-art, including distillation-based 3D inference and prior multi-view generation methods.
arXiv Detail & Related papers (2024-04-04T17:59:57Z)
- FlexiDreamer: Single Image-to-3D Generation with FlexiCubes [20.871847154995688]
FlexiDreamer is a novel framework that directly reconstructs high-quality meshes from multi-view generated images.
Our approach can generate high-fidelity 3D meshes in the single image-to-3D downstream task in approximately 1 minute.
arXiv Detail & Related papers (2024-04-01T08:20:18Z)
- Efficient Textured Mesh Recovery from Multiple Views with Differentiable Rendering [8.264851594332677]
We propose an efficient coarse-to-fine approach to recover the textured mesh from multi-view images.
We optimize the shape geometry by minimizing the difference between the depth rendered from the mesh and the depth predicted by a learning-based multi-view stereo algorithm (see the sketch after this entry).
In contrast to implicit neural representations of shape and color, we introduce a physically based inverse rendering scheme to jointly estimate the lighting and reflectance of the objects.
arXiv Detail & Related papers (2022-05-25T03:33:55Z)
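As referenced above, here is a hedged sketch of such a depth-consistency objective: an L1 discrepancy between the depth rendered from the current mesh and the MVS-predicted depth, restricted to a validity mask. The `render_depth` call in the commented loop is a hypothetical stand-in for whatever differentiable rasterizer is used, not a real API.

```python
import torch

def depth_consistency_loss(rendered_depth: torch.Tensor,
                           mvs_depth: torch.Tensor,
                           valid_mask: torch.Tensor) -> torch.Tensor:
    """Mean L1 difference between mesh-rendered depth and MVS-predicted depth,
    computed only where the MVS prediction is considered reliable.
    All inputs are (B, H, W); valid_mask is boolean."""
    diff = (rendered_depth - mvs_depth).abs() * valid_mask
    return diff.sum() / valid_mask.sum().clamp(min=1)

# Hypothetical optimization loop over the mesh vertices:
# for step in range(num_steps):
#     loss = sum(depth_consistency_loss(render_depth(verts, faces, cam), d, m)
#                for cam, d, m in views)
#     loss.backward(); optimizer.step(); optimizer.zero_grad()
```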
- Pixel2Mesh++: 3D Mesh Generation and Refinement from Multi-View Images [82.32776379815712]
We study the problem of shape generation in 3D mesh representation from a small number of color images with or without camera poses.
We further improve the shape quality by leveraging cross-view information with a graph convolution network.
Our model is robust to the quality of the initial mesh and the error of camera pose, and can be combined with a differentiable renderer for test-time optimization.
arXiv Detail & Related papers (2022-04-21T03:42:31Z)
- Beyond 3DMM: Learning to Capture High-fidelity 3D Face Shape [77.95154911528365]
3D Morphable Model (3DMM) fitting has widely benefited face analysis due to its strong 3D prior.
Previously reconstructed 3D faces suffer from degraded visual verisimilitude due to the loss of fine-grained geometry.
This paper proposes a complete solution to capture the personalized shape so that the reconstructed shape looks identical to the corresponding person.
arXiv Detail & Related papers (2022-04-09T03:46:18Z)
- Implicit Neural Deformation for Multi-View Face Reconstruction [43.88676778013593]
We present a new method for 3D face reconstruction from multi-view RGB images.
Unlike previous methods which are built upon 3D morphable models, our method leverages an implicit representation to encode rich geometric features.
Our experimental results on several benchmark datasets demonstrate that our approach outperforms alternative baselines and achieves superior face reconstruction results compared to state-of-the-art methods.
arXiv Detail & Related papers (2021-12-05T07:02:53Z)
- Extracting Triangular 3D Models, Materials, and Lighting From Images [59.33666140713829]
We present an efficient method for joint optimization of materials and lighting from multi-view image observations.
We leverage meshes with spatially-varying materials and environment lighting that can be deployed in any traditional graphics engine.
arXiv Detail & Related papers (2021-11-24T13:58:20Z)
- Neural Radiance Fields Approach to Deep Multi-View Photometric Stereo [103.08512487830669]
We present a modern solution to the multi-view photometric stereo problem (MVPS).
We procure the surface orientation using a photometric stereo (PS) image formation model and blend it with a multi-view neural radiance field representation to recover the object's surface geometry.
Our method performs neural rendering of multi-view images while utilizing surface normals estimated by a deep photometric stereo network.
arXiv Detail & Related papers (2021-10-11T20:20:03Z)
- Weakly Supervised Learning of Multi-Object 3D Scene Decompositions Using Deep Shape Priors [69.02332607843569]
PriSMONet is a novel approach for learning Multi-Object 3D scene decomposition and representations from single images.
A recurrent encoder regresses a latent representation of the 3D shape, pose and texture of each object from an input RGB image (see the sketch after this entry).
We evaluate the accuracy of our model in inferring 3D scene layout, demonstrate its generative capabilities, assess its generalization to real images, and point out benefits of the learned representation.
arXiv Detail & Related papers (2020-10-08T14:49:23Z)
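The recurrent per-object encoder mentioned above can be sketched as follows. The backbone, slot count, latent sizes, and pose parameterization are illustrative assumptions; this is not PriSMONet's actual architecture, only the general pattern of a GRU emitting one shape/pose/texture latent per object step.

```python
import torch
import torch.nn as nn

class RecurrentObjectEncoder(nn.Module):
    """Sketch: a CNN summarizes the image, a GRU cell emits one latent per
    object slot, and small heads split each slot latent into shape, pose and
    texture codes. All sizes are illustrative."""

    def __init__(self, num_slots=4, hidden=256, shape_dim=64, tex_dim=32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, hidden),
        )
        self.gru = nn.GRUCell(hidden, hidden)
        self.shape_head = nn.Linear(hidden, shape_dim)
        self.pose_head = nn.Linear(hidden, 7)      # translation + quaternion (assumed)
        self.tex_head = nn.Linear(hidden, tex_dim)
        self.num_slots = num_slots

    def forward(self, image):
        """image: (B, 3, H, W) -> list of per-object (shape, pose, texture) codes."""
        feat = self.backbone(image)                 # (B, hidden)
        h = torch.zeros_like(feat)
        objects = []
        for _ in range(self.num_slots):
            h = self.gru(feat, h)                   # one object slot per recurrent step
            objects.append((self.shape_head(h),
                            self.pose_head(h),
                            self.tex_head(h)))
        return objects
```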