SimpleRecon: 3D Reconstruction Without 3D Convolutions
- URL: http://arxiv.org/abs/2208.14743v1
- Date: Wed, 31 Aug 2022 09:46:34 GMT
- Title: SimpleRecon: 3D Reconstruction Without 3D Convolutions
- Authors: Mohamed Sayed, John Gibson, Jamie Watson, Victor Prisacariu, Michael Firman, Clément Godard
- Abstract summary: We show how focusing on high quality multi-view depth prediction leads to highly accurate 3D reconstructions using simple off-the-shelf depth fusion.
Our method achieves a significant lead over the current state of the art for depth estimation, and is close to or better for 3D reconstruction on ScanNet and 7-Scenes.
- Score: 21.952478592241
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Traditionally, 3D indoor scene reconstruction from posed images happens in
two phases: per-image depth estimation, followed by depth merging and surface
reconstruction. Recently, a family of methods has emerged that performs
reconstruction directly in a final 3D volumetric feature space. While these
methods have shown impressive reconstruction results, they rely on expensive 3D
convolutional layers, limiting their application in resource-constrained
environments. In this work, we instead go back to the traditional route, and
show how focusing on high quality multi-view depth prediction leads to highly
accurate 3D reconstructions using simple off-the-shelf depth fusion. We propose
a simple state-of-the-art multi-view depth estimator with two main
contributions: 1) a carefully-designed 2D CNN which utilizes strong image
priors alongside a plane-sweep feature volume and geometric losses, combined
with 2) the integration of keyframe and geometric metadata into the cost volume
which allows informed depth plane scoring. Our method achieves a significant
lead over the current state of the art for depth estimation, and is close to or
better for 3D reconstruction on ScanNet and 7-Scenes, yet still allows for online
real-time low-memory reconstruction. Code, models and results are available at
https://nianticlabs.github.io/simplerecon
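To make the second contribution more concrete, below is a minimal PyTorch sketch of a plane-sweep cost volume with simple per-plane metadata channels appended (here only the matching score and the plane depth). The tensor shapes, the helper names warp_src_to_ref and build_cost_volume, and the choice of metadata channels are illustrative assumptions; the paper integrates richer keyframe and geometric metadata and uses a carefully designed 2D CNN on top, so this is a sketch of the idea rather than the authors' implementation.

```python
# Minimal sketch (PyTorch) of a metadata-augmented plane-sweep cost volume.
# Shapes, helper names, and the metadata channels are illustrative assumptions.
import torch
import torch.nn.functional as F


def warp_src_to_ref(src_feat, K, R, t, depth, H, W):
    """Warp source-view features into the reference view at one fronto-parallel depth plane.

    src_feat: (B, C, H, W) source image features
    K:        (B, 3, 3) shared pinhole intrinsics
    R, t:     (B, 3, 3), (B, 3, 1) relative pose mapping reference coords -> source coords
    depth:    scalar depth hypothesis for the plane
    """
    B = src_feat.shape[0]
    device = src_feat.device

    # Pixel grid in the reference view, homogeneous coordinates (3, H*W).
    ys, xs = torch.meshgrid(
        torch.arange(H, device=device, dtype=torch.float32),
        torch.arange(W, device=device, dtype=torch.float32),
        indexing="ij",
    )
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).reshape(3, -1)

    # Back-project to 3D at the hypothesised depth, move to the source frame,
    # and re-project with the source intrinsics.
    cam_pts = torch.linalg.inv(K) @ pix.unsqueeze(0) * depth        # (B, 3, H*W)
    src_pts = R @ cam_pts + t                                        # (B, 3, H*W)
    src_pix = K @ src_pts
    src_pix = src_pix[:, :2] / src_pix[:, 2:3].clamp(min=1e-6)       # (B, 2, H*W)

    # Normalise to [-1, 1] and sample the source features.
    gx = 2.0 * src_pix[:, 0] / (W - 1) - 1.0
    gy = 2.0 * src_pix[:, 1] / (H - 1) - 1.0
    grid = torch.stack([gx, gy], dim=-1).reshape(B, H, W, 2)
    return F.grid_sample(src_feat, grid, align_corners=True, padding_mode="zeros")


def build_cost_volume(ref_feat, src_feat, K, R, t, depth_planes):
    """Stack, per depth plane, a matching score plus simple metadata channels."""
    B, C, H, W = ref_feat.shape
    slices = []
    for d in depth_planes:
        warped = warp_src_to_ref(src_feat, K, R, t, d, H, W)
        dot = (ref_feat * warped).sum(dim=1, keepdim=True) / C ** 0.5   # matching score
        depth_chan = torch.full((B, 1, H, W), float(d), device=ref_feat.device)
        # Metadata here is just the score and the plane depth; the paper reports
        # injecting richer keyframe and geometric metadata at this point.
        slices.append(torch.cat([dot, depth_chan], dim=1))
    return torch.stack(slices, dim=2)  # (B, channels, D, H, W) grid of scores + metadata


if __name__ == "__main__":
    B, C, H, W = 1, 16, 48, 64
    ref, src = torch.randn(B, C, H, W), torch.randn(B, C, H, W)
    K = torch.eye(3).unsqueeze(0).repeat(B, 1, 1)
    K[:, 0, 0] = 50.0
    K[:, 1, 1] = 50.0
    K[:, 0, 2] = W / 2
    K[:, 1, 2] = H / 2
    R = torch.eye(3).unsqueeze(0).repeat(B, 1, 1)
    t = torch.tensor([[0.1, 0.0, 0.0]]).reshape(B, 3, 1)
    cv = build_cost_volume(ref, src, K, R, t, depth_planes=[0.5, 1.0, 2.0, 4.0])
    print(cv.shape)  # torch.Size([1, 2, 4, 48, 64])
```

Per the abstract's two-phase route, the depth maps predicted from such a volume would then be merged by an off-the-shelf fusion step (e.g. standard TSDF fusion) to obtain the final surface.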
Related papers
- Lightplane: Highly-Scalable Components for Neural 3D Fields [54.59244949629677]
Lightplane Render and Splatter significantly reduce memory usage in 2D-3D mapping.
These innovations enable the processing of vastly more and higher resolution images with small memory and computational costs.
arXiv Detail & Related papers (2024-04-30T17:59:51Z)
- R3D3: Dense 3D Reconstruction of Dynamic Scenes from Multiple Cameras [106.52409577316389]
R3D3 is a multi-camera system for dense 3D reconstruction and ego-motion estimation.
Our approach exploits spatio-temporal information from multiple cameras together with monocular depth refinement.
We show that this design enables a dense, consistent 3D reconstruction of challenging, dynamic outdoor environments.
arXiv Detail & Related papers (2023-08-28T17:13:49Z)
- Neural 3D Scene Reconstruction from Multiple 2D Images without 3D Supervision [41.20504333318276]
We propose a novel neural reconstruction method that reconstructs scenes using sparse depth under plane constraints, without 3D supervision.
We introduce a signed distance function field, a color field, and a probability field to represent a scene.
We optimize these fields to reconstruct the scene by using differentiable ray marching with accessible 2D images as supervision.
arXiv Detail & Related papers (2023-06-30T13:30:48Z)
- CVRecon: Rethinking 3D Geometric Feature Learning For Neural Reconstruction [12.53249207602695]
We propose an end-to-end 3D neural reconstruction framework CVRecon.
We exploit the rich geometric embedding in the cost volumes to facilitate 3D geometric feature learning.
arXiv Detail & Related papers (2023-04-28T05:30:19Z)
- Neural 3D Scene Reconstruction with the Manhattan-world Assumption [58.90559966227361]
This paper addresses the challenge of reconstructing 3D indoor scenes from multi-view images.
Planar constraints can be conveniently integrated into recent implicit neural representation-based reconstruction methods.
The proposed method outperforms previous methods by a large margin on 3D reconstruction quality.
arXiv Detail & Related papers (2022-05-05T17:59:55Z)
- Learning to Recover 3D Scene Shape from a Single Image [98.20106822614392]
We propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image.
We then use 3D point cloud encoders to predict the missing depth shift and focal length that allow us to recover a realistic 3D scene shape (a minimal unprojection sketch is given after this list).
arXiv Detail & Related papers (2020-12-17T02:35:13Z)
- DI-Fusion: Online Implicit 3D Reconstruction with Deep Priors [37.01774224029594]
DI-Fusion is based on a novel 3D representation, i.e., Probabilistic Local Implicit Voxels (PLIVoxs).
PLIVox encodes scene priors considering both the local geometry and uncertainty parameterized by a deep neural network.
We are able to perform online implicit 3D reconstruction achieving state-of-the-art camera trajectory estimation accuracy and mapping quality.
arXiv Detail & Related papers (2020-12-10T09:46:35Z)
- Improved Modeling of 3D Shapes with Multi-view Depth Maps [48.8309897766904]
We present a general-purpose framework for modeling 3D shapes using CNNs.
Using just a single depth image of the object, we can output a dense multi-view depth map representation of 3D objects.
arXiv Detail & Related papers (2020-09-07T17:58:27Z)
- Learning to Detect 3D Reflection Symmetry for Single-View Reconstruction [32.14605731030579]
3D reconstruction from a single RGB image is a challenging problem in computer vision.
Previous methods are usually solely data-driven, which leads to inaccurate 3D shape recovery and limited generalization capability.
We present a geometry-based end-to-end deep learning framework that first detects the mirror plane of reflection symmetry that commonly exists in man-made objects and then predicts depth maps by finding the intra-image pixel-wise correspondence of the symmetry.
arXiv Detail & Related papers (2020-06-17T17:58:59Z)
- Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion [53.885984328273686]
Implicit Feature Networks (IF-Nets) deliver continuous outputs, can handle multiple topologies, and complete shapes for missing or sparse input data.
IF-Nets clearly outperform prior work in 3D object reconstruction on ShapeNet, and obtain significantly more accurate 3D human reconstructions.
arXiv Detail & Related papers (2020-03-03T11:14:29Z)
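For the single-image scene-shape entry above, the final geometric step can be illustrated with a short sketch: an affine-invariant depth prediction is corrected with a predicted shift and unprojected with a predicted focal length under a simple pinhole model. The function name recover_point_cloud and the centred principal point are assumptions for illustration, not that paper's code.

```python
# Minimal sketch of the geometric step behind the two-stage scene-shape recovery:
# correct the predicted depth with an estimated shift, then unproject it with an
# estimated focal length. Names and the pinhole model are illustrative assumptions.
import numpy as np


def recover_point_cloud(depth_pred, shift, focal_length):
    """Unproject a shift-corrected depth map into a camera-space point cloud.

    depth_pred:   (H, W) depth predicted up to an unknown shift
    shift:        scalar shift estimated by the point-cloud module
    focal_length: scalar focal length in pixels estimated by the same module
    Returns an (H*W, 3) array of 3D points.
    """
    H, W = depth_pred.shape
    depth = depth_pred + shift                      # undo the unknown shift
    cx, cy = (W - 1) / 2.0, (H - 1) / 2.0           # assume a centred principal point

    us, vs = np.meshgrid(np.arange(W), np.arange(H))
    x = (us - cx) * depth / focal_length            # pinhole back-projection
    y = (vs - cy) * depth / focal_length
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = rng.uniform(0.5, 2.0, size=(120, 160))      # toy relative depth map
    pts = recover_point_cloud(d, shift=0.3, focal_length=200.0)
    print(pts.shape)  # (19200, 3)
```

Intuitively, a wrong shift or focal length distorts the unprojected cloud, which is the kind of signal a 3D point cloud encoder can exploit to estimate both quantities.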
This list is automatically generated from the titles and abstracts of the papers on this site.