DI-Fusion: Online Implicit 3D Reconstruction with Deep Priors
- URL: http://arxiv.org/abs/2012.05551v2
- Date: Fri, 26 Mar 2021 14:51:43 GMT
- Title: DI-Fusion: Online Implicit 3D Reconstruction with Deep Priors
- Authors: Jiahui Huang, Shi-Sheng Huang, Haoxuan Song, Shi-Min Hu
- Abstract summary: DI-Fusion is based on a novel 3D representation, i.e. Probabilistic Local Implicit Voxels (PLIVoxs)
PLIVox encodes scene priors considering both the local geometry and uncertainty parameterized by a deep neural network.
We are able to perform online implicit 3D reconstruction achieving state-of-the-art camera trajectory estimation accuracy and mapping quality.
- Score: 37.01774224029594
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Previous online 3D dense reconstruction methods struggle to achieve the
balance between memory storage and surface quality, largely due to the usage of
stagnant underlying geometry representation, such as TSDF (truncated signed
distance functions) or surfels, without any knowledge of the scene priors. In
this paper, we present DI-Fusion (Deep Implicit Fusion), based on a novel 3D
representation, i.e. Probabilistic Local Implicit Voxels (PLIVoxs), for online
3D reconstruction with a commodity RGB-D camera. Our PLIVox encodes scene
priors considering both the local geometry and uncertainty parameterized by a
deep neural network. With such deep priors, we are able to perform online
implicit 3D reconstruction achieving state-of-the-art camera trajectory
estimation accuracy and mapping quality, while achieving better storage
efficiency compared with previous online 3D reconstruction approaches. Our
implementation is available at https://www.github.com/huangjh-pub/di-fusion.
Related papers
- DoubleTake: Geometry Guided Depth Estimation [17.464549832122714]
Estimating depth from a sequence of posed RGB images is a fundamental computer vision task.
We introduce a reconstruction which combines volume features with a hint of the prior geometry, rendered as a depth map from the current camera location.
We demonstrate that our method can run at interactive speeds, state-of-the-art estimates of depth and 3D scene in both offline and incremental evaluation scenarios.
arXiv Detail & Related papers (2024-06-26T14:29:05Z) - GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation [65.33726478659304]
We introduce the Geometry-Aware Large Reconstruction Model (GeoLRM), an approach which can predict high-quality assets with 512k Gaussians and 21 input images in only 11 GB GPU memory.
Previous works neglect the inherent sparsity of 3D structure and do not utilize explicit geometric relationships between 3D and 2D images.
GeoLRM tackles these issues by incorporating a novel 3D-aware transformer structure that directly processes 3D points and uses deformable cross-attention mechanisms.
arXiv Detail & Related papers (2024-06-21T17:49:31Z) - FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models [67.96827539201071]
We propose a novel test-time optimization approach for 3D scene reconstruction.
Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
arXiv Detail & Related papers (2023-08-10T17:55:02Z) - FineRecon: Depth-aware Feed-forward Network for Detailed 3D
Reconstruction [13.157400338544177]
Recent works on 3D reconstruction from posed images have demonstrated that direct inference of scene-level 3D geometry is feasible using deep neural networks.
We propose three effective solutions for improving the fidelity of inference-based 3D reconstructions.
Our method, FineRecon, produces smooth and highly accurate reconstructions, showing significant improvements across multiple depth and 3D reconstruction metrics.
arXiv Detail & Related papers (2023-04-04T02:50:29Z) - SimpleRecon: 3D Reconstruction Without 3D Convolutions [21.952478592241]
We show how focusing on high quality multi-view depth prediction leads to highly accurate 3D reconstructions using simple off-the-shelf depth fusion.
Our method achieves a significant lead over the current state-of-the-art for depth estimation and close or better for 3D reconstruction on ScanNet and 7-Scenes.
arXiv Detail & Related papers (2022-08-31T09:46:34Z) - Neural 3D Scene Reconstruction with the Manhattan-world Assumption [58.90559966227361]
This paper addresses the challenge of reconstructing 3D indoor scenes from multi-view images.
Planar constraints can be conveniently integrated into the recent implicit neural representation-based reconstruction methods.
The proposed method outperforms previous methods by a large margin on 3D reconstruction quality.
arXiv Detail & Related papers (2022-05-05T17:59:55Z) - Deep Hybrid Self-Prior for Full 3D Mesh Generation [57.78562932397173]
We propose to exploit a novel hybrid 2D-3D self-prior in deep neural networks to significantly improve the geometry quality.
In particular, we first generate an initial mesh using a 3D convolutional neural network with 3D self-prior, and then encode both 3D information and color information in the 2D UV atlas.
Our method recovers the 3D textured mesh model of high quality from sparse input, and outperforms the state-of-the-art methods in terms of both the geometry and texture quality.
arXiv Detail & Related papers (2021-08-18T07:44:21Z) - 3D Human Mesh Regression with Dense Correspondence [95.92326689172877]
Estimating 3D mesh of the human body from a single 2D image is an important task with many applications such as augmented reality and Human-Robot interaction.
Prior works reconstructed 3D mesh from global image feature extracted by using convolutional neural network (CNN), where the dense correspondences between the mesh surface and the image pixels are missing.
This paper proposes a model-free 3D human mesh estimation framework, named DecoMR, which explicitly establishes the dense correspondence between the mesh and the local image features in the UV space.
arXiv Detail & Related papers (2020-06-10T08:50:53Z) - Atlas: End-to-End 3D Scene Reconstruction from Posed Images [13.154808583020229]
We present an end-to-end 3D reconstruction method for a scene by directly regressing a truncated signed distance function (TSDF) from a set of posed RGB images.
A 2D CNN extracts features from each image independently which are then back-projected and accumulated into a voxel volume.
A 3D CNN refines the accumulated features and predicts the TSDF values.
arXiv Detail & Related papers (2020-03-23T17:59:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.