NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video
- URL: http://arxiv.org/abs/2104.00681v1
- Date: Thu, 1 Apr 2021 17:59:46 GMT
- Title: NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video
- Authors: Jiaming Sun, Yiming Xie, Linghao Chen, Xiaowei Zhou, Hujun Bao
- Abstract summary: We propose to reconstruct local surfaces represented as sparse TSDF volumes for each video fragment sequentially by a neural network.
A learning-based TSDF fusion module is used to guide the network to fuse features from previous fragments.
Experiments on the ScanNet and 7-Scenes datasets show that our system outperforms state-of-the-art methods in both accuracy and speed.
- Score: 41.554961144321474
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a novel framework named NeuralRecon for real-time 3D scene
reconstruction from a monocular video. Unlike previous methods that estimate
single-view depth maps separately on each key-frame and fuse them later, we
propose to directly reconstruct local surfaces represented as sparse TSDF
volumes for each video fragment sequentially by a neural network. A
learning-based TSDF fusion module based on gated recurrent units is used to
guide the network to fuse features from previous fragments. This design allows
the network to capture the local smoothness prior and the global shape prior of
3D surfaces as it reconstructs them sequentially, resulting in accurate,
coherent, and real-time surface reconstruction. Experiments on the ScanNet and
7-Scenes datasets show that our system outperforms state-of-the-art methods in
terms of both accuracy and speed. To the best of our knowledge, this is the
first learning-based system that is able to reconstruct dense coherent 3D
geometry in real-time.
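To make the fragment-wise design concrete, below is a minimal PyTorch sketch of GRU-based TSDF fusion. It is an illustration under assumptions, not the authors' implementation: dense tensors stand in for NeuralRecon's sparse TSDF volumes, and the `ConvGRUCell3D`/`FragmentFusion` classes, layer sizes, and names are hypothetical.

```python
# A minimal sketch of GRU-based TSDF fusion (assumptions: dense tensors
# stand in for NeuralRecon's sparse TSDF volumes; sizes are illustrative).
import torch
import torch.nn as nn

class ConvGRUCell3D(nn.Module):
    """Convolutional GRU cell over a 3D feature volume."""
    def __init__(self, feat_dim: int, hidden_dim: int):
        super().__init__()
        self.gates = nn.Conv3d(feat_dim + hidden_dim, 2 * hidden_dim, 3, padding=1)
        self.cand = nn.Conv3d(feat_dim + hidden_dim, hidden_dim, 3, padding=1)

    def forward(self, x, h):
        z, r = torch.sigmoid(self.gates(torch.cat([x, h], 1))).chunk(2, 1)
        h_tilde = torch.tanh(self.cand(torch.cat([x, r * h], 1)))
        return (1 - z) * h + z * h_tilde          # gated update of the state

class FragmentFusion(nn.Module):
    """Fuses per-fragment features into a persistent volume, then
    regresses one TSDF value per voxel."""
    def __init__(self, feat_dim=32, hidden_dim=32):
        super().__init__()
        self.gru = ConvGRUCell3D(feat_dim, hidden_dim)
        self.tsdf_head = nn.Conv3d(hidden_dim, 1, 1)

    def forward(self, frag_feat, hidden):
        hidden = self.gru(frag_feat, hidden)
        return torch.tanh(self.tsdf_head(hidden)), hidden  # TSDF in [-1, 1]

# Sequentially fuse three fragments into one coherent volume.
fusion = FragmentFusion()
hidden = torch.zeros(1, 32, 24, 24, 24)           # running global state
for _ in range(3):
    frag_feat = torch.randn(1, 32, 24, 24, 24)    # back-projected image features
    tsdf, hidden = fusion(frag_feat, hidden)
print(tsdf.shape)                                 # torch.Size([1, 1, 24, 24, 24])
```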
Related papers
- SeMLaPS: Real-time Semantic Mapping with Latent Prior Networks and Quasi-Planar Segmentation [53.83313235792596]
We present a new methodology for real-time semantic mapping from RGB-D sequences.
It combines a 2D neural network and a 3D network built on a SLAM system with 3D occupancy mapping.
Our system achieves state-of-the-art semantic mapping quality among 2D-3D network-based systems.
arXiv Detail & Related papers (2023-06-28T22:36:44Z)
- SimpleMapping: Real-Time Visual-Inertial Dense Mapping with Deep Multi-View Stereo [13.535871843518953]
We present a high-quality, real-time visual-inertial dense mapping method that uses only monocular images and IMU readings.
We propose a sparse-point-aided stereo neural network (SPA-MVSNet) that can effectively leverage the informative but noisy sparse points from the VIO system (a toy projection sketch follows this entry).
Our proposed dense mapping system achieves a 39.7% improvement in F-score over existing systems when evaluated on the challenging scenarios of the EuRoC dataset.
arXiv Detail & Related papers (2023-06-14T17:28:45Z)
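A hedged sketch of the "sparse point aided" idea in the SimpleMapping entry above: project noisy VIO landmarks into the reference view to form a sparse depth prior that a stereo network could take as an extra input channel. The pinhole model and the `sparse_depth_prior` helper are assumptions for illustration, not SPA-MVSNet's actual interface.

```python
# Hypothetical sketch: splat noisy VIO landmarks into the reference view
# to build a sparse depth prior (zero where no landmark projects).
import torch

def sparse_depth_prior(points_cam, K, height, width):
    """points_cam: (N, 3) landmarks in the camera frame; K: (3, 3) intrinsics."""
    p = points_cam[points_cam[:, 2] > 1e-3]           # keep points in front
    uv = (K @ p.T).T                                  # pinhole projection
    u = (uv[:, 0] / uv[:, 2]).long()
    v = (uv[:, 1] / uv[:, 2]).long()
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    prior = torch.zeros(height, width)
    prior[v[inside], u[inside]] = p[inside, 2]        # splat depth values
    return prior

K = torch.tensor([[320., 0., 320.], [0., 320., 240.], [0., 0., 1.]])
pts = torch.randn(500, 3) * 2 + torch.tensor([0., 0., 4.])
print((sparse_depth_prior(pts, K, 480, 640) > 0).sum())  # count of landmark hits
```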
- FineRecon: Depth-aware Feed-forward Network for Detailed 3D Reconstruction [13.157400338544177]
Recent works on 3D reconstruction from posed images have demonstrated that direct inference of scene-level 3D geometry is feasible using deep neural networks.
We propose three effective solutions for improving the fidelity of inference-based 3D reconstructions.
Our method, FineRecon, produces smooth and highly accurate reconstructions, showing significant improvements across multiple depth and 3D reconstruction metrics.
arXiv Detail & Related papers (2023-04-04T02:50:29Z)
- VolRecon: Volume Rendering of Signed Ray Distance Functions for Generalizable Multi-View Reconstruction [64.09702079593372]
VolRecon is a novel generalizable implicit reconstruction method based on the Signed Ray Distance Function (SRDF); a toy rendering sketch follows this entry.
On the DTU dataset, VolRecon outperforms SparseNeuS by about 30% in sparse-view reconstruction and achieves accuracy comparable to MVSNet in full-view reconstruction.
arXiv Detail & Related papers (2022-12-15T18:59:54Z)
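A hedged sketch of rendering depth from signed ray distances, as in the VolRecon entry above: SRDF samples along a ray are mapped through a logistic CDF to per-sample opacity and alpha-composited (a NeuS-style weighting; the fixed sharpness and the analytic toy SRDF are assumptions, not VolRecon's learned model).

```python
# Render a depth value from SRDF samples along one ray: logistic CDF of
# the signed distance -> discrete opacity -> alpha compositing.
import torch

def render_depth_from_srdf(srdf, t_vals, sharpness=50.0):
    """srdf: (S,) signed distances at ray depths t_vals: (S,)."""
    cdf = torch.sigmoid(sharpness * srdf)             # ~1 in front, ~0 behind
    alpha = ((cdf[:-1] - cdf[1:]) / (cdf[:-1] + 1e-6)).clamp(0.0, 1.0)
    trans = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alpha]), 0)[:-1]
    weights = trans * alpha                           # peaks at the surface
    return (weights * t_vals[:-1]).sum() / (weights.sum() + 1e-6)

t = torch.linspace(0.5, 3.0, 64)
srdf = 1.8 - t                                        # toy surface at depth 1.8
print(render_depth_from_srdf(srdf, t))                # ~1.8
```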
- Single-view 3D Mesh Reconstruction for Seen and Unseen Categories [69.29406107513621]
Single-view 3D Mesh Reconstruction is a fundamental computer vision task that aims at recovering 3D shapes from single-view RGB images.
This paper tackles single-view 3D mesh reconstruction, studying how models generalize to unseen categories.
We propose an end-to-end two-stage network, GenMesh, to break the category boundaries in reconstruction.
arXiv Detail & Related papers (2022-08-04T14:13:35Z)
- PlanarRecon: Real-time 3D Plane Detection and Reconstruction from Posed Monocular Videos [32.286637700503995]
PlanarRecon is a framework for globally coherent detection and reconstruction of 3D planes from a posed monocular video.
A learning-based tracking and fusion module is designed to merge planes from previous fragments to form a coherent global plane reconstruction (a toy merging sketch follows this entry).
Experiments show that the proposed approach achieves state-of-the-art performance on the ScanNet dataset while running in real time.
arXiv Detail & Related papers (2022-06-15T17:59:16Z)
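A toy sketch of the plane tracking-and-fusion idea in the PlanarRecon entry above: greedily match new plane parameters (unit normal n, offset d) to tracked planes by normal and offset proximity, then average matched pairs. The thresholds and the averaging rule are assumptions; PlanarRecon's actual module is learned.

```python
# Greedy cross-fragment plane merging over [nx, ny, nz, d] parameters.
import torch

def merge_planes(tracked, new, cos_thresh=0.95, d_thresh=0.1):
    """tracked: (N, 4) and new: (M, 4) rows of [nx, ny, nz, d]."""
    out = tracked.clone()
    used = torch.zeros(len(tracked), dtype=torch.bool)
    for plane in new:
        cos = tracked[:, :3] @ plane[:3]              # normal agreement
        dd = (tracked[:, 3] - plane[3]).abs()         # offset proximity
        cand = (cos > cos_thresh) & (dd < d_thresh) & ~used
        if cand.any():
            i = int(torch.nonzero(cand)[0])
            out[i] = 0.5 * (out[i] + plane)           # fuse parameters
            out[i, :3] /= out[i, :3].norm()           # renormalize normal
            used[i] = True
        else:
            out = torch.cat([out, plane[None]])       # start a new track
    return out

tracked = torch.tensor([[0., 0., 1., 2.0]])           # one tracked floor plane
new = torch.tensor([[0.01, 0., 1.0, 2.05], [1., 0., 0., 0.5]])
new[:, :3] /= new[:, :3].norm(dim=-1, keepdim=True)
print(merge_planes(tracked, new))                     # fused floor + new wall
```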
- Learnable Triangulation for Deep Learning-based 3D Reconstruction of Objects of Arbitrary Topology from Single RGB Images [12.693545159861857]
We propose a novel deep reinforcement learning-based approach for 3D object reconstruction from monocular images.
The proposed method outperforms the state-of-the-art in terms of visual quality, reconstruction accuracy, and computational time.
arXiv Detail & Related papers (2021-09-24T09:44:22Z)
- Fast-GANFIT: Generative Adversarial Network for High Fidelity 3D Face Reconstruction [76.1612334630256]
We harness the power of Generative Adversarial Networks (GANs) and Deep Convolutional Neural Networks (DCNNs) to reconstruct the facial texture and shape from single images.
We demonstrate excellent results in photorealistic and identity-preserving 3D face reconstruction and achieve, for the first time, facial texture reconstruction with high-frequency details.
arXiv Detail & Related papers (2021-05-16T16:35:44Z)
- Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes [77.6741486264257]
We introduce an efficient neural representation that, for the first time, enables real-time rendering of high-fidelity neural SDFs.
We show that our representation renders 2-3 orders of magnitude faster than previous works (a minimal sphere-tracing sketch follows this entry).
arXiv Detail & Related papers (2021-01-26T18:50:22Z)
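For context on real-time SDF rendering as in the entry above, here is a minimal sphere-tracing sketch: each ray marches forward by the current SDF value until it reaches the surface. An analytic sphere SDF stands in for the paper's neural SDF, and the step count and tolerances are illustrative assumptions.

```python
# Sphere tracing: the SDF value at the current point is a safe step size,
# so each ray advances by it until the distance falls below a tolerance.
import torch

def sdf_sphere(p, radius=1.0):
    return p.norm(dim=-1) - radius                    # signed distance to sphere

def sphere_trace(origins, dirs, sdf, n_steps=64, eps=1e-4):
    t = torch.zeros(origins.shape[0])                 # distance marched per ray
    for _ in range(n_steps):
        d = sdf(origins + t[:, None] * dirs)          # safe step = SDF value
        t = t + d.clamp(min=0.0)
        if (d.abs() < eps).all():
            break
    hit = sdf(origins + t[:, None] * dirs).abs() < 1e-2
    return t, hit

origins = torch.tensor([[0.0, 0.0, -3.0]]).repeat(4, 1)
dirs = torch.nn.functional.normalize(
    torch.tensor([[0.0, 0.0, 1.0], [0.1, 0.0, 1.0],
                  [0.0, 0.3, 1.0], [1.0, 0.0, 0.2]]), dim=-1)
print(sphere_trace(origins, dirs, sdf_sphere))        # first 3 rays hit, 4th misses
```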
This list is automatically generated from the titles and abstracts of the papers on this site.