Fast Monocular Scene Reconstruction with Global-Sparse Local-Dense Grids
- URL: http://arxiv.org/abs/2305.13220v1
- Date: Mon, 22 May 2023 16:50:19 GMT
- Title: Fast Monocular Scene Reconstruction with Global-Sparse Local-Dense Grids
- Authors: Wei Dong, Chris Choy, Charles Loop, Or Litany, Yuke Zhu, Anima
Anandkumar
- Abstract summary: We propose to directly use signed distance functions (SDF) in sparse voxel block grids for fast and accurate scene reconstruction without MLPs.
Our globally sparse and locally dense data structure exploits surfaces' spatial sparsity, enables cache-friendly queries, and allows direct extensions to multi-modal data.
Experiments show that our approach is 10x faster in training and 100x faster in rendering while achieving comparable accuracy to state-of-the-art neural implicit methods.
- Score: 84.90863397388776
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Indoor scene reconstruction from monocular images has long been sought after
by augmented reality and robotics developers. Recent advances in neural field
representations and monocular priors have led to remarkable results in
scene-level surface reconstructions. The reliance on Multilayer Perceptrons
(MLP), however, significantly limits speed in training and rendering. In this
work, we propose to directly use signed distance function (SDF) in sparse voxel
block grids for fast and accurate scene reconstruction without MLPs. Our
globally sparse and locally dense data structure exploits surfaces' spatial
sparsity, enables cache-friendly queries, and allows direct extensions to
multi-modal data such as color and semantic labels. To apply this
representation to monocular scene reconstruction, we develop a scale
calibration algorithm for fast geometric initialization from monocular depth
priors. We apply differentiable volume rendering from this initialization to
refine details with fast convergence. We also introduce efficient
high-dimensional Continuous Random Fields (CRFs) to further exploit the
semantic-geometry consistency between scene objects. Experiments show that our
approach is 10x faster in training and 100x faster in rendering while achieving
comparable accuracy to state-of-the-art neural implicit methods.
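
The "globally sparse, locally dense" structure described in the abstract can be pictured as a hash map from coarse block coordinates to small contiguous voxel arrays: blocks are allocated sparsely around surfaces, while the dense interior of each block keeps neighbor lookups cache-friendly. The sketch below is a minimal NumPy illustration of that layout; the block size, voxel size, and nearest-voxel query are illustrative assumptions, not the paper's implementation (which also carries multi-modal data such as color and semantic labels).

```python
import numpy as np

BLOCK = 8      # assumed 8^3 voxels per sparse block (illustrative choice)
VOXEL = 0.04   # assumed voxel size in meters (illustrative choice)

class SparseDenseGrid:
    """Globally sparse (hash map of blocks), locally dense (contiguous arrays)."""

    def __init__(self):
        self.blocks = {}  # (bx, by, bz) -> dense (BLOCK, BLOCK, BLOCK) SDF array

    def allocate(self, block_key):
        # Dense per-block storage keeps nearby queries cache-friendly.
        if block_key not in self.blocks:
            # Initialize to the truncation value (far from any surface).
            self.blocks[block_key] = np.ones((BLOCK, BLOCK, BLOCK), np.float32)
        return self.blocks[block_key]

    def sdf(self, p):
        """Nearest-voxel SDF query for a 3D point p (meters)."""
        v = np.floor(np.asarray(p) / VOXEL).astype(int)  # global voxel index
        key = tuple(v // BLOCK)                          # which sparse block
        local = tuple(v % BLOCK)                         # dense offset inside it
        block = self.blocks.get(key)
        return None if block is None else block[local]

grid = SparseDenseGrid()
grid.allocate((0, 0, 0))[2, 3, 4] = -0.01   # a surface just inside this voxel
p = (np.array([2, 3, 4]) + 0.5) * VOXEL     # center of voxel (2, 3, 4)
print(grid.sdf(p))                          # -0.01
```

Queries outside allocated blocks return None here, which mirrors why spatial sparsity pays off: empty space costs neither memory nor traversal.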
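The scale calibration step addresses the fact that monocular depth priors are defined only up to an unknown scale (and often a shift), so they cannot initialize a metric SDF grid directly. A common remedy, shown below as a hedged sketch rather than the paper's exact algorithm, is a least-squares scale-and-shift fit against whatever sparse metric depth is available (e.g., SfM points).

```python
import numpy as np

def calibrate_depth(prior, metric, mask):
    """Least-squares fit of metric ~ s * prior + t over pixels where mask holds.

    prior:  dense monocular depth prediction (H, W), unknown scale/shift
    metric: sparse metric depth, e.g. from SfM points (H, W)
    mask:   bool (H, W), True where metric depth is available
    """
    x, y = prior[mask], metric[mask]
    A = np.stack([x, np.ones_like(x)], axis=1)      # design matrix [prior, 1]
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    return s * prior + t                            # calibrated dense depth

# Toy check: the prior is half the true depth, shifted by -0.2.
true = np.linspace(1.0, 3.0, 16).reshape(4, 4)
prior = 0.5 * true - 0.2
mask = np.zeros((4, 4), bool)
mask[::2, ::2] = True                               # sparse supervision
print(np.allclose(calibrate_depth(prior, true, mask), true))  # True
```

Once the prior is calibrated, every pixel contributes a metric depth estimate, which is what enables fast geometric initialization of the grid before any rendering-based optimization.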
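For the refinement stage, differentiable volume rendering ties the SDF grid to image observations. The abstract does not spell out the formulation, so the sketch below assumes a common VolSDF-style mapping from signed distance to density (a Laplace CDF with sharpness beta) followed by standard alpha compositing along a ray; treat it as a generic example, not the paper's method.

```python
import numpy as np

def render_depth(sdf, t_vals, beta=0.05):
    """Alpha-composite expected depth along one ray from SDF samples.

    sdf:    (N,) signed distances at the sample points
    t_vals: (N,) ray depths of the samples
    beta:   sharpness of the SDF-to-density mapping (assumed Laplace CDF)
    """
    # Density rises as the SDF crosses zero from outside (+) to inside (-).
    density = np.where(sdf > 0,
                       0.5 * np.exp(-sdf / beta),
                       1.0 - 0.5 * np.exp(sdf / beta)) / beta
    delta = np.diff(t_vals, append=t_vals[-1])       # sample spacing
    alpha = 1.0 - np.exp(-density * delta)           # per-sample opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = trans * alpha                          # rendering weights
    return np.sum(weights * t_vals)                  # expected ray depth

t = np.linspace(0.0, 2.0, 128)
sdf = 1.0 - t                 # a flat wall at depth 1.0 along the ray
print(render_depth(sdf, t))   # close to 1.0
```

Because every step is differentiable, gradients from photometric or depth losses flow back into the SDF samples, which is what lets the coarse initialization be refined with fast convergence.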
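Finally, the high-dimensional CRF can be read as a smoothing prior over per-voxel label distributions, with pairwise potentials defined in a feature space that mixes geometry and appearance. The naive O(N^2) mean-field update below only shows the structure of such an update; an efficient variant like the paper's would replace the dense kernel with a fast high-dimensional filter, and all names here are illustrative.

```python
import numpy as np

def mean_field_step(unary, feats, sigma=1.0, w=1.0):
    """One naive mean-field update of a dense CRF over N voxels, L labels.

    unary: (N, L) negative log unary potentials (e.g., from a semantic head)
    feats: (N, D) per-voxel features (position, color, normal, ...)
    """
    q = np.exp(-unary)
    q /= q.sum(1, keepdims=True)                   # initial label marginals
    # Gaussian pairwise kernel in feature space; O(N^2) for illustration only.
    d2 = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(-1)
    k = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(k, 0.0)                       # no message to self
    # Potts compatibility: penalize disagreeing with similar neighbors.
    pairwise = k.sum(1, keepdims=True) - k @ q
    q = np.exp(-unary - w * pairwise)
    return q / q.sum(1, keepdims=True)

# Three voxels with near-identical features; the middle one has an
# outlier unary, which one update softens toward its neighbors.
unary = np.array([[0.1, 2.0], [2.0, 0.1], [0.2, 1.5]])
feats = np.array([[0.00], [0.05], [0.10]])
print(mean_field_step(unary, feats).round(2))
```

This is the sense in which the CRF exploits semantic-geometry consistency: voxels that are close in the joint feature space are encouraged to agree on their labels.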
Related papers
- NeuV-SLAM: Fast Neural Multiresolution Voxel Optimization for RGBD Dense
SLAM [5.709880146357355]
We introduce NeuV-SLAM, a novel simultaneous localization and mapping pipeline based on neural multiresolution voxels.
NeuV-SLAM is characterized by ultra-fast convergence and incremental expansion capabilities.
arXiv Detail & Related papers (2024-02-03T04:26:35Z)
- DNS SLAM: Dense Neural Semantic-Informed SLAM [92.39687553022605]
DNS SLAM is a novel neural RGB-D semantic SLAM approach featuring a hybrid representation.
Our method integrates multi-view geometry constraints with image-based feature extraction to improve appearance details.
Our method achieves state-of-the-art tracking performance on both synthetic and real-world data.
arXiv Detail & Related papers (2023-11-30T21:34:44Z)
- HI-SLAM: Monocular Real-time Dense Mapping with Hybrid Implicit Fields [11.627951040865568]
Recent neural mapping frameworks show promising results, but they rely on RGB-D or pose inputs, or cannot run in real time.
Our approach integrates dense-SLAM with neural implicit fields.
For efficient construction of neural fields, we employ multi-resolution grid encoding and signed distance functions.
arXiv Detail & Related papers (2023-10-07T12:26:56Z)
- Deformable 3D Gaussians for High-Fidelity Monocular Dynamic Scene Reconstruction [29.83056271799794]
Implicit neural representation has paved the way for new approaches to dynamic scene reconstruction and rendering.
We propose a deformable 3D Gaussian Splatting method that reconstructs scenes using 3D Gaussians learned in canonical space.
Through a differentiable Gaussian rasterizer, the deformable 3D Gaussians achieve not only higher rendering quality but also real-time rendering speed.
arXiv Detail & Related papers (2023-09-22T16:04:02Z)
- Multiscale Representation for Real-Time Anti-Aliasing Neural Rendering [84.37776381343662]
Mip-NeRF proposes a multiscale representation as a conical frustum to encode scale information.
We propose mip voxel grids (Mip-VoG), an explicit multiscale representation for real-time anti-aliasing rendering.
Our approach is the first to offer multiscale training and real-time anti-aliasing rendering simultaneously.
arXiv Detail & Related papers (2023-04-20T04:05:22Z)
- Fast Non-Rigid Radiance Fields from Monocularized Data [66.74229489512683]
This paper proposes a new method for full 360° inward-facing novel view synthesis of non-rigidly deforming scenes.
At the core of our method are 1) An efficient deformation module that decouples the processing of spatial and temporal information for accelerated training and inference; and 2) A static module representing the canonical scene as a fast hash-encoded neural radiance field.
In both cases, our method is significantly faster than previous methods, converging in less than 7 minutes and achieving real-time framerates at 1K resolution, while obtaining higher visual accuracy for generated novel views.
arXiv Detail & Related papers (2022-12-02T18:51:10Z)
- NeuralBlox: Real-Time Neural Representation Fusion for Robust Volumetric Mapping [29.3378360000956]
We present a novel 3D mapping method leveraging the recent progress in neural implicit representation for 3D reconstruction.
We propose a fusion strategy and training pipeline to incrementally build and update neural implicit representations.
We show that incrementally built occupancy maps can be obtained in real time, even on a CPU.
arXiv Detail & Related papers (2021-10-18T15:45:05Z)
- Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes [77.6741486264257]
We introduce an efficient neural representation that, for the first time, enables real-time rendering of high-fidelity neural SDFs.
We show that our representation is 2-3 orders of magnitude more efficient in terms of rendering speed compared to previous works.
arXiv Detail & Related papers (2021-01-26T18:50:22Z)
- Real-Time High-Performance Semantic Image Segmentation of Urban Street Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves 73.6% and 68.0% mean Intersection over Union (mIoU) at inference speeds of 51.0 fps and 39.3 fps, respectively.
arXiv Detail & Related papers (2020-03-11T08:45:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.