PLGSLAM: Progressive Neural Scene Representation with Local to Global Bundle Adjustment
- URL: http://arxiv.org/abs/2312.09866v2
- Date: Fri, 29 Mar 2024 08:25:36 GMT
- Title: PLGSLAM: Progressive Neural Scene Representation with Local to Global Bundle Adjustment
- Authors: Tianchen Deng, Guole Shen, Tong Qin, Jianyu Wang, Wentao Zhao, Jingchuan Wang, Danwei Wang, Weidong Chen
- Abstract summary: We introduce PLGSLAM, a neural visual SLAM system capable of high-fidelity surface reconstruction and robust camera tracking in real time.
We show that PLGSLAM achieves state-of-the-art scene reconstruction results and tracking performance across various datasets and scenarios.
- Score: 24.05634277422078
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Neural implicit scene representations have recently shown encouraging results in dense visual SLAM. However, existing methods produce low-quality reconstructions and inaccurate localization when scaling up to large indoor scenes and long sequences. These limitations stem mainly from their single, global radiance field with finite capacity, which does not adapt to large scenarios. Their end-to-end pose networks are also not robust enough as cumulative errors grow in large scenes. To this end, we introduce PLGSLAM, a neural visual SLAM system capable of high-fidelity surface reconstruction and robust camera tracking in real time. To handle large-scale indoor scenes, PLGSLAM proposes a progressive scene representation method that dynamically allocates new local scene representations, each trained with frames within a local sliding window. This allows us to scale up to larger indoor scenes and improves robustness (even under pose drift). Within each local scene representation, PLGSLAM uses tri-planes for local high-frequency features and multi-layer perceptron (MLP) networks for low-frequency features, achieving smoothness and scene completion in unobserved areas. Moreover, we propose a local-to-global bundle adjustment method with a global keyframe database to address the increased pose drift on long sequences. Experimental results demonstrate that PLGSLAM achieves state-of-the-art scene reconstruction and tracking performance across various datasets and scenarios (in both small and large-scale indoor environments).
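The progressive allocation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say what triggers a new local representation, so the distance threshold `radius`, the `window_size` value, and the names `LocalMap` and `ProgressiveMapper` are all assumptions made for illustration.

```python
import math
from collections import deque
from dataclasses import dataclass


@dataclass
class LocalMap:
    """Hypothetical stand-in for one local scene representation."""
    center: tuple   # camera position at which this map was allocated
    window: deque   # sliding window of frame ids used to train this map


class ProgressiveMapper:
    """Allocates a new local map once the camera moves too far away.

    Assumption: a fixed distance threshold triggers allocation; the
    abstract does not specify the actual criterion.
    """

    def __init__(self, radius=2.0, window_size=3):
        self.radius = radius
        self.window_size = window_size
        self.maps = []

    def add_frame(self, frame_id, position):
        current = self.maps[-1] if self.maps else None
        # Allocate a fresh local map when none exists yet or when the
        # camera has left the region covered by the current one.
        if current is None or math.dist(position, current.center) > self.radius:
            current = LocalMap(center=position,
                               window=deque(maxlen=self.window_size))
            self.maps.append(current)
        # deque(maxlen=...) silently drops the oldest frame when full,
        # which gives the sliding-window behaviour.
        current.window.append(frame_id)
        return len(self.maps) - 1


mapper = ProgressiveMapper(radius=2.0, window_size=3)
trajectory = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.5, 0.0, 0.0),
              (1.9, 0.0, 0.0), (2.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
assignments = [mapper.add_frame(i, p) for i, p in enumerate(trajectory)]
print(assignments)  # index of the local map each frame was assigned to
```

In this toy trajectory the fifth frame is more than `radius` away from the first map's center, so a second local map is allocated there, while the first map's window retains only its three most recent frames.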
Related papers
- Dynamic 3D Gaussian Fields for Urban Areas [60.64840836584623]
We present an efficient neural 3D scene representation for novel-view synthesis (NVS) in large-scale, dynamic urban areas.
We propose 4DGF, a neural scene representation that scales to large-scale dynamic urban areas.
arXiv Detail & Related papers (2024-06-05T12:07:39Z)
- MUTE-SLAM: Real-Time Neural SLAM with Multiple Tri-Plane Hash Representations [6.266208986510979]
MUTE-SLAM is a real-time neural RGB-D SLAM system employing multiple tri-plane hash-encodings for efficient scene representation.
MUTE-SLAM effectively tracks camera positions and incrementally builds a scalable multi-map representation for both small and large indoor environments.
arXiv Detail & Related papers (2024-03-26T14:53:24Z)
- Global-guided Focal Neural Radiance Field for Large-scale Scene Rendering [12.272724419136575]
We present a global-guided focal neural radiance field (GF-NeRF) that achieves high-fidelity rendering of large-scale scenes.
Our method achieves high-fidelity, natural rendering results on various types of large-scale datasets.
arXiv Detail & Related papers (2024-03-19T15:45:54Z)
- DNS SLAM: Dense Neural Semantic-Informed SLAM [92.39687553022605]
DNS SLAM is a novel neural RGB-D semantic SLAM approach featuring a hybrid representation.
Our method integrates multi-view geometry constraints with image-based feature extraction to improve appearance details.
In our experiments, the method achieves state-of-the-art tracking performance on both synthetic and real-world data.
arXiv Detail & Related papers (2023-11-30T21:34:44Z)
- Fast Monocular Scene Reconstruction with Global-Sparse Local-Dense Grids [84.90863397388776]
We propose to directly use signed distance function (SDF) in sparse voxel block grids for fast and accurate scene reconstruction without distances.
Our globally sparse and locally dense data structure exploits surfaces' spatial sparsity, enables cache-friendly queries, and allows direct extensions to multi-modal data.
Experiments show that our approach is 10x faster in training and 100x faster in rendering while achieving comparable accuracy to state-of-the-art neural implicit methods.
arXiv Detail & Related papers (2023-05-22T16:50:19Z)
- Grid-guided Neural Radiance Fields for Large Urban Scenes [146.06368329445857]
Recent approaches propose to geographically divide the scene and adopt multiple sub-NeRFs to model each region individually.
An alternative solution is to use a feature grid representation, which is computationally efficient and can naturally scale to a large scene.
We present a new framework that realizes high-fidelity rendering on large urban scenes while being computationally efficient.
arXiv Detail & Related papers (2023-03-24T13:56:45Z)
- NEWTON: Neural View-Centric Mapping for On-the-Fly Large-Scale SLAM [51.21564182169607]
NEWTON is a view-centric mapping method that dynamically constructs neural fields based on run-time observation.
Our method enables camera pose updates using loop closures and scene boundary updates by representing the scene with multiple neural fields.
The experimental results demonstrate the superior performance of our method over existing world-centric neural field-based SLAM systems.
arXiv Detail & Related papers (2023-03-23T20:22:01Z)
- PLD-SLAM: A Real-Time Visual SLAM Using Points and Line Segments in Dynamic Scenes [0.0]
This paper proposes a real-time stereo indirect visual SLAM system, PLD-SLAM, which combines point and line features.
We also present a novel global gray similarity (GGS) algorithm to achieve reasonable selection and efficient loop closure detection.
arXiv Detail & Related papers (2022-07-22T07:40:00Z)
- NICE-SLAM: Neural Implicit Scalable Encoding for SLAM [112.6093688226293]
NICE-SLAM is a dense SLAM system that incorporates multi-level local information by introducing a hierarchical scene representation.
Compared to recent neural implicit SLAM systems, our approach is more scalable, efficient, and robust.
arXiv Detail & Related papers (2021-12-22T18:45:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.