GigaSLAM: Large-Scale Monocular SLAM with Hierarchical Gaussian Splats
- URL: http://arxiv.org/abs/2503.08071v1
- Date: Tue, 11 Mar 2025 06:05:15 GMT
- Title: GigaSLAM: Large-Scale Monocular SLAM with Hierarchical Gaussian Splats
- Authors: Kai Deng, Jian Yang, Shenlong Wang, Jin Xie
- Abstract summary: We introduce GigaSLAM, the first NeRF/3DGS-based SLAM framework for large-scale, unbounded outdoor environments. Our approach employs a hierarchical sparse voxel map representation, where Gaussians are decoded by neural networks at multiple levels of detail. GigaSLAM delivers high-precision tracking and visually faithful rendering on urban outdoor benchmarks.
- Score: 30.608403266769788
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Tracking and mapping in large-scale, unbounded outdoor environments using only monocular RGB input presents substantial challenges for existing SLAM systems. Traditional Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS) SLAM methods are typically limited to small, bounded indoor settings. To overcome these challenges, we introduce GigaSLAM, the first NeRF/3DGS-based SLAM framework for kilometer-scale outdoor environments, as demonstrated on the KITTI and KITTI 360 datasets. Our approach employs a hierarchical sparse voxel map representation, where Gaussians are decoded by neural networks at multiple levels of detail. This design enables efficient, scalable mapping and high-fidelity viewpoint rendering across expansive, unbounded scenes. For front-end tracking, GigaSLAM utilizes a metric depth model combined with epipolar geometry and PnP algorithms to accurately estimate poses, while incorporating a Bag-of-Words-based loop closure mechanism to maintain robust alignment over long trajectories. Consequently, GigaSLAM delivers high-precision tracking and visually faithful rendering on urban outdoor benchmarks, establishing a robust SLAM solution for large-scale, long-term scenarios, and significantly extending the applicability of Gaussian Splatting SLAM systems to unbounded outdoor environments.
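For intuition on the front-end tracking described in the abstract, below is a minimal, illustrative Python sketch of monocular pose estimation that combines a metric depth model with RANSAC-PnP. It is not GigaSLAM's actual implementation: `predict_metric_depth` is a hypothetical placeholder for any off-the-shelf metric depth network, `K` is the camera intrinsic matrix, and the remaining calls are standard OpenCV/NumPy.

```python
# Illustrative sketch: lift matched features of the previous frame to 3D using
# predicted metric depth, then recover the current frame's pose with RANSAC-PnP.
import cv2
import numpy as np

def predict_metric_depth(gray_img: np.ndarray) -> np.ndarray:
    """Hypothetical placeholder for a metric depth network (depth in meters)."""
    raise NotImplementedError

def estimate_pose_pnp(prev_gray, curr_gray, K):
    """Estimate the pose of the current frame relative to the previous frame."""
    orb = cv2.ORB_create(2000)
    kp0, des0 = orb.detectAndCompute(prev_gray, None)
    kp1, des1 = orb.detectAndCompute(curr_gray, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des0, des1)

    depth = predict_metric_depth(prev_gray)           # H x W, meters
    obj_pts, img_pts = [], []
    for m in matches:
        u, v = kp0[m.queryIdx].pt
        z = float(depth[int(round(v)), int(round(u))])
        if z <= 0:                                    # skip invalid depth
            continue
        # Back-project the previous-frame keypoint into 3D (previous camera frame).
        x = (u - K[0, 2]) * z / K[0, 0]
        y = (v - K[1, 2]) * z / K[1, 1]
        obj_pts.append([x, y, z])
        img_pts.append(kp1[m.trainIdx].pt)

    if len(obj_pts) < 6:                              # too few correspondences
        return None
    obj_pts = np.asarray(obj_pts, dtype=np.float64)
    img_pts = np.asarray(img_pts, dtype=np.float64)
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        obj_pts, img_pts, K, None, reprojectionError=2.0)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)                        # 3x3 rotation matrix
    return R, tvec                                    # maps previous-frame points into the current frame
```

In a full system such as GigaSLAM, these relative poses would additionally be constrained by epipolar geometry and corrected over long trajectories by a Bag-of-Words loop closure mechanism, as the abstract describes.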
Related papers
- EVolSplat: Efficient Volume-based Gaussian Splatting for Urban View Synthesis [61.1662426227688]
Existing NeRF and 3DGS-based methods show promising results in achieving photorealistic renderings but require slow, per-scene optimization.
We introduce EVolSplat, an efficient 3D Gaussian Splatting model for urban scenes that works in a feed-forward manner.
arXiv Detail & Related papers (2025-03-26T02:47:27Z)
- VIGS SLAM: IMU-based Large-Scale 3D Gaussian Splatting SLAM [15.841609263723576]
We propose VIGS SLAM, a novel 3D Gaussian Splatting SLAM method for large-scale indoor environments.
Our method is the first to propose that Gaussian Splatting-based SLAM can be performed effectively in large-scale environments by integrating IMU sensor measurements.
This not only extends Gaussian Splatting SLAM beyond room-scale scenarios but also achieves SLAM performance comparable to state-of-the-art methods in large-scale indoor environments.
arXiv Detail & Related papers (2025-01-23T06:01:03Z)
- VINGS-Mono: Visual-Inertial Gaussian Splatting Monocular SLAM in Large Scenes [10.287279799581544]
VINGS-Mono is a monocular (inertial) Gaussian Splatting (GS) SLAM framework designed for large scenes.
The framework comprises four main components: VIO Front End, 2D Gaussian Map, NVS Loop Closure, and Dynamic Eraser.
arXiv Detail & Related papers (2025-01-14T18:01:15Z)
- MotionGS: Compact Gaussian Splatting SLAM by Motion Filter [10.979138131565238]
There has been a surge in NeRF-based SLAM, while work on 3DGS-based SLAM remains sparse.
This paper presents a novel 3DGS-based SLAM approach that fuses deep visual features, dual selection, and 3DGS.
arXiv Detail & Related papers (2024-05-18T00:47:29Z)
- MM3DGS SLAM: Multi-modal 3D Gaussian Splatting for SLAM Using Vision, Depth, and Inertial Measurements [59.70107451308687]
We show for the first time that using 3D Gaussians for map representation with unposed camera images and inertial measurements can enable accurate SLAM.
Our method, MM3DGS, addresses the limitations of prior rendering approaches by enabling faster scale awareness and improved trajectory tracking.
We also release a multi-modal dataset, UT-MM, collected from a mobile robot equipped with a camera and an inertial measurement unit.
arXiv Detail & Related papers (2024-04-01T04:57:41Z)
- GlORIE-SLAM: Globally Optimized RGB-only Implicit Encoding Point Cloud SLAM [53.6402869027093]
We propose an efficient RGB-only dense SLAM system using a flexible neural point cloud scene representation.
We also introduce a novel DSPO layer for bundle adjustment, which optimizes the pose and depth along with the scale of the monocular depth.
arXiv Detail & Related papers (2024-03-28T16:32:06Z)
- CG-SLAM: Efficient Dense RGB-D SLAM in a Consistent Uncertainty-aware 3D Gaussian Field [46.8198987091734]
This paper presents CG-SLAM, an efficient dense RGB-D SLAM system based on a novel uncertainty-aware 3D Gaussian field.
Experiments on various datasets demonstrate that CG-SLAM achieves superior tracking and mapping performance with a notable tracking speed of up to 15 Hz.
arXiv Detail & Related papers (2024-03-24T11:19:59Z)
- MoD-SLAM: Monocular Dense Mapping for Unbounded 3D Scene Reconstruction [2.3630527334737104]
MoD-SLAM is the first monocular NeRF-based dense mapping method that allows real-time 3D reconstruction in unbounded scenes.
By introducing a robust depth loss term into the tracking process, our SLAM system achieves more precise pose estimation in large-scale scenes.
Our experiments on two standard datasets show that MoD-SLAM achieves competitive performance, improving the accuracy of the 3D reconstruction and localization by up to 30% and 15% respectively.
arXiv Detail & Related papers (2024-02-06T07:07:33Z)
- NICER-SLAM: Neural Implicit Scene Encoding for RGB SLAM [111.83168930989503]
NICER-SLAM is a dense RGB SLAM system that simultaneously optimizes for camera poses and a hierarchical neural implicit map representation.
We show strong performance in dense mapping, tracking, and novel view synthesis, even competitive with recent RGB-D SLAM systems.
arXiv Detail & Related papers (2023-02-07T17:06:34Z)
- NICE-SLAM: Neural Implicit Scalable Encoding for SLAM [112.6093688226293]
NICE-SLAM is a dense SLAM system that incorporates multi-level local information by introducing a hierarchical scene representation.
Compared to recent neural implicit SLAM systems, our approach is more scalable, efficient, and robust.
arXiv Detail & Related papers (2021-12-22T18:45:44Z)