VPGS-SLAM: Voxel-based Progressive 3D Gaussian SLAM in Large-Scale Scenes
- URL: http://arxiv.org/abs/2505.18992v1
- Date: Sun, 25 May 2025 06:27:29 GMT
- Title: VPGS-SLAM: Voxel-based Progressive 3D Gaussian SLAM in Large-Scale Scenes
- Authors: Tianchen Deng, Wenhua Wu, Junjie He, Yue Pan, Xirui Jiang, Shenghai Yuan, Danwei Wang, Hesheng Wang, Weidong Chen,
- Abstract summary: VPGS-SLAM is the first 3DGS-based large-scale RGBD SLAM framework for both indoor and outdoor scenarios.<n>We design a novel voxel-based progressive 3D Gaussian mapping method with multiple submaps for compact and accurate scene representation.<n>In addition, we propose a 2D-3D fusion camera tracking method to achieve robust and accurate camera tracking in both indoor and outdoor large-scale scenes.
- Score: 26.06908154350295
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: 3D Gaussian Splatting has recently shown promising results in dense visual SLAM. However, existing 3DGS-based SLAM methods are all constrained to small-room scenarios and struggle with memory explosion in large-scale scenes and long sequences. To this end, we propose VPGS-SLAM, the first 3DGS-based large-scale RGBD SLAM framework for both indoor and outdoor scenarios. We design a novel voxel-based progressive 3D Gaussian mapping method with multiple submaps for compact and accurate scene representation in large-scale and long-sequence scenes. This allows us to scale up to arbitrary scenes and improves robustness (even under pose drifts). In addition, we propose a 2D-3D fusion camera tracking method to achieve robust and accurate camera tracking in both indoor and outdoor large-scale scenes. Furthermore, we design a 2D-3D Gaussian loop closure method to eliminate pose drift. We further propose a submap fusion method with online distillation to achieve global consistency in large-scale scenes when detecting a loop. Experiments on various indoor and outdoor datasets demonstrate the superiority and generalizability of the proposed framework. The code will be open source on https://github.com/dtc111111/vpgs-slam.
Related papers
- Outdoor Monocular SLAM with Global Scale-Consistent 3D Gaussian Pointmaps [13.325879149065008]
3D Gaussian Splatting (3DGS) has become a popular solution in SLAM due to its high-fidelity synthesis and real-time novel view performance.<n>Previous 3DGS SLAM methods employ a differentiable rendering pipeline for tracking, lack geometric priors in outdoor scenes.<n>We propose a robust RGB-only outdoor 3DGS SLAM method: S3PO-GS. Technically, we establish a self-consistent tracking module anchored in the 3DGS pointmap, which avoids cumulative scale drift and achieves more precise and robust tracking with fewer iterations.
arXiv Detail & Related papers (2025-07-04T17:56:43Z) - MCN-SLAM: Multi-Agent Collaborative Neural SLAM with Hybrid Implicit Neural Scene Representation [51.07118703442774]
Existing NeRF-based multi-agent SLAM frameworks cannot meet the constraints of communication bandwidth.<n>We propose the first distributed multi-agent collaborative neural SLAM framework with hybrid scene representation.<n>A novel triplane-grid joint scene representation method is proposed to improve scene reconstruction.<n>A novel intra-to-inter loop closure method is designed to achieve local (single-agent) and global (multi-agent) consistency.
arXiv Detail & Related papers (2025-06-23T14:22:29Z) - GigaSLAM: Large-Scale Monocular SLAM with Hierarchical Gaussian Splats [20.98774763235796]
We introduce GigaSLAM, the first RGB NeRF / 3DGS-based SLAM framework for large-scale, unbounded outdoor environments.<n>Our approach employs a hierarchical sparse voxel map representation, where Gaussians are decoded by neural networks at multiple levels of detail.<n>GigaSLAM delivers high-precision tracking and visually faithful rendering on urban outdoor benchmarks.
arXiv Detail & Related papers (2025-03-11T06:05:15Z) - RGB-Only Gaussian Splatting SLAM for Unbounded Outdoor Scenes [12.150995604820443]
3D Gaussian Splatting (3DGS) has become a popular solution in SLAM, as it can produce high-fidelity novel views.<n>Previous GS-based methods primarily target indoor scenes and rely on RGB-D sensors or pre-trained depth estimation models.<n>We propose a RGB-only gaussian splatting SLAM method for unbounded outdoor scenes--OpenGS-SLAM.
arXiv Detail & Related papers (2025-02-21T18:02:31Z) - GaussRender: Learning 3D Occupancy with Gaussian Rendering [86.89653628311565]
GaussRender is a module that improves 3D occupancy learning by enforcing projective consistency.<n>Our method penalizes 3D configurations that produce inconsistent 2D projections, thereby enforcing a more coherent 3D structure.
arXiv Detail & Related papers (2025-02-07T16:07:51Z) - PanoSLAM: Panoptic 3D Scene Reconstruction via Gaussian SLAM [105.01907579424362]
PanoSLAM is the first SLAM system to integrate geometric reconstruction, 3D semantic segmentation, and 3D instance segmentation within a unified framework.<n>For the first time, it achieves panoptic 3D reconstruction of open-world environments directly from the RGB-D video.
arXiv Detail & Related papers (2024-12-31T08:58:10Z) - GSplatLoc: Grounding Keypoint Descriptors into 3D Gaussian Splatting for Improved Visual Localization [1.4466437171584356]
We propose a two-stage procedure that integrates dense and robust keypoint descriptors from the lightweight XFeat feature extractor into 3DGS.<n>In the second stage, the initial pose estimate is refined by minimizing the rendering-based photometric warp loss.<n> Benchmarking on widely used indoor and outdoor datasets demonstrates improvements over recent neural rendering-based localization methods.
arXiv Detail & Related papers (2024-09-24T23:18:32Z) - IG-SLAM: Instant Gaussian SLAM [6.228980850646457]
3D Gaussian Splatting has recently shown promising results as an alternative scene representation in SLAM systems.
We present IG-SLAM, a dense RGB-only SLAM system that employs robust Dense-SLAM methods for tracking and combines them with Gaussian Splatting.
We demonstrate competitive performance with state-of-the-art RGB-only SLAM systems while achieving faster operation speeds.
arXiv Detail & Related papers (2024-08-02T09:07:31Z) - Gaussian Splatting SLAM [16.3858380078553]
We present the first application of 3D Gaussian Splatting in monocular SLAM.
Our method runs live at 3fps, unifying the required representation for accurate tracking, mapping, and high-quality rendering.
Several innovations are required to continuously reconstruct 3D scenes with high fidelity from a live camera.
arXiv Detail & Related papers (2023-12-11T18:19:04Z) - SplaTAM: Splat, Track & Map 3D Gaussians for Dense RGB-D SLAM [48.190398577764284]
SplaTAM is an approach to enable high-fidelity reconstruction from a single unposed RGB-D camera.
It employs a simple online tracking and mapping system tailored to the underlying Gaussian representation.
Experiments show that SplaTAM achieves up to 2x superior performance in camera pose estimation, map construction, and novel-view synthesis over existing methods.
arXiv Detail & Related papers (2023-12-04T18:53:24Z) - LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes [52.31402192831474]
Existing 3D scene generation models, however, limit the target scene to specific domain.
We propose LucidDreamer, a domain-free scene generation pipeline.
LucidDreamer produces highly-detailed Gaussian splats with no constraint on domain of the target scene.
arXiv Detail & Related papers (2023-11-22T13:27:34Z) - Pyramid Diffusion for Fine 3D Large Scene Generation [56.00726092690535]
Diffusion models have shown remarkable results in generating 2D images and small-scale 3D objects.
Their application to the synthesis of large-scale 3D scenes has been rarely explored.
We introduce a framework, the Pyramid Discrete Diffusion model (PDD), which employs scale-varied diffusion models to progressively generate high-quality outdoor scenes.
arXiv Detail & Related papers (2023-11-20T11:24:21Z) - SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving [98.74706005223685]
3D scene understanding plays a vital role in vision-based autonomous driving.
We propose a SurroundOcc method to predict the 3D occupancy with multi-camera images.
arXiv Detail & Related papers (2023-03-16T17:59:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.