Related papers: VPGS-SLAM: Voxel-based Progressive 3D Gaussian SLAM in Large-Scale Scenes

URL: http://arxiv.org/abs/2505.18992v1
Date: Sun, 25 May 2025 06:27:29 GMT
Title: VPGS-SLAM: Voxel-based Progressive 3D Gaussian SLAM in Large-Scale Scenes
Authors: Tianchen Deng, Wenhua Wu, Junjie He, Yue Pan, Xirui Jiang, Shenghai Yuan, Danwei Wang, Hesheng Wang, Weidong Chen
Abstract summary: VPGS-SLAM is the first 3DGS-based large-scale RGBD SLAM framework for both indoor and outdoor scenarios. We design a novel voxel-based progressive 3D Gaussian mapping method with multiple submaps for compact and accurate scene representation. In addition, we propose a 2D-3D fusion camera tracking method to achieve robust and accurate camera tracking in both indoor and outdoor large-scale scenes.
Score: 26.06908154350295
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: 3D Gaussian Splatting has recently shown promising results in dense visual SLAM. However, existing 3DGS-based SLAM methods are all constrained to small-room scenarios and struggle with memory explosion in large-scale scenes and long sequences. To this end, we propose VPGS-SLAM, the first 3DGS-based large-scale RGBD SLAM framework for both indoor and outdoor scenarios. We design a novel voxel-based progressive 3D Gaussian mapping method with multiple submaps for compact and accurate scene representation in large-scale and long-sequence scenes. This allows us to scale up to arbitrary scenes and improves robustness (even under pose drift). In addition, we propose a 2D-3D fusion camera tracking method to achieve robust and accurate camera tracking in both indoor and outdoor large-scale scenes. Furthermore, we design a 2D-3D Gaussian loop closure method to eliminate pose drift. We further propose a submap fusion method with online distillation to achieve global consistency in large-scale scenes when a loop is detected. Experiments on various indoor and outdoor datasets demonstrate the superiority and generalizability of the proposed framework. The code will be open-sourced at https://github.com/dtc111111/vpgs-slam.

Related papers

Outdoor Monocular SLAM with Global Scale-Consistent 3D Gaussian Pointmaps [13.325879149065008]
3D Gaussian Splatting (3DGS) has become a popular solution in SLAM due to its high-fidelity synthesis and real-time novel view performance. Previous 3DGS SLAM methods employ a differentiable rendering pipeline for tracking but lack geometric priors in outdoor scenes. We propose a robust RGB-only outdoor 3DGS SLAM method: S3PO-GS. Technically, we establish a self-consistent tracking module anchored in the 3DGS pointmap, which avoids cumulative scale drift and achieves more precise and robust tracking with fewer iterations.
arXiv Detail & Related papers (2025-07-04T17:56:43Z)
MCN-SLAM: Multi-Agent Collaborative Neural SLAM with Hybrid Implicit Neural Scene Representation [51.07118703442774]
Existing NeRF-based multi-agent SLAM frameworks cannot meet the constraints of communication bandwidth. We propose the first distributed multi-agent collaborative neural SLAM framework with hybrid scene representation. A novel triplane-grid joint scene representation method is proposed to improve scene reconstruction. A novel intra-to-inter loop closure method is designed to achieve local (single-agent) and global (multi-agent) consistency.
arXiv Detail & Related papers (2025-06-23T14:22:29Z)
GigaSLAM: Large-Scale Monocular SLAM with Hierarchical Gaussian Splats [20.98774763235796]
We introduce GigaSLAM, the first RGB NeRF / 3DGS-based SLAM framework for large-scale, unbounded outdoor environments. Our approach employs a hierarchical sparse voxel map representation, where Gaussians are decoded by neural networks at multiple levels of detail. GigaSLAM delivers high-precision tracking and visually faithful rendering on urban outdoor benchmarks.
arXiv Detail & Related papers (2025-03-11T06:05:15Z)
RGB-Only Gaussian Splatting SLAM for Unbounded Outdoor Scenes [12.150995604820443]
3D Gaussian Splatting (3DGS) has become a popular solution in SLAM, as it can produce high-fidelity novel views. Previous GS-based methods primarily target indoor scenes and rely on RGB-D sensors or pre-trained depth estimation models. We propose an RGB-only Gaussian splatting SLAM method for unbounded outdoor scenes: OpenGS-SLAM.
arXiv Detail & Related papers (2025-02-21T18:02:31Z)
GaussRender: Learning 3D Occupancy with Gaussian Rendering [86.89653628311565]
GaussRender is a module that improves 3D occupancy learning by enforcing projective consistency. Our method penalizes 3D configurations that produce inconsistent 2D projections, thereby enforcing a more coherent 3D structure.
arXiv Detail & Related papers (2025-02-07T16:07:51Z)
PanoSLAM: Panoptic 3D Scene Reconstruction via Gaussian SLAM [105.01907579424362]
PanoSLAM is the first SLAM system to integrate geometric reconstruction, 3D semantic segmentation, and 3D instance segmentation within a unified framework. For the first time, it achieves panoptic 3D reconstruction of open-world environments directly from RGB-D video.
arXiv Detail & Related papers (2024-12-31T08:58:10Z)
GSplatLoc: Grounding Keypoint Descriptors into 3D Gaussian Splatting for Improved Visual Localization [1.4466437171584356]
We propose a two-stage procedure that integrates dense and robust keypoint descriptors from the lightweight XFeat feature extractor into 3DGS. In the second stage, the initial pose estimate is refined by minimizing the rendering-based photometric warp loss. Benchmarking on widely used indoor and outdoor datasets demonstrates improvements over recent neural rendering-based localization methods.
arXiv Detail & Related papers (2024-09-24T23:18:32Z)
IG-SLAM: Instant Gaussian SLAM [6.228980850646457]
3D Gaussian Splatting has recently shown promising results as an alternative scene representation in SLAM systems. We present IG-SLAM, a dense RGB-only SLAM system that employs robust Dense-SLAM methods for tracking and combines them with Gaussian Splatting. We demonstrate competitive performance with state-of-the-art RGB-only SLAM systems while achieving faster operation speeds.
arXiv Detail & Related papers (2024-08-02T09:07:31Z)
Gaussian Splatting SLAM [16.3858380078553]
We present the first application of 3D Gaussian Splatting in monocular SLAM. Our method runs live at 3 fps, unifying the required representation for accurate tracking, mapping, and high-quality rendering. Several innovations are required to continuously reconstruct 3D scenes with high fidelity from a live camera.
arXiv Detail & Related papers (2023-12-11T18:19:04Z)
SplaTAM: Splat, Track & Map 3D Gaussians for Dense RGB-D SLAM [48.190398577764284]
SplaTAM is an approach to enable high-fidelity reconstruction from a single unposed RGB-D camera. It employs a simple online tracking and mapping system tailored to the underlying Gaussian representation. Experiments show that SplaTAM achieves up to 2x superior performance in camera pose estimation, map construction, and novel-view synthesis over existing methods.
arXiv Detail & Related papers (2023-12-04T18:53:24Z)
LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes [52.31402192831474]
Existing 3D scene generation models limit the target scene to a specific domain. We propose LucidDreamer, a domain-free scene generation pipeline. LucidDreamer produces highly detailed Gaussian splats with no constraint on the domain of the target scene.
arXiv Detail & Related papers (2023-11-22T13:27:34Z)
Pyramid Diffusion for Fine 3D Large Scene Generation [56.00726092690535]
Diffusion models have shown remarkable results in generating 2D images and small-scale 3D objects. Their application to the synthesis of large-scale 3D scenes has been rarely explored. We introduce a framework, the Pyramid Discrete Diffusion model (PDD), which employs scale-varied diffusion models to progressively generate high-quality outdoor scenes.
arXiv Detail & Related papers (2023-11-20T11:24:21Z)
SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving [98.74706005223685]
3D scene understanding plays a vital role in vision-based autonomous driving. We propose SurroundOcc, a method to predict 3D occupancy from multi-camera images.
arXiv Detail & Related papers (2023-03-16T17:59:08Z)

This list is automatically generated from the titles and abstracts of the papers on this site.