MCGS-SLAM: A Multi-Camera SLAM Framework Using Gaussian Splatting for High-Fidelity Mapping
- URL: http://arxiv.org/abs/2509.14191v1
- Date: Wed, 17 Sep 2025 17:27:53 GMT
- Title: MCGS-SLAM: A Multi-Camera SLAM Framework Using Gaussian Splatting for High-Fidelity Mapping
- Authors: Zhihao Cao, Hanyu Wu, Li Wa Tang, Zizhou Luo, Zihan Zhu, Wei Zhang, Marc Pollefeys, Martin R. Oswald
- Abstract summary: We present MCGS-SLAM, the first purely RGB-based multi-camera SLAM system built on 3D Gaussian Splatting (3DGS). A multi-camera bundle adjustment (MCBA) jointly refines poses and depths via dense photometric and geometric residuals, while a scale consistency module enforces metric alignment across views. Experiments on synthetic and real-world datasets show that MCGS-SLAM consistently yields accurate trajectories and photorealistic reconstructions.
- Score: 52.99503784067417
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent progress in dense SLAM has primarily targeted monocular setups, often at the expense of robustness and geometric coverage. We present MCGS-SLAM, the first purely RGB-based multi-camera SLAM system built on 3D Gaussian Splatting (3DGS). Unlike prior methods relying on sparse maps or inertial data, MCGS-SLAM fuses dense RGB inputs from multiple viewpoints into a unified, continuously optimized Gaussian map. A multi-camera bundle adjustment (MCBA) jointly refines poses and depths via dense photometric and geometric residuals, while a scale consistency module enforces metric alignment across views using low-rank priors. The system supports RGB input and maintains real-time performance at large scale. Experiments on synthetic and real-world datasets show that MCGS-SLAM consistently yields accurate trajectories and photorealistic reconstructions, usually outperforming monocular baselines. Notably, the wide field of view from multi-camera input enables reconstruction of side-view regions that monocular setups miss, critical for safe autonomous operation. These results highlight the promise of multi-camera Gaussian Splatting SLAM for high-fidelity mapping in robotics and autonomous driving.
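The abstract does not spell out how MCBA forms its residuals, so the following is only a minimal sketch of what dense photometric and geometric residuals between two cameras of a rig could look like. It assumes a pinhole model, fixed rig-to-camera extrinsics, and nearest-pixel lookup; every function and variable name here is hypothetical and not taken from the paper.

```python
import numpy as np


def camera_pose(T_world_rig, T_rig_cam):
    """World-from-camera pose: rig pose composed with a fixed extrinsic (assumed)."""
    return T_world_rig @ T_rig_cam


def reproject(pixel, depth, K, T_world_src, T_world_dst):
    """Back-project a source pixel with its depth and project it into the
    destination camera. Returns the destination pixel (u, v) and the depth
    of the point in the destination camera frame."""
    u, v = pixel
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])   # normalized camera ray
    p_src = np.append(depth * ray, 1.0)              # homogeneous 3D point
    p_dst = np.linalg.inv(T_world_dst) @ T_world_src @ p_src
    uvw = K @ p_dst[:3]
    return uvw[:2] / uvw[2], p_dst[2]


def residuals(pixel, depth_src, intensity_src, K,
              T_world_src, T_world_dst, image_dst, depth_dst):
    """Photometric and geometric residuals for one pixel (nearest-pixel lookup).
    Assumes the reprojected pixel falls inside the destination image."""
    uv, z = reproject(pixel, depth_src, K, T_world_src, T_world_dst)
    col, row = int(round(uv[0])), int(round(uv[1]))
    photometric = image_dst[row, col] - intensity_src  # intensity difference
    geometric = depth_dst[row, col] - z                # depth difference
    return photometric, geometric


# Toy setup: one rig pose, two cameras offset by 1 unit along x.
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
T_world_rig = np.eye(4)
T_rig_cam0 = np.eye(4)
T_rig_cam1 = np.eye(4)
T_rig_cam1[0, 3] = 1.0

T_w_c0 = camera_pose(T_world_rig, T_rig_cam0)
T_w_c1 = camera_pose(T_world_rig, T_rig_cam1)

uv, z = reproject((50.0, 50.0), 2.0, K, T_w_c0, T_w_c1)
r_photo, r_geo = residuals((50.0, 50.0), 2.0, 0.5, K, T_w_c0, T_w_c1,
                           np.full((100, 100), 0.5), np.full((100, 100), 2.0))
```

In a bundle adjustment, residuals of this kind would be summed over many pixels and camera pairs and minimized with respect to the rig poses and depths; the sketch only evaluates them for a single pixel.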
Related papers
- SING3R-SLAM: Submap-based Indoor Monocular Gaussian SLAM with 3D Reconstruction Priors [80.51557267896938]
SING3R-SLAM is a globally consistent and compact Gaussian-based dense RGB SLAM framework.
We show that SING3R-SLAM achieves state-of-the-art tracking, 3D reconstruction, and novel view rendering, resulting in over 12% improvement in tracking and producing finer, more detailed geometry.
arXiv Detail & Related papers (2025-11-21T12:40:55Z)
- TVG-SLAM: Robust Gaussian Splatting SLAM with Tri-view Geometric Constraints [22.121665995381324]
TVG-SLAM is a robust RGB-only 3DGS SLAM system that leverages a novel tri-view geometry paradigm to ensure consistent tracking and high-quality mapping.
Our method improves tracking robustness, reducing the average Absolute Trajectory Error (ATE) by 69.0% while achieving state-of-the-art rendering quality.
arXiv Detail & Related papers (2025-06-29T12:31:05Z)
- Large-Scale Gaussian Splatting SLAM [21.253966057320383]
This paper introduces a large-scale 3DGS-based visual SLAM with stereo cameras, termed LSG-SLAM.
With extensive evaluations on the EuRoC and KITTI datasets, LSG-SLAM achieves superior performance over existing neural, 3DGS-based, and even traditional approaches.
arXiv Detail & Related papers (2025-05-15T03:00:32Z)
- GigaSLAM: Large-Scale Monocular SLAM with Hierarchical Gaussian Splats [20.98774763235796]
We introduce GigaSLAM, the first RGB NeRF / 3DGS-based SLAM framework for large-scale, unbounded outdoor environments.
Our approach employs a hierarchical sparse voxel map representation, where Gaussians are decoded by neural networks at multiple levels of detail.
GigaSLAM delivers high-precision tracking and visually faithful rendering on urban outdoor benchmarks.
arXiv Detail & Related papers (2025-03-11T06:05:15Z)
- HI-SLAM2: Geometry-Aware Gaussian SLAM for Fast Monocular Scene Reconstruction [38.47566815670662]
HI-SLAM2 is a geometry-aware Gaussian SLAM system that achieves fast and accurate monocular scene reconstruction using only RGB input.
We demonstrate significant improvements over existing Neural SLAM methods and even surpass RGB-D-based methods in both reconstruction and rendering quality.
arXiv Detail & Related papers (2024-11-27T01:39:21Z)
- MM3DGS SLAM: Multi-modal 3D Gaussian Splatting for SLAM Using Vision, Depth, and Inertial Measurements [59.70107451308687]
We show for the first time that using 3D Gaussians for map representation with unposed camera images and inertial measurements can enable accurate SLAM.
Our method, MM3DGS, addresses the limitations of prior neural radiance field-based representations by enabling faster rendering, scale awareness, and improved trajectory tracking.
We also release a multi-modal dataset, UT-MM, collected from a mobile robot equipped with a camera and an inertial measurement unit.
arXiv Detail & Related papers (2024-04-01T04:57:41Z)
- Gaussian Splatting SLAM [16.3858380078553]
We present the first application of 3D Gaussian Splatting in monocular SLAM.
Our method runs live at 3fps, unifying the required representation for accurate tracking, mapping, and high-quality rendering.
Several innovations are required to continuously reconstruct 3D scenes with high fidelity from a live camera.
arXiv Detail & Related papers (2023-12-11T18:19:04Z)
- SplaTAM: Splat, Track & Map 3D Gaussians for Dense RGB-D SLAM [48.190398577764284]
SplaTAM is an approach to enable high-fidelity reconstruction from a single unposed RGB-D camera.
It employs a simple online tracking and mapping system tailored to the underlying Gaussian representation.
Experiments show that SplaTAM achieves up to 2x superior performance in camera pose estimation, map construction, and novel-view synthesis over existing methods.
arXiv Detail & Related papers (2023-12-04T18:53:24Z)
- NICER-SLAM: Neural Implicit Scene Encoding for RGB SLAM [111.83168930989503]
NICER-SLAM is a dense RGB SLAM system that simultaneously optimizes for camera poses and a hierarchical neural implicit map representation.
We show strong performance in dense mapping, tracking, and novel view synthesis, even competitive with recent RGB-D SLAM systems.
arXiv Detail & Related papers (2023-02-07T17:06:34Z)
- Pseudo RGB-D for Self-Improving Monocular SLAM and Depth Prediction [72.30870535815258]
Classical monocular SLAM and CNNs for monocular depth prediction represent two largely disjoint approaches towards building a 3D map of the surrounding environment.
We propose a joint narrow and wide baseline based self-improving framework, where on the one hand the CNN-predicted depth is leveraged to perform pseudo RGB-D feature-based SLAM.
On the other hand, the bundle-adjusted 3D scene structures and camera poses from the more principled geometric SLAM are injected back into the depth network through novel wide baseline losses.
arXiv Detail & Related papers (2020-04-22T16:31:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.