GRS-SLAM3R: Real-Time Dense SLAM with Gated Recurrent State
- URL: http://arxiv.org/abs/2509.23737v1
- Date: Sun, 28 Sep 2025 08:33:34 GMT
- Title: GRS-SLAM3R: Real-Time Dense SLAM with Gated Recurrent State
- Authors: Guole Shen, Tianchen Deng, Yanbo Wang, Yongtao Chen, Yilin Shen, Jiuming Liu, Jingchuan Wang,
- Abstract summary: We introduce GRS-SLAM3R, an end-to-end SLAM framework for dense scene reconstruction. Our method supports sequentialized input and incrementally estimates metric-scale point clouds in a global coordinate frame. Experiments on various datasets show that our framework achieves superior reconstruction accuracy while maintaining real-time performance.
- Score: 29.91962530945268
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: DUSt3R-based end-to-end scene reconstruction has recently shown promising results in dense visual SLAM. However, most existing methods only use image pairs to estimate pointmaps, overlooking spatial memory and global consistency. To this end, we introduce GRS-SLAM3R, an end-to-end SLAM framework for dense scene reconstruction and pose estimation from RGB images without any prior knowledge of the scene or camera parameters. Unlike existing DUSt3R-based frameworks, which operate on all image pairs and predict per-pair point maps in local coordinate frames, our method supports sequentialized input and incrementally estimates metric-scale point clouds in a global coordinate frame. To maintain consistent spatial correlation, we use a latent state for spatial memory and design a transformer-based gated update module that resets and updates the spatial memory, continuously aggregating and tracking relevant 3D information across frames. Furthermore, we partition the scene into submaps, apply local alignment within each submap, and register all submaps into a common world frame using relative constraints, producing a globally consistent map. Experiments on various datasets show that our framework achieves superior reconstruction accuracy while maintaining real-time performance.
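The gated update of the latent spatial memory can be illustrated with a minimal GRU-style sketch. This is a hypothetical illustration, not the authors' released code: the token shapes, gate projections, and the way candidate features are formed are all assumptions (the paper's module is transformer-based and learned end-to-end).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_memory_update(memory, update, W_z, W_r):
    """GRU-style gated update of a latent spatial memory (illustrative only).

    memory : (T, D) current memory tokens
    update : (T, D) candidate features aggregated from the new frame
             (in the paper, via transformer attention; omitted here)
    W_z, W_r : (2*D, D) projections for the update and reset gates
    """
    x = np.concatenate([memory, update], axis=-1)  # (T, 2D)
    z = sigmoid(x @ W_z)                           # update gate: how much to rewrite
    r = sigmoid(x @ W_r)                           # reset gate: how much old state feeds the candidate
    candidate = np.tanh(r * memory + update)       # candidate memory content
    return (1.0 - z) * memory + z * candidate      # blend old memory with candidate

rng = np.random.default_rng(0)
T, D = 4, 8  # 4 memory tokens of dimension 8 (illustrative sizes)
mem = rng.standard_normal((T, D))
upd = rng.standard_normal((T, D))
Wz = rng.standard_normal((2 * D, D)) * 0.1
Wr = rng.standard_normal((2 * D, D)) * 0.1
new_mem = gated_memory_update(mem, upd, Wz, Wr)
print(new_mem.shape)  # (4, 8)
```

The gating lets the model interpolate per token between keeping accumulated 3D context and overwriting it with fresh observations, which is what allows the memory to both persist and reset across frames.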
Related papers
- TALO: Pushing 3D Vision Foundation Models Towards Globally Consistent Online Reconstruction [57.46712611558817]
3D vision foundation models have shown strong generalization in reconstructing key 3D attributes from uncalibrated images through a single feed-forward pass. Recent strategies align consecutive predictions by solving for a global transformation, yet our analysis reveals their fundamental limitations in assumption validity, local alignment scope, and robustness under noisy geometry. We propose a higher-DOF and long-term alignment framework based on Thin Plate Spline, leveraging globally propagated control points to correct spatially varying inconsistencies.
arXiv Detail & Related papers (2025-12-02T02:22:20Z)
- SING3R-SLAM: Submap-based Indoor Monocular Gaussian SLAM with 3D Reconstruction Priors [80.51557267896938]
SING3R-SLAM is a globally consistent and compact Gaussian-based dense RGB SLAM framework. We show that SING3R-SLAM achieves state-of-the-art tracking, 3D reconstruction, and novel view rendering, resulting in over 12% improvement in tracking and producing finer, more detailed geometry.
arXiv Detail & Related papers (2025-11-21T12:40:55Z)
- Point3R: Streaming 3D Reconstruction with Explicit Spatial Pointer Memory [72.75478398447396]
We propose Point3R, an online framework targeting dense streaming 3D reconstruction. To be specific, we maintain an explicit spatial pointer memory directly associated with the 3D structure of the current scene. Our method achieves competitive or state-of-the-art performance on various tasks with low training costs.
arXiv Detail & Related papers (2025-07-03T17:59:56Z)
- St4RTrack: Simultaneous 4D Reconstruction and Tracking in the World [106.91539872943864]
St4RTrack is a framework that simultaneously reconstructs and tracks dynamic video content in a world coordinate frame from RGB inputs. We predict both pointmaps at the same moment, in the same world, capturing both static and dynamic scene geometry. We establish a new extensive benchmark for world-frame reconstruction and tracking, demonstrating the effectiveness and efficiency of our unified, data-driven framework.
arXiv Detail & Related papers (2025-04-17T17:55:58Z)
- SLAM3R: Real-Time Dense Scene Reconstruction from Monocular RGB Videos [33.57444419305241]
SLAM3R is a novel system for real-time, high-quality, dense 3D reconstruction using RGB videos. It seamlessly integrates local 3D reconstruction and global coordinate registration through feed-forward neural networks. It achieves state-of-the-art reconstruction accuracy and completeness while maintaining real-time performance at 20+ FPS.
arXiv Detail & Related papers (2024-12-12T16:08:03Z)
- 3D Reconstruction with Spatial Memory [9.282647987510499]
We present Spann3R, a novel approach for dense 3D reconstruction from ordered or unordered image collections.
Built on the DUSt3R paradigm, Spann3R uses a transformer-based architecture to directly regress pointmaps from images without any prior knowledge of the scene or camera parameters.
arXiv Detail & Related papers (2024-08-28T18:01:00Z)
- Loopy-SLAM: Dense Neural SLAM with Loop Closures [53.11936461015725]
We introduce Loopy-SLAM, which globally optimizes poses and the dense 3D model.
We perform frame-to-model tracking with a data-driven, point-based submap generation method and trigger loop closures online via global place recognition.
Evaluations on the synthetic Replica and real-world TUM-RGBD and ScanNet datasets demonstrate competitive or superior performance in tracking, mapping, and rendering accuracy when compared to existing dense neural RGBD SLAM methods.
arXiv Detail & Related papers (2024-02-14T18:18:32Z)
- Anyview: Generalizable Indoor 3D Object Detection with Variable Frames [60.48134767838629]
We present a novel 3D detection framework named AnyView for practical applications. Our method achieves both great generalizability and high detection accuracy with a simple and clean architecture.
arXiv Detail & Related papers (2023-10-09T02:15:45Z)
- GO-SLAM: Global Optimization for Consistent 3D Instant Reconstruction [45.49960166785063]
GO-SLAM is a deep-learning-based dense visual SLAM framework that globally optimizes poses and 3D reconstruction in real time.
Results on various synthetic and real-world datasets demonstrate that GO-SLAM outperforms state-of-the-art approaches in tracking robustness and reconstruction accuracy.
arXiv Detail & Related papers (2023-09-05T17:59:58Z)
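Several of the systems above (GRS-SLAM3R, SING3R-SLAM, Loopy-SLAM) register local submaps into a common world frame using relative constraints between consecutive submaps. A minimal sketch of the underlying idea, chaining relative SE(3) transforms, is shown below; the helper names are hypothetical, and real systems would additionally refine these poses with pose-graph optimization and loop closures.

```python
import numpy as np

def compose_world_poses(relative_poses):
    """Chain relative transforms T_{i-1 -> i} into world poses T_{world -> i}.

    Each pose is a 4x4 homogeneous matrix; the first submap defines the
    world frame. This only shows forward chaining of relative constraints,
    without the global refinement a full SLAM backend would apply.
    """
    world = [np.eye(4)]
    for T_rel in relative_poses:
        world.append(world[-1] @ T_rel)
    return world

def make_pose(yaw, t):
    """Build a 4x4 pose from a yaw rotation (about z) and a translation."""
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:3, :3] = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    T[:3, 3] = t
    return T

# Two relative steps: move 1 m along x, then turn 90 degrees and move 1 m again.
rels = [make_pose(0.0, [1.0, 0.0, 0.0]), make_pose(np.pi / 2, [1.0, 0.0, 0.0])]
poses = compose_world_poses(rels)
print(poses[-1][:3, 3])  # final world position: [2. 0. 0.]
```

Note that the second translation is expressed in the previous submap's frame (still axis-aligned with the world here), so the final position is (2, 0, 0) while the final orientation is rotated 90 degrees; drift in any relative constraint propagates to all later submaps, which is why these systems add global alignment or loop closure on top.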
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.