SLAM3R: Real-Time Dense Scene Reconstruction from Monocular RGB Videos
- URL: http://arxiv.org/abs/2412.09401v2
- Date: Thu, 19 Dec 2024 12:23:39 GMT
- Title: SLAM3R: Real-Time Dense Scene Reconstruction from Monocular RGB Videos
- Authors: Yuzheng Liu, Siyan Dong, Shuzhe Wang, Yanchao Yang, Qingnan Fan, Baoquan Chen
- Abstract summary: SLAM3R is a novel and effective monocular RGB SLAM system for real-time and high-quality dense 3D reconstruction.
Unlike traditional pose optimization-based methods, SLAM3R directly regresses 3D pointmaps from RGB images in each window.
Experiments consistently show that SLAM3R achieves state-of-the-art reconstruction accuracy and completeness while maintaining real-time performance at 20+ FPS.
- Score: 32.6924827171619
- Abstract: In this paper, we introduce SLAM3R, a novel and effective monocular RGB SLAM system for real-time and high-quality dense 3D reconstruction. SLAM3R provides an end-to-end solution by seamlessly integrating local 3D reconstruction and global coordinate registration through feed-forward neural networks. Given an input video, the system first converts it into overlapping clips using a sliding window mechanism. Unlike traditional pose optimization-based methods, SLAM3R directly regresses 3D pointmaps from RGB images in each window and progressively aligns and deforms these local pointmaps to create a globally consistent scene reconstruction - all without explicitly solving any camera parameters. Experiments across datasets consistently show that SLAM3R achieves state-of-the-art reconstruction accuracy and completeness while maintaining real-time performance at 20+ FPS. Code and weights at: https://github.com/PKU-VCL-3DV/SLAM3R.
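The sliding window step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sliding_window_clips`, the `window` and `stride` parameters, and the handling of trailing frames are all assumptions made for illustration, and the actual window length and overlap used by SLAM3R are not stated here.

```python
from typing import List


def sliding_window_clips(num_frames: int, window: int, stride: int) -> List[List[int]]:
    """Split frame indices 0..num_frames-1 into overlapping clips.

    Hypothetical helper: `window` is the clip length and `stride` is the
    step between clip starts; stride < window makes consecutive clips overlap.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if stride >= window:
        raise ValueError("stride must be smaller than window for clips to overlap")
    if num_frames <= 0:
        return []
    if num_frames <= window:
        # Video shorter than one window: return a single (short) clip.
        return [list(range(num_frames))]

    clips = []
    start = 0
    while start + window <= num_frames:
        clips.append(list(range(start, start + window)))
        start += stride

    # Assumption: cover any leftover frames with one extra full-size clip
    # anchored at the end of the video.
    if clips[-1][-1] != num_frames - 1:
        clips.append(list(range(num_frames - window, num_frames)))
    return clips


if __name__ == "__main__":
    for clip in sliding_window_clips(num_frames=10, window=4, stride=2):
        print(clip)
```

In a pipeline of the kind the abstract describes, each clip would then be passed to a network that regresses a local pointmap; that stage is outside the scope of this sketch.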
Related papers
- PanoSLAM: Panoptic 3D Scene Reconstruction via Gaussian SLAM [105.01907579424362]
PanoSLAM is the first SLAM system to integrate geometric reconstruction, 3D semantic segmentation, and 3D instance segmentation within a unified framework.
For the first time, it achieves panoptic 3D reconstruction of open-world environments directly from the RGB-D video.
arXiv Detail & Related papers (2024-12-31T08:58:10Z)
- HI-SLAM2: Geometry-Aware Gaussian SLAM for Fast Monocular Scene Reconstruction [38.47566815670662]
HI-SLAM2 is a geometry-aware Gaussian SLAM system that achieves fast and accurate monocular scene reconstruction using only RGB input.
We demonstrate significant improvements over existing Neural SLAM methods and even surpass RGB-D-based methods in both reconstruction and rendering quality.
arXiv Detail & Related papers (2024-11-27T01:39:21Z)
- Splat-SLAM: Globally Optimized RGB-only SLAM with 3D Gaussians [87.48403838439391]
3D Gaussian Splatting has emerged as a powerful representation of geometry and appearance for RGB-only dense SLAM.
We propose the first RGB-only SLAM system with a dense 3D Gaussian map representation.
Our experiments on the Replica, TUM-RGBD, and ScanNet datasets indicate the effectiveness of globally optimized 3D Gaussians.
arXiv Detail & Related papers (2024-05-26T12:26:54Z)
- GlORIE-SLAM: Globally Optimized RGB-only Implicit Encoding Point Cloud SLAM [53.6402869027093]
We propose an efficient RGB-only dense SLAM system using a flexible neural point cloud scene representation.
We also introduce a novel DSPO layer for bundle adjustment which optimizes the pose and depth of keyframes along with the scale of the monocular depth.
arXiv Detail & Related papers (2024-03-28T16:32:06Z)
- Loopy-SLAM: Dense Neural SLAM with Loop Closures [53.11936461015725]
We introduce Loopy-SLAM, which globally optimizes poses and the dense 3D model.
We use frame-to-model tracking using a data-driven point-based submap generation method and trigger loop closures online by performing global place recognition.
Evaluation on the synthetic Replica and real-world TUM-RGBD and ScanNet datasets demonstrates competitive or superior performance in tracking, mapping, and rendering accuracy when compared to existing dense neural RGBD SLAM methods.
arXiv Detail & Related papers (2024-02-14T18:18:32Z)
- GO-SLAM: Global Optimization for Consistent 3D Instant Reconstruction [45.49960166785063]
GO-SLAM is a deep-learning-based dense visual SLAM framework globally optimizing poses and 3D reconstruction in real-time.
Results on various synthetic and real-world datasets demonstrate that GO-SLAM outperforms state-of-the-art approaches at tracking robustness and reconstruction accuracy.
arXiv Detail & Related papers (2023-09-05T17:59:58Z)
- NICER-SLAM: Neural Implicit Scene Encoding for RGB SLAM [111.83168930989503]
NICER-SLAM is a dense RGB SLAM system that simultaneously optimizes for camera poses and a hierarchical neural implicit map representation.
We show strong performance in dense mapping, tracking, and novel view synthesis, even competitive with recent RGB-D SLAM systems.
arXiv Detail & Related papers (2023-02-07T17:06:34Z)
- ESLAM: Efficient Dense SLAM System Based on Hybrid Representation of Signed Distance Fields [2.0625936401496237]
ESLAM reads RGB-D frames with unknown camera poses in a sequential manner and incrementally reconstructs the scene representation.
ESLAM improves the accuracy of 3D reconstruction and camera localization of state-of-the-art dense visual SLAM methods by more than 50%.
arXiv Detail & Related papers (2022-11-21T18:25:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.