MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors
- URL: http://arxiv.org/abs/2412.12392v1
- Date: Mon, 16 Dec 2024 23:00:05 GMT
- Title: MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors
- Authors: Riku Murai, Eric Dexheimer, Andrew J. Davison
- Abstract summary: We present a real-time monocular dense SLAM system designed bottom-up from MASt3R.
Our system is robust on in-the-wild video sequences despite making no assumption on a fixed or parametric camera model.
- Score: 15.764342766592808
- Abstract: We present a real-time monocular dense SLAM system designed bottom-up from MASt3R, a two-view 3D reconstruction and matching prior. Equipped with this strong prior, our system is robust on in-the-wild video sequences despite making no assumption on a fixed or parametric camera model beyond a unique camera centre. We introduce efficient methods for pointmap matching, camera tracking and local fusion, graph construction and loop closure, and second-order global optimisation. With known calibration, a simple modification to the system achieves state-of-the-art performance across various benchmarks. Altogether, we propose a plug-and-play monocular SLAM system capable of producing globally-consistent poses and dense geometry while operating at 15 FPS.
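As a rough, hypothetical illustration of how dense pixel-wise pointmap matches constrain camera tracking, the sketch below aligns two matched pointmaps with a weighted Kabsch/Procrustes solve. It is a simplification for this summary rather than MASt3R-SLAM's actual tracking or optimisation procedure; the function `align_pointmaps`, the weights `w`, and the synthetic data are assumptions of this sketch.
```python
# Hypothetical simplification, not the paper's tracker: recover the rigid
# transform between two pixel-wise matched pointmaps via weighted Kabsch.
import numpy as np

def align_pointmaps(P_ref, P_cur, w):
    """Estimate R, t with P_ref ~= P_cur @ R.T + t from (N, 3) matches and (N,) weights."""
    w = w / w.sum()
    mu_ref = (w[:, None] * P_ref).sum(axis=0)                    # weighted centroids
    mu_cur = (w[:, None] * P_cur).sum(axis=0)
    H = ((P_cur - mu_cur) * w[:, None]).T @ (P_ref - mu_ref)     # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflections
    R = Vt.T @ D @ U.T
    t = mu_ref - R @ mu_cur
    return R, t

# Toy check: recover a known pose from synthetic, perfectly matched pointmaps.
rng = np.random.default_rng(0)
P_cur = rng.normal(size=(500, 3))
R_true = np.linalg.qr(rng.normal(size=(3, 3)))[0]
if np.linalg.det(R_true) < 0:                                    # force a proper rotation
    R_true[:, 0] *= -1.0
t_true = np.array([0.1, -0.2, 0.3])
P_ref = P_cur @ R_true.T + t_true
R_est, t_est = align_pointmaps(P_ref, P_cur, np.ones(len(P_cur)))
print(np.allclose(R_est, R_true), np.allclose(t_est, t_true))
```
In practice such per-match weights would come from the prior's matching confidence, and the pose would be refined together with local fusion, loop closure, and global optimisation; none of those components are reproduced here.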
Related papers
- Advancing Dense Endoscopic Reconstruction with Gaussian Splatting-driven Surface Normal-aware Tracking and Mapping [12.027762278121052]
Endo-2DTAM is a real-time endoscopic SLAM system with 2D Gaussian Splatting (2DGS).
Our robust tracking module combines point-to-point and point-to-plane distance metrics; a minimal sketch of such a combined residual follows this entry.
Our mapping module utilizes normal consistency and depth distortion to enhance surface reconstruction quality.
arXiv Detail & Related papers (2025-01-31T17:15:34Z)
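As a hypothetical illustration of the combined metric mentioned in this entry (not Endo-2DTAM's implementation), the following sketch evaluates a weighted sum of point-to-point and point-to-plane residuals over matched surface points; `combined_tracking_cost`, the normals `N_tgt`, and the balance weight `lam` are assumed names for this sketch.
```python
# Hypothetical combined tracking cost of the kind a SLAM tracking module
# might minimise over a camera pose; not Endo-2DTAM's actual code.
import numpy as np

def combined_tracking_cost(P_src, P_tgt, N_tgt, lam=0.5):
    """P_src, P_tgt: (N, 3) matched points; N_tgt: (N, 3) unit normals at P_tgt."""
    diff = P_src - P_tgt
    point_to_point = np.sum(diff ** 2, axis=1)           # squared Euclidean distance
    point_to_plane = np.sum(diff * N_tgt, axis=1) ** 2   # squared distance along the normal
    return np.mean(lam * point_to_point + (1.0 - lam) * point_to_plane)

# Toy usage on synthetic data: a slightly perturbed copy of the target surface.
rng = np.random.default_rng(1)
P_tgt = rng.normal(size=(100, 3))
N_tgt = rng.normal(size=(100, 3))
N_tgt /= np.linalg.norm(N_tgt, axis=1, keepdims=True)
P_src = P_tgt + 0.01 * rng.normal(size=(100, 3))
print(combined_tracking_cost(P_src, P_tgt, N_tgt))
```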
- Dyn-HaMR: Recovering 4D Interacting Hand Motion from a Dynamic Camera [49.82535393220003]
Dyn-HaMR is the first approach to reconstruct 4D global hand motion from monocular videos recorded by dynamic cameras in the wild.
We show that our approach significantly outperforms state-of-the-art methods in terms of 4D global mesh recovery.
This establishes a new benchmark for hand motion reconstruction from monocular video with moving cameras.
arXiv Detail & Related papers (2024-12-17T12:43:10Z)
- SLAM3R: Real-Time Dense Scene Reconstruction from Monocular RGB Videos [32.6924827171619]
SLAM3R is a novel and effective monocular RGB SLAM system for real-time and high-quality dense 3D reconstruction.
Unlike traditional pose optimization-based methods, SLAM3R directly regresses 3D pointmaps from RGB images in each window; a sketch of this window-based data flow follows this entry.
Experiments consistently show that SLAM3R achieves state-of-the-art reconstruction accuracy and completeness while maintaining real-time performance at 20+ FPS.
arXiv Detail & Related papers (2024-12-12T16:08:03Z)
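The following is a hypothetical sketch of the window-based data flow described in this entry, not SLAM3R's actual API: a placeholder regressor stands in for the learned network that maps a window of RGB frames directly to per-frame pointmaps, which are accumulated into a global cloud without explicit pose optimization. `DummyRegressor` and `reconstruct` are illustrative names only.
```python
# Hypothetical window-based reconstruction loop; the regressor is a stub.
import numpy as np

class DummyRegressor:
    """Placeholder for the learned network: returns pointmaps of shape (window, H, W, 3)."""
    def __call__(self, frames: np.ndarray) -> np.ndarray:
        n, h, w, _ = frames.shape
        return np.random.default_rng(0).normal(size=(n, h, w, 3))

def reconstruct(frames: np.ndarray, window: int = 5, stride: int = 3) -> np.ndarray:
    """frames: (T, H, W, 3) RGB video; returns an (M, 3) accumulated point cloud."""
    regressor = DummyRegressor()
    points = []
    for start in range(0, len(frames) - window + 1, stride):
        pointmaps = regressor(frames[start:start + window])  # regress 3D points per window
        points.append(pointmaps.reshape(-1, 3))              # accumulate, no pose optimization
    return np.concatenate(points, axis=0)

cloud = reconstruct(np.zeros((20, 32, 48, 3)))
print(cloud.shape)
```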
- DynOMo: Online Point Tracking by Dynamic Online Monocular Gaussian Reconstruction [65.46359561104867]
We target the challenge of online 2D and 3D point tracking from unposed monocular camera input.
We leverage 3D Gaussian splatting to reconstruct dynamic scenes in an online fashion.
We aim to inspire the community to advance online point tracking and reconstruction, expanding the applicability to diverse real-world scenarios.
arXiv Detail & Related papers (2024-09-03T17:58:03Z)
- MM3DGS SLAM: Multi-modal 3D Gaussian Splatting for SLAM Using Vision, Depth, and Inertial Measurements [59.70107451308687]
We show for the first time that using 3D Gaussians for map representation with unposed camera images and inertial measurements can enable accurate SLAM.
Our method, MM3DGS, addresses the limitations of prior rendering by enabling faster scale awareness and improved trajectory tracking.
We also release a multi-modal dataset, UT-MM, collected from a mobile robot equipped with a camera and an inertial measurement unit.
arXiv Detail & Related papers (2024-04-01T04:57:41Z)
- Gaussian Splatting SLAM [16.3858380078553]
We present the first application of 3D Gaussian Splatting in monocular SLAM.
Our method runs live at 3 FPS, unifying the required representation for accurate tracking, mapping, and high-quality rendering.
Several innovations are required to continuously reconstruct 3D scenes with high fidelity from a live camera.
arXiv Detail & Related papers (2023-12-11T18:19:04Z)
- GO-SLAM: Global Optimization for Consistent 3D Instant Reconstruction [45.49960166785063]
GO-SLAM is a deep-learning-based dense visual SLAM framework that globally optimizes poses and 3D reconstruction in real time.
Results on various synthetic and real-world datasets demonstrate that GO-SLAM outperforms state-of-the-art approaches in tracking robustness and reconstruction accuracy.
arXiv Detail & Related papers (2023-09-05T17:59:58Z)
- NICER-SLAM: Neural Implicit Scene Encoding for RGB SLAM [111.83168930989503]
NICER-SLAM is a dense RGB SLAM system that simultaneously optimizes camera poses and a hierarchical neural implicit map representation; a toy joint-optimization sketch follows this entry.
We show strong performance in dense mapping, tracking, and novel view synthesis, even competitive with recent RGB-D SLAM systems.
arXiv Detail & Related papers (2023-02-07T17:06:34Z)
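The toy below is a hypothetical sketch of the joint optimization idea in this entry, not NICER-SLAM's code: per-frame camera translations and a small MLP signed-distance field are updated by gradients of the same loss. Rotations, rendering losses, and the hierarchical encoding are all omitted, and the toy is gauge-ambiguous, so it only demonstrates the mechanism of optimizing poses and map simultaneously.
```python
# Toy joint optimization of camera translations and a neural implicit map;
# hypothetical illustration only, not NICER-SLAM's implementation.
import torch

torch.manual_seed(0)
num_frames, pts_per_frame = 4, 256

t_true = torch.randn(num_frames, 3) * 0.2               # ground-truth camera translations
surf = torch.randn(num_frames, pts_per_frame, 3)
surf = surf / surf.norm(dim=-1, keepdim=True)            # points on the unit sphere (world frame)
obs = surf - t_true[:, None, :]                           # the same points in each camera frame

t_est = torch.zeros(num_frames, 3, requires_grad=True)    # learnable poses (translations only)
sdf = torch.nn.Sequential(                                # learnable implicit map (tiny MLP SDF)
    torch.nn.Linear(3, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 1),
)
opt = torch.optim.Adam([t_est, *sdf.parameters()], lr=1e-2)

for step in range(300):
    opt.zero_grad()
    world = obs + t_est[:, None, :]                       # lift observations with current poses
    surface_loss = sdf(world).abs().mean()                # surface samples should have zero SDF
    origin_loss = (sdf(torch.zeros(1, 3)) + 1.0).abs().mean()  # keep the map from collapsing to sdf == 0
    loss = surface_loss + origin_loss
    loss.backward()                                       # gradients flow to both poses and map
    opt.step()

print("final loss:", loss.item())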
- DSP-SLAM: Object Oriented SLAM with Deep Shape Priors [16.867669408751507]
We propose an object-oriented SLAM system that builds a rich and accurate joint map of dense 3D models for foreground objects.
DSP-SLAM takes as input the 3D point cloud reconstructed by a feature-based SLAM system.
Our evaluation shows improvements in object pose and shape reconstruction with respect to recent deep prior-based reconstruction methods.
arXiv Detail & Related papers (2021-08-21T10:00:12Z)
- Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled Representation [57.11299763566534]
We present a solution to recover 3D pose from multi-view images captured with spatially calibrated cameras.
We exploit 3D geometry to fuse input images into a unified latent representation of pose, which is disentangled from camera view-points.
Our architecture then conditions the learned representation on camera projection operators to produce accurate per-view 2D detections.
arXiv Detail & Related papers (2020-04-05T12:52:29Z)
- Redesigning SLAM for Arbitrary Multi-Camera Systems [51.81798192085111]
Adding more cameras to SLAM systems improves robustness and accuracy but complicates the design of the visual front-end significantly.
In this work, we aim at an adaptive SLAM system that works for arbitrary multi-camera setups.
We adapt a state-of-the-art visual-inertial odometry pipeline with these modifications, and experimental results show that the modified pipeline can adapt to a wide range of camera setups.
arXiv Detail & Related papers (2020-03-04T11:44:42Z)