MoD-SLAM: Monocular Dense Mapping for Unbounded 3D Scene Reconstruction
- URL: http://arxiv.org/abs/2402.03762v5
- Date: Fri, 8 Mar 2024 18:42:34 GMT
- Title: MoD-SLAM: Monocular Dense Mapping for Unbounded 3D Scene Reconstruction
- Authors: Heng Zhou, Zhetao Guo, Shuhong Liu, Lechen Zhang, Qihao Wang, Yuxiang
Ren, Mingrui Li
- Abstract summary: MoD-SLAM is the first monocular NeRF-based dense mapping method that enables real-time 3D reconstruction in unbounded scenes.
By introducing a robust depth loss term into the tracking process, our SLAM system achieves more precise pose estimation in large-scale scenes.
Our experiments on two standard datasets show that MoD-SLAM achieves competitive performance, improving the accuracy of the 3D reconstruction and localization by up to 30% and 15% respectively.
- Score: 2.3630527334737104
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Monocular SLAM has received a lot of attention due to its simple RGB inputs
and freedom from complex sensor constraints. However, existing monocular SLAM
systems are designed for bounded scenes, which restricts their applicability. To
address this limitation, we propose MoD-SLAM, the first monocular NeRF-based
dense mapping method that enables real-time 3D reconstruction in unbounded
scenes. Specifically, we introduce a Gaussian-based unbounded scene
representation approach to solve the challenge of mapping scenes without
boundaries. This strategy is essential for extending the applicability of SLAM.
Moreover, a depth estimation module in the front-end is designed to extract
accurate prior depth values to supervise the mapping and tracking processes. By
introducing a robust depth loss term into the tracking process, our SLAM system
achieves more precise pose estimation in large-scale scenes. Our experiments on
two standard datasets show that MoD-SLAM achieves competitive performance,
improving the accuracy of the 3D reconstruction and localization by up to 30%
and 15% respectively compared with existing state-of-the-art monocular SLAM
systems.
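The abstract does not give the exact form of the robust depth loss used during tracking. A minimal sketch, assuming a Huber-weighted residual between the depth rendered from the scene representation and the prior depth produced by the front-end estimator (the function names, the Huber robustifier, and the weight lambda_d are illustrative assumptions rather than the paper's actual formulation):

```python
import torch

def robust_depth_loss(rendered_depth, prior_depth, valid_mask, delta=0.1):
    # Huber-weighted depth residual: quadratic near zero, linear for large
    # errors, so outliers in the prior depth do not dominate pose updates.
    # This is one plausible form of a "robust depth loss", not MoD-SLAM's own.
    residual = (rendered_depth - prior_depth)[valid_mask]
    abs_res = residual.abs()
    quad = torch.clamp(abs_res, max=delta)
    lin = abs_res - quad
    return (0.5 * quad ** 2 + delta * lin).mean()

def tracking_loss(rendered_rgb, target_rgb,
                  rendered_depth, prior_depth, valid_mask, lambda_d=0.5):
    # Combined photometric + depth objective evaluated on the current frame;
    # the weighting lambda_d is a hypothetical choice for illustration.
    photometric = (rendered_rgb - target_rgb).abs().mean()
    return photometric + lambda_d * robust_depth_loss(
        rendered_depth, prior_depth, valid_mask)
```

In a typical NeRF-based SLAM tracking step, an objective of this kind would be minimized with respect to the current camera pose while the map parameters are held fixed.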
Related papers
- IG-SLAM: Instant Gaussian SLAM [6.228980850646457]
3D Gaussian Splatting has recently shown promising results as an alternative scene representation in SLAM systems.
We present IG-SLAM, a dense RGB-only SLAM system that employs robust Dense-SLAM methods for tracking and combines them with Gaussian Splatting.
We demonstrate competitive performance with state-of-the-art RGB-only SLAM systems while achieving faster operation speeds.
arXiv Detail & Related papers (2024-08-02T09:07:31Z)
- MM3DGS SLAM: Multi-modal 3D Gaussian Splatting for SLAM Using Vision, Depth, and Inertial Measurements [59.70107451308687]
We show for the first time that using 3D Gaussians for map representation with unposed camera images and inertial measurements can enable accurate SLAM.
Our method, MM3DGS, addresses the limitations of prior radiance field-based representations by enabling faster rendering, scale awareness, and improved trajectory tracking.
We also release a multi-modal dataset, UT-MM, collected from a mobile robot equipped with a camera and an inertial measurement unit.
arXiv Detail & Related papers (2024-04-01T04:57:41Z)
- Q-SLAM: Quadric Representations for Monocular SLAM [89.05457684629621]
Monocular SLAM has long grappled with the challenge of accurately modeling 3D geometries.
Recent advances in Neural Radiance Fields (NeRF)-based monocular SLAM have shown promise.
We propose a novel approach that reimagines volumetric representations through the lens of quadric forms.
arXiv Detail & Related papers (2024-03-12T23:27:30Z)
- Gaussian Splatting SLAM [16.3858380078553]
We present the first application of 3D Gaussian Splatting in monocular SLAM.
Our method runs live at 3fps, unifying the required representation for accurate tracking, mapping, and high-quality rendering.
Several innovations are required to continuously reconstruct 3D scenes with high fidelity from a live camera.
arXiv Detail & Related papers (2023-12-11T18:19:04Z)
- SplaTAM: Splat, Track & Map 3D Gaussians for Dense RGB-D SLAM [48.190398577764284]
SplaTAM is an approach to enable high-fidelity reconstruction from a single unposed RGB-D camera.
It employs a simple online tracking and mapping system tailored to the underlying Gaussian representation.
Experiments show that SplaTAM achieves up to 2x superior performance in camera pose estimation, map construction, and novel-view synthesis over existing methods.
arXiv Detail & Related papers (2023-12-04T18:53:24Z)
- NICER-SLAM: Neural Implicit Scene Encoding for RGB SLAM [111.83168930989503]
NICER-SLAM is a dense RGB SLAM system that simultaneously optimizes for camera poses and a hierarchical neural implicit map representation.
We show strong performance in dense mapping, tracking, and novel view synthesis, even competitive with recent RGB-D SLAM systems.
arXiv Detail & Related papers (2023-02-07T17:06:34Z)
- ESLAM: Efficient Dense SLAM System Based on Hybrid Representation of Signed Distance Fields [2.0625936401496237]
ESLAM reads RGB-D frames with unknown camera poses in a sequential manner and incrementally reconstructs the scene representation.
ESLAM improves the accuracy of 3D reconstruction and camera localization over state-of-the-art dense visual SLAM methods by more than 50%.
arXiv Detail & Related papers (2022-11-21T18:25:14Z)
- NICE-SLAM: Neural Implicit Scalable Encoding for SLAM [112.6093688226293]
NICE-SLAM is a dense SLAM system that incorporates multi-level local information by introducing a hierarchical scene representation.
Compared to recent neural implicit SLAM systems, our approach is more scalable, efficient, and robust.
arXiv Detail & Related papers (2021-12-22T18:45:44Z)
- Improved Real-Time Monocular SLAM Using Semantic Segmentation on Selective Frames [15.455647477995312]
Monocular simultaneous localization and mapping (SLAM) is emerging in advanced driver assistance systems and autonomous driving.
This paper proposes an improved real-time monocular SLAM using deep learning-based semantic segmentation.
Experiments with six video sequences demonstrate that the proposed monocular SLAM system achieves significantly more accurate trajectory tracking.
arXiv Detail & Related papers (2021-04-30T22:34:45Z)
- Pseudo RGB-D for Self-Improving Monocular SLAM and Depth Prediction [72.30870535815258]
Classical monocular SLAM and CNNs for monocular depth prediction represent two largely disjoint approaches towards building a 3D map of the surrounding environment.
We propose a joint narrow and wide baseline based self-improving framework, where on the one hand the CNN-predicted depth is leveraged to perform pseudo RGB-D feature-based SLAM.
On the other hand, the bundle-adjusted 3D scene structures and camera poses from the more principled geometric SLAM are injected back into the depth network through novel wide baseline losses.
arXiv Detail & Related papers (2020-04-22T16:31:59Z)
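The entry above does not specify the form of the wide baseline losses. As a rough sketch of how bundle-adjusted poses and CNN-predicted depth can be tied together photometrically (the warping-based L1 loss, the function names, and the pose convention are illustrative assumptions, not the paper's actual formulation):

```python
import torch
import torch.nn.functional as F

def warp_to_reference(src_img, ref_depth, K, T_src_ref):
    # Inverse-warp the source image into the reference view using the
    # reference depth map, camera intrinsics K [3,3], and the relative pose
    # T_src_ref [4,4] mapping reference-camera points to the source camera.
    # Shapes: src_img [1,3,H,W], ref_depth [1,1,H,W].
    _, _, H, W = src_img.shape
    ys, xs = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                            torch.arange(W, dtype=torch.float32),
                            indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).reshape(3, -1)
    cam = torch.linalg.inv(K) @ pix * ref_depth.reshape(1, -1)  # back-project
    cam_h = torch.cat([cam, torch.ones(1, H * W)], dim=0)       # homogeneous
    src = (T_src_ref @ cam_h)[:3]                               # to source cam
    proj = K @ src
    uv = proj[:2] / proj[2].clamp(min=1e-6)                     # project
    grid = torch.stack([2.0 * uv[0] / (W - 1) - 1.0,            # normalize for
                        2.0 * uv[1] / (H - 1) - 1.0],           # grid_sample
                       dim=-1).reshape(1, H, W, 2)
    return F.grid_sample(src_img, grid, align_corners=True)

def wide_baseline_photometric_loss(ref_img, src_img, ref_depth, K, T_src_ref):
    # Photometric consistency between the reference frame and a wide-baseline
    # source frame warped with bundle-adjusted poses and the CNN-predicted
    # reference depth; the gradient w.r.t. ref_depth supervises the depth net.
    warped = warp_to_reference(src_img, ref_depth, K, T_src_ref)
    return (ref_img - warped).abs().mean()
```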
This list is automatically generated from the titles and abstracts of the papers on this site.