Q-SLAM: Quadric Representations for Monocular SLAM
- URL: http://arxiv.org/abs/2403.08125v1
- Date: Tue, 12 Mar 2024 23:27:30 GMT
- Title: Q-SLAM: Quadric Representations for Monocular SLAM
- Authors: Chensheng Peng, Chenfeng Xu, Yue Wang, Mingyu Ding, Heng Yang,
Masayoshi Tomizuka, Kurt Keutzer, Marco Pavone, Wei Zhan
- Abstract summary: Monocular SLAM has long grappled with the challenge of accurately modeling 3D geometries.
Recent advances in Neural Radiance Fields (NeRF)-based monocular SLAM have shown promise.
We propose a novel approach that reimagines volumetric representations through the lens of quadric forms.
- Score: 89.05457684629621
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Monocular SLAM has long grappled with the challenge of accurately modeling 3D
geometries. Recent advances in Neural Radiance Fields (NeRF)-based monocular
SLAM have shown promise, yet these methods typically focus on novel view
synthesis rather than precise 3D geometry modeling. This focus results in a
significant disconnect between NeRF applications, i.e., novel-view synthesis,
and the requirements of SLAM. We identify that this gap results from the
volumetric representations used in NeRF, which are often dense and noisy. In
this study, we propose a novel approach that reimagines volumetric
representations through the lens of quadric forms. We posit that most scene
components can be effectively represented as quadric planes. Leveraging this
assumption, we reshape the volumetric representation, replacing millions of
cubes with several quadric planes, which leads to more accurate and efficient modeling of
3D scenes in SLAM contexts. Our method involves two key steps: First, we use
the quadric assumption to enhance coarse depth estimations obtained from
tracking modules, e.g., Droid-SLAM. This step alone significantly improves
depth estimation accuracy. Second, in the subsequent mapping phase, we diverge
from previous NeRF-based SLAM methods that distribute sampling points across
the entire volume space. Instead, we concentrate sampling points around quadric
planes and aggregate them using a novel quadric-decomposed Transformer.
Additionally, we introduce an end-to-end joint optimization strategy that
synchronizes pose estimation with 3D reconstruction.
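To make the two steps concrete, here is a minimal sketch of the quadric idea, assuming a plain least-squares fit: a quadric surface z = a*x^2 + b*y^2 + c*x*y + d*x + e*y + f is fitted to a patch of coarse depth samples, refined depths are read off the fitted surface, and ray samples are then concentrated in a band around that surface. All function names here are hypothetical, and the patch selection, robust weighting, and quadric-decomposed Transformer of the actual method are omitted.

```python
import numpy as np

def fit_quadric_patch(xs, ys, zs):
    """Least-squares fit of a quadric surface
    z = a*x^2 + b*y^2 + c*x*y + d*x + e*y + f
    to scattered coarse-depth samples of one patch."""
    A = np.stack([xs**2, ys**2, xs * ys, xs, ys, np.ones_like(xs)], axis=1)
    coeffs, *_ = np.linalg.lstsq(A, zs, rcond=None)
    return coeffs

def refine_depth(xs, ys, zs_coarse):
    """Snap noisy coarse depths onto the fitted quadric surface."""
    c = fit_quadric_patch(xs, ys, zs_coarse)
    A = np.stack([xs**2, ys**2, xs * ys, xs, ys, np.ones_like(xs)], axis=1)
    return A @ c

def sample_near_surface(depth, n=8, sigma=0.05, rng=None):
    """Concentrate ray samples in a thin band around the surface
    instead of spreading them over the whole volume."""
    rng = rng or np.random.default_rng()
    return np.sort(depth + sigma * rng.standard_normal(n))

# Toy check on a gently curved, near-planar surface.
rng = np.random.default_rng(0)
xs = rng.uniform(-1, 1, 200)
ys = rng.uniform(-1, 1, 200)
z_true = 0.1 * xs**2 + 0.05 * ys**2 + 2.0      # ground-truth quadric
zs_coarse = z_true + rng.normal(0, 0.05, 200)  # noisy tracker depth
zs_refined = refine_depth(xs, ys, zs_coarse)
print("coarse RMSE :", np.sqrt(np.mean((zs_coarse - z_true) ** 2)))
print("refined RMSE:", np.sqrt(np.mean((zs_refined - z_true) ** 2)))
print("samples     :", sample_near_surface(zs_refined[0]))
```

On near-planar patches, the six-parameter fit averages out per-pixel noise, which is the intuition behind the claim that the first step alone improves depth estimation accuracy.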
Related papers
- PF3plat: Pose-Free Feed-Forward 3D Gaussian Splatting [54.7468067660037]
PF3plat sets a new state-of-the-art across all benchmarks, supported by comprehensive ablation studies validating our design choices.
Our framework capitalizes on the speed, scalability, and high-quality 3D reconstruction and view synthesis capabilities of 3DGS.
arXiv Detail & Related papers (2024-10-29T15:28:15Z)
- Compact 3D Gaussian Splatting For Dense Visual SLAM [32.37035997240123]
We propose a compact 3D Gaussian Splatting SLAM system that reduces the number and the parameter size of Gaussian ellipsoids.
A sliding window-based masking strategy is first proposed to reduce the redundant ellipsoids.
Our method achieves faster training and rendering speed while maintaining the state-of-the-art (SOTA) quality of the scene representation.
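The sliding-window masking is not detailed in the abstract; the following is a hypothetical reading in which a Gaussian is kept only if its opacity was non-negligible somewhere in a window of recent frames (the function name, threshold, and shapes are illustrative):

```python
import numpy as np

def prune_redundant(opacities_window, threshold=0.005):
    """Hypothetical sliding-window mask: keep a Gaussian only if its
    opacity exceeded `threshold` in at least one frame of the window.
    `opacities_window` has shape (window_size, num_gaussians)."""
    return (opacities_window > threshold).any(axis=0)

window = np.array([[0.9, 0.001, 0.3],
                   [0.8, 0.002, 0.0]])
print(prune_redundant(window))  # [ True False  True]
```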
arXiv Detail & Related papers (2024-03-17T15:41:35Z)
- MoD-SLAM: Monocular Dense Mapping for Unbounded 3D Scene Reconstruction [2.3630527334737104]
MoD-SLAM is the first monocular NeRF-based dense mapping method that enables real-time 3D reconstruction in unbounded scenes.
By introducing a robust depth loss term into the tracking process, our SLAM system achieves more precise pose estimation in large-scale scenes.
Our experiments on two standard datasets show that MoD-SLAM achieves competitive performance, improving the accuracy of the 3D reconstruction and localization by up to 30% and 15% respectively.
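The abstract does not spell out the robust depth loss term; a Huber-style penalty is one standard choice and serves as a hedged illustration of how outlier depths are down-weighted during tracking:

```python
import numpy as np

def robust_depth_loss(d_pred, d_obs, delta=0.1):
    """Huber-style robust depth loss: quadratic for small residuals,
    linear for large ones, so outlier depths don't dominate tracking.
    (Illustrative; not the paper's exact loss term.)"""
    r = np.abs(d_pred - d_obs)
    quad = 0.5 * r**2
    lin = delta * (r - 0.5 * delta)
    return np.where(r <= delta, quad, lin).mean()

print(robust_depth_loss(np.array([1.0, 2.0, 9.0]),
                        np.array([1.05, 2.0, 2.0])))
```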
arXiv Detail & Related papers (2024-02-06T07:07:33Z)
- Gaussian Splatting SLAM [16.3858380078553]
We present the first application of 3D Gaussian Splatting in monocular SLAM.
Our method runs live at 3fps, unifying the required representation for accurate tracking, mapping, and high-quality rendering.
Several innovations are required to continuously reconstruct 3D scenes with high fidelity from a live camera.
arXiv Detail & Related papers (2023-12-11T18:19:04Z)
- GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting [51.96353586773191]
We introduce GS-SLAM, the first method to utilize a 3D Gaussian representation in a Simultaneous Localization and Mapping system.
Our method utilizes a real-time differentiable splatting rendering pipeline that offers significant speedup to map optimization and RGB-D rendering.
Our method achieves competitive performance compared with existing state-of-the-art real-time methods on the Replica, TUM-RGBD datasets.
arXiv Detail & Related papers (2023-11-20T12:08:23Z)
- Learning Neural Radiance Fields from Multi-View Geometry [1.1011268090482573]
We present a framework, called MVG-NeRF, that combines Multi-View Geometry algorithms and Neural Radiance Fields (NeRF) for image-based 3D reconstruction.
NeRF has revolutionized the field of implicit 3D representations, mainly due to a differentiable rendering formulation that enables high-quality and geometry-aware novel view synthesis.
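The differentiable rendering formulation mentioned above can be illustrated with a minimal NeRF-style compositing step along a single ray; this is the generic volume rendering equation, not MVG-NeRF's specific pipeline:

```python
import numpy as np

def volume_render(sigmas, colors, deltas):
    """Differentiable volume rendering along one ray (NeRF-style):
    alpha-composite per-sample colors using transmittance weights."""
    alphas = 1.0 - np.exp(-sigmas * deltas)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = trans * alphas
    return weights @ colors, weights

sigmas = np.array([0.1, 5.0, 50.0, 0.2])          # per-sample densities
colors = np.array([[0.0, 0.0, 0.0],
                   [0.2, 0.4, 0.6],
                   [0.9, 0.1, 0.1],
                   [1.0, 1.0, 1.0]])
deltas = np.full(4, 0.05)                          # sample spacing
rgb, w = volume_render(sigmas, colors, deltas)
print(rgb)  # color dominated by the high-density samples
```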
arXiv Detail & Related papers (2022-10-24T08:53:35Z)
- Learned Vertex Descent: A New Direction for 3D Human Model Fitting [64.04726230507258]
We propose a novel optimization-based paradigm for 3D human model fitting on images and scans.
Our approach captures the underlying body of clothed people with very different body shapes, achieving a significant improvement over the state of the art.
LVD is also applicable to 3D model fitting of humans and hands, where it improves significantly on the SOTA with a much simpler and faster method.
arXiv Detail & Related papers (2022-05-12T17:55:51Z)
- A Model for Multi-View Residual Covariances based on Perspective Deformation [88.21738020902411]
We derive a model for the covariance of the visual residuals in multi-view SfM, odometry and SLAM setups.
We validate our model with synthetic and real data and integrate it into photometric and feature-based Bundle Adjustment.
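As a hedged sketch of how such a covariance model plugs into Bundle Adjustment: residuals are whitened with the Cholesky factor of their covariance, so the least-squares objective becomes a Mahalanobis norm (the covariance values below are illustrative, not the paper's model):

```python
import numpy as np

def whiten_residuals(r, Sigma):
    """Weight a residual vector by its covariance: with Sigma = L L^T,
    minimizing ||L^{-1} r||^2 is the Mahalanobis-weighted least squares
    used in covariance-aware Bundle Adjustment."""
    L = np.linalg.cholesky(Sigma)
    return np.linalg.solve(L, r)

Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.09]])
r = np.array([0.2, -0.3])
rw = whiten_residuals(r, Sigma)
print(rw, rw @ rw)  # whitened residual and its Mahalanobis norm
```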
arXiv Detail & Related papers (2022-02-01T21:21:56Z)
- Volume Rendering of Neural Implicit Surfaces [57.802056954935495]
This paper aims to improve geometry representation and reconstruction in neural volume rendering.
We achieve that by modeling the volume density as a function of the geometry.
Applying this new density representation to challenging scene multiview datasets produced high quality geometry reconstructions.
arXiv Detail & Related papers (2021-06-22T20:23:16Z)
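A minimal sketch of "density as a function of the geometry", assuming the VolSDF-style construction in which a signed distance is passed through the CDF of a Laplace distribution (parameter values are illustrative):

```python
import numpy as np

def sdf_to_density(d, alpha=100.0, beta=0.01):
    """Volume density as a function of geometry: map a signed distance d
    (negative inside the surface) through a Laplace CDF,
    sigma = alpha * Psi_beta(-d), so density is high inside and
    decays smoothly across the surface."""
    s = -d
    psi = np.where(s <= 0,
                   0.5 * np.exp(s / beta),
                   1.0 - 0.5 * np.exp(-s / beta))
    return alpha * psi

print(sdf_to_density(np.array([-0.05, 0.0, 0.05])))
# ~[99.7, 50.0, 0.34]: inside, on, and outside the surface
```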
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.