DeepFactors: Real-Time Probabilistic Dense Monocular SLAM
- URL: http://arxiv.org/abs/2001.05049v1
- Date: Tue, 14 Jan 2020 21:08:51 GMT
- Title: DeepFactors: Real-Time Probabilistic Dense Monocular SLAM
- Authors: Jan Czarnowski, Tristan Laidlow, Ronald Clark and Andrew J. Davison
- Abstract summary: We present a SLAM system that unifies methods in a probabilistic framework while still maintaining real-time performance.
This is achieved through the use of a learned compact depth map representation and the reformulation of three different types of errors: photometric, reprojection and geometric.
We evaluate our system on trajectory estimation and depth reconstruction on real-world sequences and present various examples of estimated dense geometry.
- Score: 29.033778410908877
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ability to estimate rich geometry and camera motion from monocular
imagery is fundamental to future interactive robotics and augmented reality
applications. Different approaches have been proposed that vary in scene
geometry representation (sparse landmarks, dense maps), the consistency metric
used for optimising the multi-view problem, and the use of learned priors. We
present a SLAM system that unifies these methods in a probabilistic framework
while still maintaining real-time performance. This is achieved through the use
of a learned compact depth map representation and the reformulation of three different
types of errors: photometric, reprojection and geometric, which we make use of
within standard factor graph software. We evaluate our system on trajectory
estimation and depth reconstruction on real-world sequences and present various
examples of estimated dense geometry.
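To make the three error types concrete, below is a minimal sketch in NumPy. It is not the authors' implementation: the linearized depth decoder D(c) = D0 + J c, the pinhole camera model, the nearest-neighbour sampling, and all variable names are simplifying assumptions made purely for illustration.

```python
# Minimal sketch (not the DeepFactors code) of the three error types above,
# assuming a linearized depth decoder D(c) = D0 + J @ c over a compact code c.
# Names, shapes, and the nearest-neighbour sampling are illustrative only.
import numpy as np

def decode_depth(D0, J, code):
    """Depth map from a compact latent code via an assumed linearized decoder."""
    return D0 + (J @ code).reshape(D0.shape)

def backproject(K, uv, depth):
    """Lift pixels uv (N,2) with depths (N,) to 3D points (N,3)."""
    ones = np.ones((uv.shape[0], 1))
    rays = (np.linalg.inv(K) @ np.hstack([uv, ones]).T).T
    return rays * depth[:, None]

def project(K, X):
    """Pinhole projection of 3D points X (N,3) to pixel coordinates (N,2)."""
    uvw = (K @ X.T).T
    return uvw[:, :2] / uvw[:, 2:3]

def sample(img, uv):
    """Nearest-neighbour lookup with clamping to the image bounds."""
    h, w = img.shape[:2]
    u = np.clip(np.round(uv[:, 0]).astype(int), 0, w - 1)
    v = np.clip(np.round(uv[:, 1]).astype(int), 0, h - 1)
    return img[v, u]

def warp(K, R, t, uv, depth_map):
    """Transform pixels of frame a, with depths from depth_map, into frame b."""
    return (R @ backproject(K, uv, sample(depth_map, uv)).T).T + t

def photometric_error(I_a, I_b, uv, depth_a, K, R, t):
    """Intensity difference after warping pixels from frame a into frame b."""
    X_b = warp(K, R, t, uv, depth_a)
    return sample(I_a, uv) - sample(I_b, project(K, X_b))

def reprojection_error(uv_b_obs, uv_a, depth_a, K, R, t):
    """Matched keypoints in frame b vs. projections of their frame-a points."""
    X_b = warp(K, R, t, uv_a, depth_a)
    return (uv_b_obs - project(K, X_b)).ravel()

def geometric_error(depth_a, depth_b, uv, K, R, t):
    """Frame a's depth, warped into frame b, compared with frame b's depth."""
    X_b = warp(K, R, t, uv, depth_a)
    return X_b[:, 2] - sample(depth_b, project(K, X_b))
```

In the full system, each of these residuals would enter standard factor graph software as a factor connecting keyframe poses and depth codes; they are written as plain functions here only to make the error definitions explicit.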
Related papers
- MGS-SLAM: Monocular Sparse Tracking and Gaussian Mapping with Depth Smooth Regularization [29.713650915551632]
This letter introduces a novel framework for dense Visual Simultaneous Localization and Mapping based on Gaussian Splatting.
We jointly optimize sparse visual odometry tracking and a 3D Gaussian Splatting scene representation for the first time.
The accuracy of our pose estimation surpasses that of existing state-of-the-art methods.
arXiv Detail & Related papers (2024-05-10T04:42:21Z)
- MM3DGS SLAM: Multi-modal 3D Gaussian Splatting for SLAM Using Vision, Depth, and Inertial Measurements [59.70107451308687]
We show for the first time that using 3D Gaussians for map representation with unposed camera images and inertial measurements can enable accurate SLAM.
Our method, MM3DGS, addresses the limitations of prior rendering-based approaches by enabling faster scale awareness and improved trajectory tracking.
We also release a multi-modal dataset, UT-MM, collected from a mobile robot equipped with a camera and an inertial measurement unit.
arXiv Detail & Related papers (2024-04-01T04:57:41Z)
- FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models [67.96827539201071]
We propose a novel test-time optimization approach for 3D scene reconstruction.
Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
arXiv Detail & Related papers (2023-08-10T17:55:02Z)
- Towards Scalable Multi-View Reconstruction of Geometry and Materials [27.660389147094715]
We propose a novel method for joint recovery of camera pose, object geometry and spatially-varying Bidirectional Reflectance Distribution Function (svBRDF) of 3D scenes.
The inputs are high-resolution RGBD images captured by a mobile, hand-held capture system with point lights for active illumination.
arXiv Detail & Related papers (2023-06-06T15:07:39Z)
- Multi-View Reconstruction using Signed Ray Distance Functions (SRDF) [22.75986869918975]
We investigate a new computational approach that builds on a novel volumetric shape representation.
The shape energy associated with this representation evaluates 3D geometry given color images and does not need appearance prediction.
In practice we propose an implicit shape representation, the SRDF, based on signed distances, which we parameterize by depths along camera rays.
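As a rough illustration of this parameterization (our notation, not necessarily the paper's): for a camera ray whose surface lies at depth d, the signed ray distance of a sample taken at depth z along the same ray can be written as

```latex
% Hypothetical notation: d is the surface depth on a camera ray,
% z the depth of a sample point along the same ray.
\mathrm{SRDF}_{d}(z) = d - z
% positive in free space in front of the surface, zero on it, negative behind it
```

The volumetric shape energy then scores candidate depths d directly against the color images, with no intermediate appearance prediction.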
arXiv Detail & Related papers (2022-08-31T19:32:17Z)
- RelPose: Predicting Probabilistic Relative Rotation for Single Objects in the Wild [73.1276968007689]
We describe a data-driven method for inferring the camera viewpoints given multiple images of an arbitrary object.
We show that our approach outperforms state-of-the-art SfM and SLAM methods given sparse images on both seen and unseen categories.
arXiv Detail & Related papers (2022-08-11T17:59:59Z)
- Combining Local and Global Pose Estimation for Precise Tracking of Similar Objects [2.861848675707602]
We present a multi-object 6D detection and tracking pipeline for potentially similar and non-textured objects.
A new network architecture, trained solely with synthetic images, allows simultaneous pose estimation of multiple objects.
We show how the system can be used in a real AR assistance application within the field of construction.
arXiv Detail & Related papers (2022-01-31T14:36:57Z)
- Visual SLAM with Graph-Cut Optimized Multi-Plane Reconstruction [11.215334675788952]
This paper presents a semantic planar SLAM system that improves pose estimation and mapping using cues from an instance planar segmentation network.
While mainstream approaches use RGB-D sensors, employing a monocular camera with such a system still faces challenges such as robust data association and precise geometric model fitting.
arXiv Detail & Related papers (2021-08-09T18:16:08Z)
- Probabilistic and Geometric Depth: Detecting Objects in Perspective [78.00922683083776]
3D object detection is an important capability needed in various practical applications such as driver assistance systems.
Monocular 3D detection, as an economical solution compared to conventional settings relying on binocular vision or LiDAR, has drawn increasing attention recently but still yields unsatisfactory results.
This paper first presents a systematic study on this problem and observes that the current monocular 3D detection problem can be simplified as an instance depth estimation problem.
arXiv Detail & Related papers (2021-07-29T16:30:33Z)
- Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency [114.02182755620784]
We present an end-to-end joint training framework that explicitly models 6-DoF motion of multiple dynamic objects, ego-motion and depth in a monocular camera setup without supervision.
Our framework is shown to outperform the state-of-the-art depth and motion estimation methods.
arXiv Detail & Related papers (2021-02-04T14:26:42Z)
- Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks [87.50632573601283]
We present a novel method for multi-view depth estimation from a single video.
Our method achieves temporally coherent depth estimation results by using a novel Epipolar Spatio-Temporal (EST) transformer.
To reduce the computational cost, we design a compact hybrid network inspired by recent Mixture-of-Experts models.
arXiv Detail & Related papers (2020-11-26T04:04:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.