Improved Real-Time Monocular SLAM Using Semantic Segmentation on
Selective Frames
- URL: http://arxiv.org/abs/2105.00114v1
- Date: Fri, 30 Apr 2021 22:34:45 GMT
- Title: Improved Real-Time Monocular SLAM Using Semantic Segmentation on
Selective Frames
- Authors: Jinkyu Lee, Muhyun Back, Sung Soo Hwang and Il Yong Chun
- Abstract summary: monocular simultaneous localization and mapping (SLAM) is emerging in advanced driver assistance systems and autonomous driving.
This paper proposes an improved real-time monocular SLAM using deep learning-based semantic segmentation.
Experiments with six video sequences demonstrate that the proposed monocular SLAM system achieves significantly more accurate trajectory tracking accuracy.
- Score: 15.455647477995312
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Monocular simultaneous localization and mapping (SLAM) is emerging in
advanced driver assistance systems and autonomous driving, because a single
camera is cheap and easy to install. Conventional monocular SLAM has two major
challenges leading inaccurate localization and mapping. First, it is
challenging to estimate scales in localization and mapping. Second,
conventional monocular SLAM uses inappropriate mapping factors such as dynamic
objects and low-parallax ares in mapping. This paper proposes an improved
real-time monocular SLAM that resolves the aforementioned challenges by
efficiently using deep learning-based semantic segmentation. To achieve the
real-time execution of the proposed method, we apply semantic segmentation only
to downsampled keyframes in parallel with mapping processes. In addition, the
proposed method corrects scales of camera poses and three-dimensional (3D)
points, using estimated ground plane from road-labeled 3D points and the real
camera height. The proposed method also removes inappropriate corner features
labeled as moving objects and low parallax areas. Experiments with six video
sequences demonstrate that the proposed monocular SLAM system achieves
significantly more accurate trajectory tracking accuracy compared to
state-of-the-art monocular SLAM and comparable trajectory tracking accuracy
compared to state-of-the-art stereo SLAM.
Related papers
- MM3DGS SLAM: Multi-modal 3D Gaussian Splatting for SLAM Using Vision, Depth, and Inertial Measurements [59.70107451308687]
We show for the first time that using 3D Gaussians for map representation with unposed camera images and inertial measurements can enable accurate SLAM.
Our method, MM3DGS, addresses the limitations of prior rendering by enabling faster scale awareness, and improved trajectory tracking.
We also release a multi-modal dataset, UT-MM, collected from a mobile robot equipped with a camera and an inertial measurement unit.
arXiv Detail & Related papers (2024-04-01T04:57:41Z) - MoD-SLAM: Monocular Dense Mapping for Unbounded 3D Scene Reconstruction [2.3630527334737104]
MoD-SLAM is the first monocular NeRF-based dense mapping method that allows 3D reconstruction in real-time in unbounded scenes.
By introducing a robust depth loss term into the tracking process, our SLAM system achieves more precise pose estimation in large-scale scenes.
Our experiments on two standard datasets show that MoD-SLAM achieves competitive performance, improving the accuracy of the 3D reconstruction and localization by up to 30% and 15% respectively.
arXiv Detail & Related papers (2024-02-06T07:07:33Z) - Cross-Modal Semi-Dense 6-DoF Tracking of an Event Camera in Challenging
Conditions [29.608665442108727]
Event-based cameras are bio-inspired visual sensors that perform well in HDR conditions and have high temporal resolution.
The present work demonstrates the feasibility of purely event-based tracking if an alternative sensor is permitted for mapping.
The method relies on geometric 3D-2D registration of semi-dense maps and events, and achieves highly reliable and accurate cross-modal tracking results.
arXiv Detail & Related papers (2024-01-16T01:48:45Z) - Gaussian Splatting SLAM [16.3858380078553]
We present the first application of 3D Gaussian Splatting in monocular SLAM.
Our method runs live at 3fps, unifying the required representation for accurate tracking, mapping, and high-quality rendering.
Several innovations are required to continuously reconstruct 3D scenes with high fidelity from a live camera.
arXiv Detail & Related papers (2023-12-11T18:19:04Z) - Comparative Study of Visual SLAM-Based Mobile Robot Localization Using
Fiducial Markers [4.918853205874711]
This paper presents a comparative study of three modes for mobile robot localization based on visual SLAM using fiducial markers.
The reason for comparing the SLAM-based approaches is because previous work has shown their superior performance over feature-only methods.
Hardware experiments show consistent trajectory error levels across the three modes, with the localization mode exhibiting the shortest runtime among them.
arXiv Detail & Related papers (2023-09-08T17:05:24Z) - View Consistent Purification for Accurate Cross-View Localization [59.48131378244399]
This paper proposes a fine-grained self-localization method for outdoor robotics.
The proposed method addresses limitations in existing cross-view localization methods.
It is the first sparse visual-only method that enhances perception in dynamic environments.
arXiv Detail & Related papers (2023-08-16T02:51:52Z) - ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving
Cameras in the Wild [57.37891682117178]
We present a robust dense indirect structure-from-motion method for videos that is based on dense correspondence from pairwise optical flow.
A novel neural network architecture is proposed for processing irregular point trajectory data.
Experiments on MPI Sintel dataset show that our system produces significantly more accurate camera trajectories.
arXiv Detail & Related papers (2022-07-19T09:19:45Z) - TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view
Stereo [55.30992853477754]
We present TANDEM, a real-time monocular tracking and dense framework.
For pose estimation, TANDEM performs photometric bundle adjustment based on a sliding window of alignments.
TANDEM shows state-of-the-art real-time 3D reconstruction performance.
arXiv Detail & Related papers (2021-11-14T19:01:02Z) - Self-Supervised Multi-Frame Monocular Scene Flow [61.588808225321735]
We introduce a multi-frame monocular scene flow network based on self-supervised learning.
We observe state-of-the-art accuracy among monocular scene flow methods based on self-supervised learning.
arXiv Detail & Related papers (2021-05-05T17:49:55Z) - Robust On-Manifold Optimization for Uncooperative Space Relative
Navigation with a Single Camera [4.129225533930966]
An innovative model-based approach is demonstrated to estimate the six-dimensional pose of a target object relative to the chaser spacecraft using solely a monocular setup.
It is validated on realistic synthetic and laboratory datasets of a rendezvous trajectory with the complex spacecraft Envisat.
arXiv Detail & Related papers (2020-05-14T16:23:04Z) - Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled
Representation [57.11299763566534]
We present a solution to recover 3D pose from multi-view images captured with spatially calibrated cameras.
We exploit 3D geometry to fuse input images into a unified latent representation of pose, which is disentangled from camera view-points.
Our architecture then conditions the learned representation on camera projection operators to produce accurate per-view 2d detections.
arXiv Detail & Related papers (2020-04-05T12:52:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.