HDPV-SLAM: Hybrid Depth-augmented Panoramic Visual SLAM for Mobile
Mapping System with Tilted LiDAR and Panoramic Visual Camera
- URL: http://arxiv.org/abs/2301.11823v3
- Date: Thu, 22 Jun 2023 14:49:03 GMT
- Title: HDPV-SLAM: Hybrid Depth-augmented Panoramic Visual SLAM for Mobile
Mapping System with Tilted LiDAR and Panoramic Visual Camera
- Authors: Mostafa Ahmadi, Amin Alizadeh Naeini, Mohammad Moein Sheikholeslami,
Zahra Arjmandi, Yujia Zhang, and Gunho Sohn
- Abstract summary: This paper proposes a novel visual simultaneous localization and mapping (SLAM) system called Hybrid Depth-augmented Panoramic Visual SLAM (HDPV-SLAM).
It employs a panoramic camera and a tilted multi-beam LiDAR scanner to generate accurate and metrically scaled trajectories.
It aims to solve the two major issues hindering the performance of similar SLAM systems.
- Score: 4.2421412410466575
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a novel visual simultaneous localization and mapping
(SLAM) system called Hybrid Depth-augmented Panoramic Visual SLAM (HDPV-SLAM),
which employs a panoramic camera and a tilted multi-beam LiDAR scanner to
generate accurate, metrically scaled trajectories. HDPV-SLAM follows the design
of RGB-D SLAM, augmenting visual features with depth information. It aims to
solve the two major issues that hinder the performance of similar SLAM
systems. The first obstacle is the sparseness of LiDAR depth, which makes it
difficult to associate with the visual features extracted from the RGB image.
To address this issue, we propose a deep learning-based depth estimation module
that iteratively densifies the sparse LiDAR depth. The second issue pertains to
the difficulties in depth association caused by a lack of horizontal overlap
between the panoramic camera and the tilted LiDAR sensor. To surmount this
difficulty, we present a hybrid depth association module that optimally
combines depth information estimated by two independent procedures,
feature-based triangulation and deep learning-based depth estimation. During
feature tracking, this hybrid depth association module selects, per feature,
the more accurate of the two depth estimates: the depth triangulated from
tracked visual features and the deep learning-based corrected depth. We
evaluated the efficacy
of HDPV-SLAM using the 18.95 km-long York University and Teledyne Optech (YUTO)
MMS dataset. The experimental results demonstrate that the two proposed modules
contribute substantially to the performance of HDPV-SLAM, which surpasses that
of the state-of-the-art (SOTA) SLAM systems.
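
To make the hybrid depth association concrete, here is a minimal sketch of one plausible per-feature selection rule. The function name, the uncertainty inputs, and the lower-uncertainty-wins criterion are illustrative assumptions, not the paper's published implementation:

    import numpy as np

    def hybrid_depth_association(d_tri, d_dl, sigma_tri, sigma_dl):
        """Per-feature fusion of triangulated and learned depth (illustrative).

        d_tri, d_dl         : (N,) depths from triangulation / learned estimation,
                              with np.nan where a source has no estimate
        sigma_tri, sigma_dl : (N,) per-feature uncertainties (np.inf if missing)
        """
        d_tri = np.asarray(d_tri, dtype=float)
        d_dl = np.asarray(d_dl, dtype=float)
        # Prefer whichever source reports the lower uncertainty per feature.
        use_tri = np.asarray(sigma_tri) < np.asarray(sigma_dl)
        fused = np.where(use_tri, d_tri, d_dl)
        # Fall back to the other source when the preferred one is missing.
        return np.where(np.isnan(fused), np.where(use_tri, d_dl, d_tri), fused)

In a real front end the uncertainties would plausibly come from the triangulation geometry (e.g., parallax angle) and from a confidence output of the depth network; both are assumptions here.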
Related papers
- Adaptive Stereo Depth Estimation with Multi-Spectral Images Across All Lighting Conditions [58.88917836512819]
We propose a novel framework incorporating stereo depth estimation to enforce accurate geometric constraints.
To mitigate the effects of poor lighting on stereo matching, we introduce Degradation Masking.
Our method achieves state-of-the-art (SOTA) performance on the Multi-Spectral Stereo (MS2) dataset.
arXiv Detail & Related papers (2024-11-06T03:30:46Z)
- DepthSplat: Connecting Gaussian Splatting and Depth [90.06180236292866]
We present DepthSplat to connect Gaussian splatting and depth estimation.
We first contribute a robust multi-view depth model by leveraging pre-trained monocular depth features.
We also show that Gaussian splatting can serve as an unsupervised pre-training objective.
arXiv Detail & Related papers (2024-10-17T17:59:58Z)
- 360ORB-SLAM: A Visual SLAM System for Panoramic Images with Depth Completion Network [18.23570356507258]
This paper proposes a 360ORB-SLAM system for panoramic images that integrates a depth completion network.
The proposed method achieves superior scale accuracy compared to existing monocular SLAM methods.
The integration of the depth completion network enhances system stability and mitigates the impact of dynamic elements on SLAM performance.
arXiv Detail & Related papers (2024-01-19T08:52:24Z)
- Depth Completion with Multiple Balanced Bases and Confidence for Dense Monocular SLAM [34.78726455243436]
We propose a novel method that integrates a light-weight depth completion network into a sparse SLAM system.
Specifically, we present a specifically optimized multi-basis depth completion network, called BBC-Net.
BBC-Net can predict multiple balanced bases and a confidence map from a monocular image with sparse points generated by off-the-shelf keypoint-based SLAM systems.
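The multi-basis idea can be sketched as follows; the assumption (ours, not stated verbatim above) is that the dense depth is a per-frame weighted combination of the predicted bases, with the weights refined by the SLAM back end:

    import numpy as np

    def combine_depth_bases(bases, weights):
        """Blend K predicted depth bases into one dense depth map (illustrative).

        bases   : (K, H, W) depth bases predicted by the completion network
        weights : (K,) per-frame coefficients, e.g. optimized by the SLAM back end
        """
        # Contract the basis axis: sum_k weights[k] * bases[k] -> (H, W) map.
        return np.tensordot(np.asarray(weights), np.asarray(bases), axes=1)

    # Hypothetical usage: three 4x3 bases blended into a (4, 3) depth map.
    depth = combine_depth_bases(np.random.rand(3, 4, 3), np.array([0.5, 0.3, 0.2]))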
arXiv Detail & Related papers (2023-09-08T06:15:27Z)
- Tightly-Coupled LiDAR-Visual SLAM Based on Geometric Features for Mobile Agents [43.137917788594926]
We propose a tightly-coupled LiDAR-visual SLAM based on geometric features.
Line segments detected in their entirety by the visual subsystem overcome the limitations of the LiDAR subsystem.
Our system achieves more accurate and robust pose estimation compared to current state-of-the-art multi-modal methods.
arXiv Detail & Related papers (2023-07-15T10:06:43Z)
- Rethinking Disparity: A Depth Range Free Multi-View Stereo Based on Disparity [17.98608948955211]
Existing learning-based multi-view stereo (MVS) methods rely on the depth range to build the 3D cost volume.
We propose a disparity-based MVS method based on the epipolar disparity flow (E-flow), called DispMVS.
We show that DispMVS is not sensitive to the depth range and achieves state-of-the-art results with lower GPU memory.
arXiv Detail & Related papers (2022-11-30T11:05:02Z)
- Unsupervised Visible-light Images Guided Cross-Spectrum Depth Estimation from Dual-Modality Cameras [33.77748026254935]
Cross-spectrum depth estimation aims to provide a depth map in all illumination conditions with a pair of dual-spectrum images.
In this paper, we propose an unsupervised visible-light image guided cross-spectrum (i.e., thermal and visible-light, TIR-VIS in short) depth estimation framework.
Our method achieves better performance than the compared existing methods.
arXiv Detail & Related papers (2022-04-30T12:58:35Z)
- Joint Learning of Salient Object Detection, Depth Estimation and Contour Extraction [91.43066633305662]
We propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD).
Specifically, we unify three complementary tasks: depth estimation, salient object detection and contour estimation. The multi-task mechanism promotes the model to learn the task-aware features from the auxiliary tasks.
Experiments show that it not only significantly surpasses the depth-based RGB-D SOD methods on multiple datasets, but also precisely predicts a high-quality depth map and salient contour at the same time.
arXiv Detail & Related papers (2022-03-09T17:20:18Z)
- High-resolution Depth Maps Imaging via Attention-based Hierarchical Multi-modal Fusion [84.24973877109181]
We propose a novel attention-based hierarchical multi-modal fusion network for guided depth super-resolution (DSR).
We show that our approach outperforms state-of-the-art methods in terms of reconstruction accuracy, running speed and memory efficiency.
arXiv Detail & Related papers (2021-04-04T03:28:33Z)
- Deep Two-View Structure-from-Motion Revisited [83.93809929963969]
Two-view structure-from-motion (SfM) is the cornerstone of 3D reconstruction and visual SLAM.
We propose to revisit the problem of deep two-view SfM by leveraging the well-posedness of the classic pipeline.
Our method consists of 1) an optical flow estimation network that predicts dense correspondences between two frames; 2) a normalized pose estimation module that computes relative camera poses from the 2D optical flow correspondences; and 3) a scale-invariant depth estimation network that leverages epipolar geometry to reduce the search space, refine the dense correspondences, and estimate relative depth maps.
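The second stage of this pipeline mirrors classical two-view geometry; a minimal OpenCV sketch of recovering pose from flow-derived correspondences (a classical stand-in for the paper's learned module, not its actual code) might look like this:

    import cv2
    import numpy as np

    def pose_from_flow(pts1, pts2, K):
        """Relative pose from 2D correspondences via the essential matrix.

        pts1, pts2 : (N, 2) float arrays of matched pixel coordinates
                     (e.g. sampled from a dense optical flow field)
        K          : (3, 3) camera intrinsic matrix
        """
        # Robustly estimate the essential matrix with RANSAC.
        E, inliers = cv2.findEssentialMat(pts1, pts2, K,
                                          method=cv2.RANSAC, threshold=1.0)
        # Decompose E and pick the pose with positive-depth cheirality.
        _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
        return R, t  # t is unit-norm: two-view translation scale is unobservable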
arXiv Detail & Related papers (2021-04-01T15:31:20Z)
- A Single Stream Network for Robust and Real-time RGB-D Salient Object Detection [89.88222217065858]
We design a single stream network to use the depth map to guide early fusion and middle fusion between RGB and depth.
This model is 55.5% lighter than the current lightest model and runs at a real-time speed of 32 FPS when processing a $384 \times 384$ image.
arXiv Detail & Related papers (2020-07-14T04:40:14Z)