Map-merging Algorithms for Visual SLAM: Feasibility Study and Empirical
Evaluation
- URL: http://arxiv.org/abs/2009.05819v1
- Date: Sat, 12 Sep 2020 16:15:16 GMT
- Title: Map-merging Algorithms for Visual SLAM: Feasibility Study and Empirical
Evaluation
- Authors: Andrey Bokovoy, Kirill Muraviev and Konstantin Yakovlev
- Abstract summary: State-of-the-art vSLAM algorithms are capable of constructing accurate-enough maps that enable a mobile robot to autonomously navigate an unknown environment.
We study the related problem of map merging: whether different vSLAM maps can be merged into a single consistent representation.
We examine existing 2D and 3D map-merging algorithms and conduct an extensive empirical evaluation in a realistic simulated environment.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Simultaneous localization and mapping, especially the one relying solely on
video data (vSLAM), is a challenging problem that has been extensively studied
in robotics and computer vision. State-of-the-art vSLAM algorithms are capable
of constructing accurate-enough maps that enable a mobile robot to autonomously
navigate an unknown environment. In this work, we are interested in an
important problem related to vSLAM, namely map merging, which arises in various
practically important scenarios, e.g. multi-robot coverage. This problem asks
whether different vSLAM maps can be merged into a single consistent
representation. We examine the existing 2D and 3D map-merging algorithms and
conduct an extensive empirical evaluation in a realistic simulated environment
(Habitat). Both qualitative and quantitative comparisons are carried out, and
the obtained results are reported and analyzed.
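As a rough illustration of what a 2D map-merging algorithm of the kind surveyed here does, below is a minimal sketch of one common feature-based approach: ORB features are matched between two occupancy grids, a similarity transform is estimated with RANSAC, and one map is warped into the other's frame and composited. This is not the authors' method or any specific algorithm from the evaluation; the occupancy-value convention (0 = occupied, 254 = free, 205 = unknown), the function name, and the thresholds are assumptions made for illustration.

```python
import cv2
import numpy as np


def merge_occupancy_grids(map_a, map_b, unknown=205, min_matches=10):
    """Illustrative feature-based merge of two 2D occupancy grids.

    Both maps are uint8 images with 0 = occupied, 254 = free,
    205 = unknown (assumed convention). Returns the merged grid in
    map_a's frame, or None if the maps could not be aligned.
    """
    # Detect and describe salient structure (walls, corners) in both grids.
    orb = cv2.ORB_create(nfeatures=2000)
    kp_a, des_a = orb.detectAndCompute(map_a, None)
    kp_b, des_b = orb.detectAndCompute(map_b, None)
    if des_a is None or des_b is None:
        return None

    # Match descriptors; too few matches means too little overlap to merge.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des_b, des_a)
    if len(matches) < min_matches:
        return None

    src = np.float32([kp_b[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_a[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

    # Estimate a similarity transform (rotation + translation + uniform
    # scale) between the grids, rejecting outlier matches with RANSAC.
    T, _ = cv2.estimateAffinePartial2D(src, dst, method=cv2.RANSAC,
                                       ransacReprojThreshold=3.0)
    if T is None:
        return None

    # Bring map_b into map_a's frame and fill in cells map_a left unknown.
    h, w = map_a.shape
    warped_b = cv2.warpAffine(map_b, T, (w, h), flags=cv2.INTER_NEAREST,
                              borderValue=unknown)
    merged = map_a.copy()
    merged[map_a == unknown] = warped_b[map_a == unknown]
    return merged
```

Returning None when alignment fails mirrors the feasibility question studied in the paper: whether two maps can be merged at all depends on how much consistent structure they share.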
Related papers
- MAGiC-SLAM: Multi-Agent Gaussian Globally Consistent SLAM [23.318966306555915]
Simultaneous localization and mapping (SLAM) systems are widely used in computer vision, with applications in augmented reality, robotics, and autonomous driving.
Recent work has addressed multi-agent SLAM using distributed neural scene representations.
We propose a rigidly deformable 3D Gaussian-based scene representation that dramatically speeds up the system.
We evaluate MAGiC-SLAM on synthetic and real-world datasets and find it more accurate and faster than the state of the art.
arXiv Detail & Related papers (2024-11-25T08:34:01Z)
- Towards Unified 3D Object Detection via Algorithm and Data Unification [70.27631528933482]
We build the first unified multi-modal 3D object detection benchmark MM-Omni3D and extend the aforementioned monocular detector to its multi-modal version.
We name the designed monocular and multi-modal detectors UniMODE and MM-UniMODE, respectively.
arXiv Detail & Related papers (2024-02-28T18:59:31Z)
- Active Visual Localization for Multi-Agent Collaboration: A Data-Driven Approach [47.373245682678515]
This work investigates how active visual localization can be used to overcome challenges of viewpoint changes.
Specifically, we focus on the problem of selecting the optimal viewpoint at a given location.
The result demonstrates the superior performance of the data-driven approach when compared to existing methods.
arXiv Detail & Related papers (2023-10-04T08:18:30Z)
- Neural Implicit Dense Semantic SLAM [83.04331351572277]
We propose a novel RGBD vSLAM algorithm that learns a memory-efficient, dense 3D geometry, and semantic segmentation of an indoor scene in an online manner.
Our pipeline combines classical 3D vision-based tracking and loop closing with neural fields-based mapping.
Our proposed algorithm can greatly enhance scene perception and assist with a range of robot control problems.
arXiv Detail & Related papers (2023-04-27T23:03:52Z)
- Towards Multimodal Multitask Scene Understanding Models for Indoor Mobile Agents [49.904531485843464]
In this paper, we discuss the main challenge: insufficient, or even no, labeled data for real-world indoor environments.
We describe MMISM (Multi-modality input Multi-task output Indoor Scene understanding Model) to tackle the above challenges.
MMISM considers RGB images as well as sparse Lidar points as inputs and 3D object detection, depth completion, human pose estimation, and semantic segmentation as output tasks.
We show that MMISM performs on par or even better than single-task models.
arXiv Detail & Related papers (2022-09-27T04:49:19Z)
- Learning Cross-Scale Visual Representations for Real-Time Image Geo-Localization [21.375640354558044]
State estimation approaches based on local sensors are prone to drift on long-range missions as errors accumulate.
We introduce the cross-scale dataset and a methodology to produce additional data from cross-modality sources.
We propose a framework that learns cross-scale visual representations without supervision.
arXiv Detail & Related papers (2021-09-09T08:08:54Z)
- MAOMaps: A Photo-Realistic Benchmark For vSLAM and Map Merging Quality Assessment [0.0]
We introduce a novel benchmark that is aimed at quantitatively evaluating the quality of vision-based simultaneous localization and mapping (vSLAM) and map merging algorithms.
The dataset is photo-realistic and provides both the localization and the map ground truth data.
To compare the vSLAM-built maps and the ground-truth ones, we introduce a novel way to find correspondences between them that takes the SLAM context into account (a generic comparison baseline is sketched after this list).
arXiv Detail & Related papers (2021-05-31T14:30:36Z)
- Detecting 32 Pedestrian Attributes for Autonomous Vehicles [103.87351701138554]
In this paper, we address the problem of jointly detecting pedestrians and recognizing 32 pedestrian attributes.
We introduce a Multi-Task Learning (MTL) model relying on a composite field framework, which achieves both goals in an efficient way.
We show competitive detection and attribute recognition results, as well as more stable MTL training.
arXiv Detail & Related papers (2020-12-04T15:10:12Z)
- Gravitational Models Explain Shifts on Human Visual Attention [80.76475913429357]
Visual attention refers to the human brain's ability to select relevant sensory information for preferential processing.
Various methods to estimate saliency have been proposed in the last three decades.
We propose a gravitational model (GRAV) to describe the attentional shifts.
arXiv Detail & Related papers (2020-09-15T10:12:41Z)
- Camera-Lidar Integration: Probabilistic sensor fusion for semantic mapping [8.18198392834469]
An automated vehicle must be able to perceive and recognise objects/obstacles in a three-dimensional world while navigating in a constantly changing environment.
We present a probabilistic pipeline that incorporates uncertainties from the sensor readings (cameras, lidar, IMU and wheel encoders), compensation for the motion of the vehicle, and label probabilities for the semantic images.
arXiv Detail & Related papers (2020-07-09T07:59:39Z)
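The SLAM-context-aware correspondence search mentioned in the MAOMaps entry above is not described in this summary, so the following is only a generic nearest-neighbour baseline for comparing a vSLAM-built map against a ground-truth one, assuming both are point clouds already expressed in a common frame; the function name and the inlier threshold are illustrative assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree


def map_accuracy(estimated_pts, gt_pts, inlier_thresh=0.1):
    """Generic map-vs-ground-truth comparison (illustrative baseline only).

    estimated_pts, gt_pts: (N, 3) and (M, 3) arrays in the same frame.
    Returns (RMSE over inlier correspondences, inlier ratio).
    """
    # For each estimated point, find its nearest ground-truth neighbour.
    dists, _ = cKDTree(gt_pts).query(estimated_pts, k=1)

    # Points farther than the threshold are treated as reconstruction errors.
    inliers = dists < inlier_thresh
    if not inliers.any():
        return float("inf"), 0.0
    rmse = float(np.sqrt(np.mean(dists[inliers] ** 2)))
    return rmse, float(inliers.mean())
```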
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.