A Review on Visual-SLAM: Advancements from Geometric Modelling to
Learning-based Semantic Scene Understanding
- URL: http://arxiv.org/abs/2209.05222v1
- Date: Mon, 12 Sep 2022 13:11:25 GMT
- Authors: Tin Lai
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Simultaneous Localisation and Mapping (SLAM) is one of the fundamental
problems in autonomous mobile robots where a robot needs to reconstruct a
previously unseen environment while simultaneously localising itself with
respect to the map. In particular, Visual-SLAM uses sensors mounted on the
mobile robot to collect observations from which a representation of the map is
constructed. Traditionally, geometric model-based techniques were used to
tackle the SLAM problem; these tend to be error-prone in challenging
environments. Recent
advancements in computer vision, such as deep learning techniques, have
provided a data-driven approach to tackle the Visual-SLAM problem. This review
summarises recent advancements in the Visual-SLAM domain using various
learning-based methods. We begin by providing a concise overview of the
geometric model-based approaches, followed by technical reviews on the current
paradigms in SLAM. Then, we present the various learning-based approaches to
collecting sensory inputs from mobile robots and performing scene
understanding. The current paradigms in deep-learning-based semantic
understanding are discussed and placed in the context of Visual-SLAM.
Finally, we discuss challenges and further opportunities in the direction of
learning-based approaches in Visual-SLAM.
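The core estimation problem the abstract describes, jointly recovering the robot's trajectory and the map, can be made concrete with a toy example. The sketch below is not from the paper: it poses a hypothetical 1-D SLAM instance (noisy odometry plus noisy range measurements to a single landmark) as a linear least-squares problem and solves for the poses and the landmark position jointly, in the spirit of the geometric model-based formulations the review starts from.

```python
# Minimal 1-D SLAM as linear least squares: jointly estimate robot
# poses x_0..x_3 and one landmark position l from noisy odometry and
# noisy range measurements. Purely illustrative; real Visual-SLAM
# systems solve a nonlinear version of this over camera poses and
# 3D map points.
import numpy as np

true_x = np.array([0.0, 1.0, 2.0, 3.0])   # ground-truth poses
true_l = 5.0                               # ground-truth landmark
rng = np.random.default_rng(0)

odom = np.diff(true_x) + rng.normal(0, 0.05, 3)      # x_{t+1} - x_t + noise
ranges = (true_l - true_x) + rng.normal(0, 0.05, 4)  # l - x_t + noise

# State vector: [x0, x1, x2, x3, l]. Each measurement is one linear row.
rows, rhs = [], []
# Prior anchoring the first pose at 0 (removes the gauge freedom).
r = np.zeros(5); r[0] = 1.0
rows.append(r); rhs.append(0.0)
# Odometry constraints: x_{t+1} - x_t = odom[t]
for t in range(3):
    r = np.zeros(5); r[t] = -1.0; r[t + 1] = 1.0
    rows.append(r); rhs.append(odom[t])
# Range constraints: l - x_t = ranges[t]
for t in range(4):
    r = np.zeros(5); r[t] = -1.0; r[4] = 1.0
    rows.append(r); rhs.append(ranges[t])

A, b = np.array(rows), np.array(rhs)
est, *_ = np.linalg.lstsq(A, b, rcond=None)
print("estimated poses:", est[:4], "estimated landmark:", est[4])
```

Because both poses and the landmark appear as unknowns in one system, the solver "localises" and "maps" simultaneously; the learning-based methods surveyed in the review replace or augment parts of this hand-modelled pipeline with data-driven components.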
Related papers
- Recent Trends in 3D Reconstruction of General Non-Rigid Scenes (arXiv, 2024-03-22)
  Reconstructing models of the real world, including the 3D geometry, appearance, and motion of real scenes, is essential for computer graphics and computer vision. It enables synthesizing photorealistic novel views, useful for the movie industry and AR/VR applications. This state-of-the-art report (STAR) offers the reader a comprehensive summary of state-of-the-art techniques with monocular and multi-view inputs.
- MOKA: Open-World Robotic Manipulation through Mark-Based Visual Prompting (arXiv, 2024-03-05)
  We introduce Marking Open-world Keypoint Affordances (MOKA) to solve robotic manipulation tasks specified by free-form language instructions. Central to our approach is a compact point-based representation of affordances, which bridges the VLM's predictions on observed images and the robot's actions in the physical world. We evaluate and analyze MOKA's performance on various table-top manipulation tasks, including tool use, deformable-body manipulation, and object rearrangement.
- Foundational Models Defining a New Era in Vision: A Survey and Outlook (arXiv, 2023-07-25)
  Vision systems that see and reason about the compositional nature of visual scenes are fundamental to understanding our world. Models that learn to bridge the gap between such modalities, coupled with large-scale training data, facilitate contextual reasoning, generalization, and prompting capabilities at test time. The output of such models can be modified through human-provided prompts without retraining, e.g., segmenting a particular object by providing a bounding box, holding an interactive dialogue by asking questions about an image or video scene, or manipulating a robot's behavior through language instructions.
- Semantic Visual Simultaneous Localization and Mapping: A Survey (arXiv, 2022-09-14)
  This paper first reviews the development of semantic vSLAM, explicitly focusing on its strengths and differences. Secondly, we explore three main issues of semantic vSLAM: the extraction and association of semantic information, the application of semantic information, and the advantages of semantic vSLAM. Finally, we discuss future directions that will provide a blueprint for the future development of semantic vSLAM.
- Panoramic Learning with A Standardized Machine Learning Formalism (arXiv, 2021-08-17)
  This paper presents a standardized equation of the learning objective that offers a unifying understanding of diverse ML algorithms. It also provides guidance for the mechanical design of new ML solutions, and serves as a promising vehicle towards panoramic learning with all experiences.
- Neural Networks for Semantic Gaze Analysis in XR Settings (arXiv, 2021-03-18)
  We present a novel approach that minimizes the time and information necessary to annotate volumes of interest. We train convolutional neural networks (CNNs) on synthetic data sets derived from virtual models using image augmentation techniques. We evaluate our method in real and virtual environments, showing that it can compete with state-of-the-art approaches.
- ViNG: Learning Open-World Navigation with Visual Goals (arXiv, 2020-12-17)
  We propose a learning-based navigation system for reaching visually indicated goals. We show that our system, which we call ViNG, outperforms previously proposed methods for goal-conditioned reinforcement learning. We demonstrate ViNG on a number of real-world applications, such as last-mile delivery and warehouse inspection.
- A Survey on Deep Learning for Localization and Mapping: Towards the Age of Spatial Machine Intelligence (arXiv, 2020-06-22)
  We provide a comprehensive survey and propose a new taxonomy for localization and mapping using deep learning. A wide range of topics is covered, from learning odometry estimation and mapping to global localization and simultaneous localization and mapping. It is our hope that this work can connect emerging works from the robotics, computer vision, and machine learning communities.
- Meta-Learning in Neural Networks: A Survey (arXiv, 2020-04-11)
  This survey describes the contemporary meta-learning landscape. We first discuss definitions of meta-learning and position it with respect to related fields. We then propose a new taxonomy that provides a more comprehensive breakdown of the space of meta-learning methods.