Semantic Visual Simultaneous Localization and Mapping: A Survey
- URL: http://arxiv.org/abs/2209.06428v1
- Date: Wed, 14 Sep 2022 05:45:26 GMT
- Title: Semantic Visual Simultaneous Localization and Mapping: A Survey
- Authors: Kaiqi Chen, Jianhua Zhang, Jialing Liu, Qiyi Tong, Ruyu Liu, Shengyong Chen
- Abstract summary: This paper first reviews the development of semantic vSLAM, explicitly focusing on its strengths and differences.
Secondly, we explore three main issues of semantic vSLAM: the extraction and association of semantic information, the application of semantic information, and the advantages of semantic vSLAM.
Finally, we discuss future directions that will provide a blueprint for the future development of semantic vSLAM.
- Score: 18.372996585079235
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual Simultaneous Localization and Mapping (vSLAM) has achieved great
progress in the computer vision and robotics communities, and has been
successfully used in many fields such as autonomous robot navigation and AR/VR.
However, vSLAM cannot achieve good localization in dynamic and complex
environments. In recent years, numerous publications have reported that semantic
vSLAM systems, which combine semantic information with vSLAM, are capable of
solving these problems. Nevertheless, there is no
comprehensive survey about semantic vSLAM. To fill the gap, this paper first
reviews the development of semantic vSLAM, explicitly focusing on its strengths
and differences. Secondly, we explore three main issues of semantic vSLAM: the
extraction and association of semantic information, the application of semantic
information, and the advantages of semantic vSLAM. Then, we collect and analyze
the current state-of-the-art SLAM datasets which have been widely used in
semantic vSLAM systems. Finally, we discuss future directions that will provide
a blueprint for the future development of semantic vSLAM.
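Concretely, many of the semantic vSLAM systems discussed in the survey handle dynamic scenes by discarding visual features that fall on movable objects before pose estimation. The following is a minimal, hypothetical Python sketch of that idea; the function name, parameters, and thresholds are illustrative and not taken from any particular system.

```python
# Sketch only: drop keypoints that land on pixels labeled as dynamic classes
# (e.g. people, cars) so moving objects do not corrupt the pose estimate.
import cv2
import numpy as np

def filter_dynamic_keypoints(gray_image, dynamic_mask):
    """gray_image: HxW uint8 frame.
    dynamic_mask: HxW bool array marking dynamic-class pixels, e.g. produced
    by any off-the-shelf semantic segmentation network (assumed given here)."""
    orb = cv2.ORB_create(nfeatures=2000)
    keypoints, descriptors = orb.detectAndCompute(gray_image, None)
    if descriptors is None:
        return [], None
    kept_kp, kept_desc = [], []
    for kp, desc in zip(keypoints, descriptors):
        x, y = int(round(kp.pt[0])), int(round(kp.pt[1]))
        if not dynamic_mask[y, x]:  # keep only features on the static scene
            kept_kp.append(kp)
            kept_desc.append(desc)
    return kept_kp, np.array(kept_desc)
```

The retained keypoints and descriptors would then feed the usual tracking and mapping pipeline, which is the basic mechanism by which semantic information improves localization in dynamic environments.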
Related papers
- Large Action Models: From Inception to Implementation [51.81485642442344]
Large Action Models (LAMs) are designed for action generation and execution within dynamic environments.
LAMs hold the potential to transform AI from passive language understanding to active task completion.
We present a comprehensive framework for developing LAMs, offering a systematic approach to their creation, from inception to deployment.
arXiv Detail & Related papers (2024-12-13T11:19:56Z)
- SLAck: Semantic, Location, and Appearance Aware Open-Vocabulary Tracking [89.43370214059955]
Open-vocabulary Multiple Object Tracking (MOT) aims to generalize trackers to novel categories not in the training set.
We present a unified framework that jointly considers semantics, location, and appearance priors in the early steps of association.
Our method eliminates complex post-processing for fusing different cues and significantly boosts association performance for large-scale open-vocabulary tracking.
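As a rough illustration of what fusing such cues during association can look like (this is a generic sketch, not SLAck's actual formulation), per-cue similarity matrices can be combined into one cost matrix and solved as an assignment problem; the weights and threshold below are arbitrary placeholders.

```python
# Generic multi-cue association sketch: weighted fusion of semantic, location,
# and appearance similarities, solved with the Hungarian algorithm.
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(sem_sim, loc_sim, app_sim, weights=(0.4, 0.3, 0.3), min_sim=0.3):
    """All inputs are (num_tracks, num_detections) similarity matrices in [0, 1]."""
    w_sem, w_loc, w_app = weights
    fused = w_sem * sem_sim + w_loc * loc_sim + w_app * app_sim
    rows, cols = linear_sum_assignment(-fused)  # negate to maximize similarity
    # keep only sufficiently similar track-detection pairs
    return [(r, c) for r, c in zip(rows, cols) if fused[r, c] >= min_sim]
```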
arXiv Detail & Related papers (2024-09-17T14:36:58Z)
- When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models [113.18524940863841]
This survey provides a comprehensive overview of the methodologies enabling large language models to process, understand, and generate 3D data.
Our investigation spans various 3D data representations, from point clouds to Neural Radiance Fields (NeRFs)
It examines their integration with LLMs for tasks such as 3D scene understanding, captioning, question-answering, and dialogue.
arXiv Detail & Related papers (2024-05-16T16:59:58Z)
- TextSLAM: Visual SLAM with Semantic Planar Text Features [8.8100408194584]
We propose a novel visual SLAM method that integrates text objects tightly by treating them as semantic features.
We tested our method in various scenes with ground-truth data.
The results show that integrating text features yields a superior SLAM system that can match images across day and night.
arXiv Detail & Related papers (2023-05-17T08:16:26Z)
- Visual SLAM: What are the Current Trends and What to Expect? [0.0]
Vision-based sensors have shown significant performance, accuracy, and efficiency gains in Simultaneous Localization and Mapping (SLAM) systems.
We give an in-depth literature survey of forty-five impactful papers published in the domain of VSLAM.
arXiv Detail & Related papers (2022-10-19T11:56:32Z)
- Det-SLAM: A semantic visual SLAM for highly dynamic scenes using Detectron2 [0.0]
This research combines the visual SLAM systems ORB-SLAM3 and Detectron2 to present the Det-SLAM system.
Det-SLAM is more resilient than previous dynamic SLAM systems and can lower the estimated camera pose error in dynamic indoor scenarios.
arXiv Detail & Related papers (2022-10-01T13:25:11Z)
- A Review on Visual-SLAM: Advancements from Geometric Modelling to Learning-based Semantic Scene Understanding [3.0839245814393728]
Simultaneous Localisation and Mapping (SLAM) is one of the fundamental problems in autonomous mobile robots.
Visual-SLAM uses various sensors from the mobile robot for collecting and sensing a representation of the map.
Recent advancements in computer vision, such as deep learning techniques, have provided a data-driven approach to tackle the Visual-SLAM problem.
arXiv Detail & Related papers (2022-09-12T13:11:25Z)
- Boosting Video-Text Retrieval with Explicit High-Level Semantics [115.66219386097295]
We propose a novel visual-linguistic aligning model named HiSE for VTR.
It improves the cross-modal representation by incorporating explicit high-level semantics.
Our method achieves superior performance over state-of-the-art methods on three benchmark datasets.
arXiv Detail & Related papers (2022-08-08T15:39:54Z)
- NICE-SLAM: Neural Implicit Scalable Encoding for SLAM [112.6093688226293]
NICE-SLAM is a dense SLAM system that incorporates multi-level local information by introducing a hierarchical scene representation.
Compared to recent neural implicit SLAM systems, our approach is more scalable, efficient, and robust.
arXiv Detail & Related papers (2021-12-22T18:45:44Z)
- Named Entity Recognition for Social Media Texts with Semantic Augmentation [70.44281443975554]
Existing approaches for named entity recognition suffer from data sparsity problems when conducted on short and informal texts.
We propose a neural-based approach to NER for social media texts where both local (from running text) and augmented semantics are taken into account.
arXiv Detail & Related papers (2020-10-29T10:06:46Z)
- Map-merging Algorithms for Visual SLAM: Feasibility Study and Empirical Evaluation [0.0]
State-of-the-art vSLAM algorithms are capable of constructing accurate-enough maps that enable a mobile robot to autonomously navigate an unknown environment.
The map-merging problem asks whether different vSLAM maps can be merged into a single consistent representation.
We examine existing 2D and 3D map-merging algorithms and conduct an extensive empirical evaluation in a realistic simulated environment.
arXiv Detail & Related papers (2020-09-12T16:15:16Z)