Semantic SLAM with Autonomous Object-Level Data Association
- URL: http://arxiv.org/abs/2011.10625v1
- Date: Fri, 20 Nov 2020 20:33:39 GMT
- Title: Semantic SLAM with Autonomous Object-Level Data Association
- Authors: Zhentian Qian, Kartik Patath, Jie Fu, Jing Xiao
- Abstract summary: The semantic-level SLAM system achieves high-accuracy object-level data association and real-time semantic mapping.
Online semantic map building and semantic-level localization capabilities facilitate semantic-level mapping and task planning in a priori unknown environments.
- Score: 31.707650560075976
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is often desirable to capture and map semantic information of an
environment during simultaneous localization and mapping (SLAM). Such semantic
information can enable a robot to better distinguish places with similar
low-level geometric and visual features and perform high-level tasks that use
semantic information about objects to be manipulated and environments to be
navigated. While semantic SLAM has gained increasing attention, there is little
research on semantic-level data association based on semantic objects, i.e.,
object-level data association. In this paper, we propose a novel object-level
data association algorithm based on the bag-of-words algorithm, formulated as a
maximum weighted bipartite matching problem. With object-level data association
solved, we develop a quadratic-programming-based semantic object initialization
scheme using dual quadric and introduce additional constraints to improve the
success rate of object initialization. The integrated semantic-level SLAM
system can achieve high-accuracy object-level data association and real-time
semantic mapping as demonstrated in the experiments. The online semantic map
building and semantic-level localization capabilities facilitate semantic-level
mapping and task planning in a priori unknown environments.
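The association step the abstract describes, scoring detection/landmark pairs with bag-of-words similarity and solving a maximum weighted bipartite matching, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the cosine-similarity scoring, the `min_score` threshold, and the histograms are hypothetical, and SciPy's `linear_sum_assignment` (Hungarian algorithm) stands in for whichever matching solver the authors use.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def bow_similarity(det_hists, map_hists):
    """Cosine similarity between bag-of-words histograms of current
    detections (rows) and existing map objects (columns)."""
    d = det_hists / np.linalg.norm(det_hists, axis=1, keepdims=True)
    m = map_hists / np.linalg.norm(map_hists, axis=1, keepdims=True)
    return d @ m.T

def associate(similarity, min_score=0.5):
    """Solve object-level data association as maximum weighted
    bipartite matching. Returns (detection, map_object) index pairs;
    pairs scoring below min_score are dropped, leaving those
    detections as candidates for initializing new object landmarks."""
    # linear_sum_assignment minimizes cost, so negate to maximize weight
    rows, cols = linear_sum_assignment(-similarity)
    return [(int(r), int(c)) for r, c in zip(rows, cols)
            if similarity[r, c] >= min_score]

# hypothetical BoW histograms: 3 detections vs. 2 mapped objects
dets = np.array([[5.0, 1.0, 0.0],
                 [0.0, 4.0, 2.0],
                 [1.0, 1.0, 1.0]])
maps = np.array([[6.0, 1.0, 0.0],
                 [0.0, 3.0, 2.0]])
print(associate(bow_similarity(dets, maps)))  # → [(0, 0), (1, 1)]
```

With more detections than mapped objects, the leftover detection (index 2 above) stays unmatched; in the paper's pipeline such detections would feed the quadratic-programming-based dual-quadric initialization of a new semantic landmark.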
Related papers
- Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach [56.55633052479446]
Web-scale visual entity recognition presents significant challenges due to the lack of clean, large-scale training data.
We propose a novel methodology to curate such a dataset, leveraging a multimodal large language model (LLM) for label verification, metadata generation, and rationale explanation.
Experiments demonstrate that models trained on this automatically curated data achieve state-of-the-art performance on web-scale visual entity recognition tasks.
arXiv Detail & Related papers (2024-10-31T06:55:24Z)
- LOSS-SLAM: Lightweight Open-Set Semantic Simultaneous Localization and Mapping [9.289001828243512]
We show that a system of identifying, localizing, and encoding objects is tightly coupled with probabilistic graphical models for performing open-set semantic simultaneous localization and mapping (SLAM).
Results are presented demonstrating that the proposed lightweight object encoding can be used to perform more accurate object-based SLAM than existing open-set methods.
arXiv Detail & Related papers (2024-04-05T19:42:55Z)
- Exploiting Contextual Target Attributes for Target Sentiment Classification [53.30511968323911]
Existing PTLM-based models for TSC can be categorized into two groups: 1) fine-tuning-based models that adopt PTLM as the context encoder; 2) prompting-based models that transfer the classification task to the text/word generation task.
We present a new perspective of leveraging PTLM for TSC: simultaneously leveraging the merits of both language modeling and explicit target-context interactions via contextual target attributes.
arXiv Detail & Related papers (2023-12-21T11:45:28Z)
- Semantics Meets Temporal Correspondence: Self-supervised Object-centric Learning in Videos [63.94040814459116]
Self-supervised methods have shown remarkable progress in learning high-level semantics and low-level temporal correspondence.
We propose a novel semantic-aware masked slot attention on top of the fused semantic features and correspondence maps.
We adopt semantic- and instance-level temporal consistency as self-supervision to encourage temporally coherent object-centric representations.
arXiv Detail & Related papers (2023-08-19T09:12:13Z)
- Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
- An Object SLAM Framework for Association, Mapping, and High-Level Tasks [12.62957558651032]
We present a comprehensive object SLAM framework that focuses on object-based perception and object-oriented robot tasks.
The proposed object SLAM framework is evaluated on a range of public datasets and real-world scenarios, demonstrating its efficient performance.
arXiv Detail & Related papers (2023-05-12T08:10:14Z)
- Learning-based Relational Object Matching Across Views [63.63338392484501]
We propose a learning-based approach which combines local keypoints with novel object-level features for matching object detections between RGB images.
We train our object-level matching features based on appearance and inter-frame and cross-frame spatial relations between objects in an associative graph neural network.
arXiv Detail & Related papers (2023-05-03T19:36:51Z)
- Loop Closure Detection Based on Object-level Spatial Layout and Semantic Consistency [14.694754836704819]
We present an object-based loop closure detection method based on the spatial layout and semantic consistency of the 3D scene graph.
Experimental results demonstrate that our proposed data association approach can construct more accurate 3D semantic maps.
arXiv Detail & Related papers (2023-04-11T11:20:51Z)
- Cascaded Semantic and Positional Self-Attention Network for Document Classification [9.292885582770092]
We propose a new architecture that aggregates the two sources of information using a cascaded semantic and positional self-attention network (CSPAN).
The CSPAN uses a semantic self-attention layer cascaded with a Bi-LSTM to process the semantic and positional information sequentially, and then adaptively combines them through a residual connection.
We evaluate the CSPAN model on several benchmark data sets for document classification with careful ablation studies, and demonstrate encouraging results compared with the state of the art.
arXiv Detail & Related papers (2020-09-15T15:02:28Z)
- GMNet: Graph Matching Network for Large Scale Part Semantic Segmentation in the Wild [23.29789882934198]
We propose a framework combining higher object-level context conditioning and part-level spatial relationships to address the task.
To tackle object-level ambiguity, a class-conditioning module is introduced to retain class-level semantics.
We also propose a novel adjacency graph-based module that aims at matching the relative spatial relationships between ground truth and predicted parts.
arXiv Detail & Related papers (2020-07-17T15:53:40Z)
- Pairwise Similarity Knowledge Transfer for Weakly Supervised Object Localization [53.99850033746663]
We study the problem of learning localization model on target classes with weakly supervised image labels.
In this work, we argue that learning only an objectness function is a weak form of knowledge transfer.
Experiments on the COCO and ILSVRC 2013 detection datasets show that the performance of the localization model improves significantly with the inclusion of the pairwise similarity function.
arXiv Detail & Related papers (2020-03-18T17:53:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.