Online Knowledge Integration for 3D Semantic Mapping: A Survey
- URL: http://arxiv.org/abs/2411.18147v1
- Date: Wed, 27 Nov 2024 08:53:16 GMT
- Title: Online Knowledge Integration for 3D Semantic Mapping: A Survey
- Authors: Felix Igelbrink, Marian Renz, Martin Günther, Piper Powell, Lennart Niecksch, Oscar Lima, Martin Atzmueller, Joachim Hertzberg,
- Abstract summary: Recent advances in deep learning allow full integration of prior knowledge, represented as knowledge graphs or language concepts, into sensor data processing and semantic mapping pipelines.<n>Semantic scene graphs and language models enable modern semantic mapping approaches to incorporate graph-based prior knowledge or to leverage the rich information in human language both during and after the mapping process.
- Score: 3.6437757915645386
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Semantic mapping is a key component of robots operating in and interacting with objects in structured environments. Traditionally, geometric and knowledge representations within a semantic map have only been loosely integrated. However, recent advances in deep learning now allow full integration of prior knowledge, represented as knowledge graphs or language concepts, into sensor data processing and semantic mapping pipelines. Semantic scene graphs and language models enable modern semantic mapping approaches to incorporate graph-based prior knowledge or to leverage the rich information in human language both during and after the mapping process. This has sparked substantial advances in semantic mapping, leading to previously impossible novel applications. This survey reviews these recent developments comprehensively, with a focus on online integration of knowledge into semantic mapping. We specifically focus on methods using semantic scene graphs for integrating symbolic prior knowledge and language models for respective capture of implicit common-sense knowledge and natural language concepts
Related papers
- LIEREx: Language-Image Embeddings for Robotic Exploration [1.8630958726929938]
Traditional mapping approaches provide accurate geometric representations but are often constrained by pre-designed symbolic vocabularies.<n>Recent advances in Vision-Language Foundation Models, such as CLIP, enable open-set mapping, where objects are encoded as high-dimensional embeddings rather than fixed labels.<n>In LIEREx, we integrate these VLFMs with established 3D Semantic Scene Graphs to enable target-directed exploration by an autonomous agent in partially unknown environments.
arXiv Detail & Related papers (2026-02-02T10:30:50Z) - FindAnything: Open-Vocabulary and Object-Centric Mapping for Robot Exploration in Any Environment [16.987872206495897]
FindAnything is an open-world mapping framework that incorporates vision-language information into dense volumetric submaps.
Our system is the first of its kind to be deployed on resource-constrained devices, such as MAVs.
arXiv Detail & Related papers (2025-04-11T15:12:05Z) - Exploiting the Semantic Knowledge of Pre-trained Text-Encoders for Continual Learning [70.64617500380287]
Continual learning allows models to learn from new data while retaining previously learned knowledge.
The semantic knowledge available in the label information of the images, offers important semantic information that can be related with previously acquired knowledge of semantic classes.
We propose integrating semantic guidance within and across tasks by capturing semantic similarity using text embeddings.
arXiv Detail & Related papers (2024-08-02T07:51:44Z) - Mapping High-level Semantic Regions in Indoor Environments without
Object Recognition [50.624970503498226]
The present work proposes a method for semantic region mapping via embodied navigation in indoor environments.
To enable region identification, the method uses a vision-to-language model to provide scene information for mapping.
By projecting egocentric scene understanding into the global frame, the proposed method generates a semantic map as a distribution over possible region labels at each location.
arXiv Detail & Related papers (2024-03-11T18:09:50Z) - CADGE: Context-Aware Dialogue Generation Enhanced with Graph-Structured Knowledge Aggregation [25.56539617837482]
A novel context-aware graph-attention model (Context-aware GAT) is proposed.
It assimilates global features from relevant knowledge graphs through a context-enhanced knowledge aggregation mechanism.
Empirical results demonstrate that our framework outperforms conventional GNN-based language models in terms of performance.
arXiv Detail & Related papers (2023-05-10T16:31:35Z) - Variational Cross-Graph Reasoning and Adaptive Structured Semantics
Learning for Compositional Temporal Grounding [143.5927158318524]
Temporal grounding is the task of locating a specific segment from an untrimmed video according to a query sentence.
We introduce a new Compositional Temporal Grounding task and construct two new dataset splits.
We argue that the inherent structured semantics inside the videos and language is the crucial factor to achieve compositional generalization.
arXiv Detail & Related papers (2023-01-22T08:02:23Z) - Imitation Learning-based Implicit Semantic-aware Communication Networks:
Multi-layer Representation and Collaborative Reasoning [68.63380306259742]
Despite its promising potential, semantic communications and semantic-aware networking are still at their infancy.
We propose a novel reasoning-based implicit semantic-aware communication network architecture that allows multiple tiers of CDC and edge servers to collaborate.
We introduce a new multi-layer representation of semantic information taking into consideration both the hierarchical structure of implicit semantics as well as the personalized inference preference of individual users.
arXiv Detail & Related papers (2022-10-28T13:26:08Z) - Joint Language Semantic and Structure Embedding for Knowledge Graph
Completion [66.15933600765835]
We propose to jointly embed the semantics in the natural language description of the knowledge triplets with their structure information.
Our method embeds knowledge graphs for the completion task via fine-tuning pre-trained language models.
Our experiments on a variety of knowledge graph benchmarks have demonstrated the state-of-the-art performance of our method.
arXiv Detail & Related papers (2022-09-19T02:41:02Z) - Explainable Semantic Space by Grounding Language to Vision with
Cross-Modal Contrastive Learning [3.441021278275805]
We design a two-stream model for grounding language learning in vision.
The model first learns to align visual and language representations with the MS COCO dataset.
After training, the language stream of this model is a stand-alone language model capable of embedding concepts in a visually grounded semantic space.
arXiv Detail & Related papers (2021-11-13T19:54:15Z) - Exploiting Structured Knowledge in Text via Graph-Guided Representation
Learning [73.0598186896953]
We present two self-supervised tasks learning over raw text with the guidance from knowledge graphs.
Building upon entity-level masked language models, our first contribution is an entity masking scheme.
In contrast to existing paradigms, our approach uses knowledge graphs implicitly, only during pre-training.
arXiv Detail & Related papers (2020-04-29T14:22:42Z) - Distributional semantic modeling: a revised technique to train term/word
vector space models applying the ontology-related approach [36.248702416150124]
We design a new technique for the distributional semantic modeling with a neural network-based approach to learn distributed term representations (or term embeddings)
Vec2graph is a Python library for visualizing word embeddings (term embeddings in our case) as dynamic and interactive graphs.
arXiv Detail & Related papers (2020-03-06T18:27:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.