Topological Semantic Mapping by Consolidation of Deep Visual Features
- URL: http://arxiv.org/abs/2106.12709v1
- Date: Thu, 24 Jun 2021 01:10:03 GMT
- Title: Topological Semantic Mapping by Consolidation of Deep Visual Features
- Authors: Ygor C. N. Sousa, Hansenclever F. Bassani
- Abstract summary: This work introduces a topological semantic mapping method that uses deep visual features extracted by a CNN, the GoogLeNet, from 2D images captured in multiple views of the environment as the robot operates.
The experiments, performed using a real-world indoor dataset, showed that the method is able to consolidate the visual features of regions and use them to recognize objects and place categories as semantic properties.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many works in the recent literature introduce semantic mapping methods that
use CNNs (Convolutional Neural Networks) to recognize semantic properties in
images. The types of properties (e.g., room size, place category, and objects)
and their classes (e.g., kitchen and bathroom, for place category) are usually
predefined and restricted to a specific task. Thus, all the visual data
acquired and processed during the construction of the maps are lost and only
the recognized semantic properties remain on the maps. In contrast, this work
introduces a topological semantic mapping method that uses deep visual features
extracted by a CNN, the GoogLeNet, from 2D images captured in multiple views of
the environment as the robot operates, to create consolidated representations
of visual features acquired in the regions covered by each topological node.
These consolidated representations allow flexible recognition of semantic
properties of the regions and use in a range of visual tasks. The experiments,
performed using a real-world indoor dataset, showed that the method is able to
consolidate the visual features of regions and use them to recognize objects
and place categories as semantic properties, and to indicate the topological
location of images, with very promising results. The objects are classified
using the classification layer of GoogLeNet, without retraining, and the place
categories are recognized using a shallow Multilayer Perceptron.
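The abstract does not say how the per-node consolidation or the topological localization is computed, so the following is only a minimal sketch under stated assumptions: consolidation is approximated by a running mean of feature vectors, and localization by cosine similarity to each node's consolidated vector. The names `NodeMemory`, `consolidate`, and `localize` are hypothetical, and toy 3-D vectors stand in for GoogLeNet features.

```python
from dataclasses import dataclass, field
from math import sqrt


@dataclass
class NodeMemory:
    """Hypothetical per-node store: a running mean of feature vectors.

    Assumption: a running mean stands in for the paper's consolidation
    step, which the abstract does not specify.
    """
    dim: int
    count: int = 0
    mean: list = field(default_factory=list)

    def __post_init__(self):
        if not self.mean:
            self.mean = [0.0] * self.dim

    def consolidate(self, feature):
        """Fold one view's feature vector into the node's running mean."""
        if len(feature) != self.dim:
            raise ValueError("feature has wrong dimensionality")
        self.count += 1
        self.mean = [m + (x - m) / self.count
                     for m, x in zip(self.mean, feature)]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def localize(feature, nodes):
    """Return the id of the node whose consolidated vector is most similar.

    Assumption: nearest-node matching by cosine similarity.
    """
    return max(nodes,
               key=lambda node_id: cosine_similarity(feature, nodes[node_id].mean))


# Toy 3-D vectors stand in for CNN features extracted from multiple views.
nodes = {"kitchen": NodeMemory(dim=3), "corridor": NodeMemory(dim=3)}
for view in ([1.0, 0.0, 0.0], [0.8, 0.2, 0.0]):
    nodes["kitchen"].consolidate(view)
for view in ([0.0, 0.0, 1.0], [0.0, 0.2, 0.8]):
    nodes["corridor"].consolidate(view)

best = localize([0.9, 0.1, 0.0], nodes)
print(best)
```

Keeping one consolidated vector per node, rather than only the predicted labels, is what lets the same map serve several tasks later: the stored vector could in principle be passed to an object classifier, a place classifier, or a similarity search like the one above.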
Related papers
- Mapping High-level Semantic Regions in Indoor Environments without Object Recognition [50.624970503498226]
The present work proposes a method for semantic region mapping via embodied navigation in indoor environments.
To enable region identification, the method uses a vision-to-language model to provide scene information for mapping.
By projecting egocentric scene understanding into the global frame, the proposed method generates a semantic map as a distribution over possible region labels at each location.
arXiv Detail & Related papers (2024-03-11T18:09:50Z)
- Structure-Guided Image Completion with Image-level and Object-level Semantic Discriminators [97.12135238534628]
We propose a learning paradigm that consists of semantic discriminators and object-level discriminators for improving the generation of complex semantics and objects.
Specifically, the semantic discriminators leverage pretrained visual features to improve the realism of the generated visual concepts.
Our proposed scheme significantly improves the generation quality and achieves state-of-the-art results on various tasks.
arXiv Detail & Related papers (2022-12-13T01:36:56Z) - Attribute Prototype Network for Any-Shot Learning [113.50220968583353]
We argue that an image representation with integrated attribute localization ability would be beneficial for any-shot, i.e. zero-shot and few-shot, image classification tasks.
We propose a novel representation learning framework that jointly learns global and local features using only class-level attributes.
arXiv Detail & Related papers (2022-04-04T02:25:40Z) - Contrastive learning of Class-agnostic Activation Map for Weakly
Supervised Object Localization and Semantic Segmentation [32.76127086403596]
We propose Contrastive learning for Class-agnostic Activation Map (C$2$AM) generation using unlabeled image data.
We form the positive and negative pairs based on the above relations and force the network to disentangle foreground and background.
As the network is guided to discriminate cross-image foreground-background, the class-agnostic activation maps learned by our approach generate more complete object regions.
arXiv Detail & Related papers (2022-03-25T08:46:24Z) - Segmentation in Style: Unsupervised Semantic Image Segmentation with
Stylegan and CLIP [39.0946507389324]
We introduce a method that allows to automatically segment images into semantically meaningful regions without human supervision.
Derived regions are consistent across different images and coincide with human-defined semantic classes on some datasets.
We test our method on publicly available datasets and show state-of-the-art results.
arXiv Detail & Related papers (2021-07-26T23:48:34Z) - Rethinking Semantic Segmentation Evaluation for Explainability and Model
Selection [12.786648212233116]
We introduce a new metric to assess region-based over- and under-segmentation.
We analyze and compare it to other metrics, demonstrating that the use of our metric lends greater explainability to semantic segmentation model performance in real-world applications.
arXiv Detail & Related papers (2021-01-21T03:12:43Z) - Weakly-Supervised Semantic Segmentation via Sub-category Exploration [73.03956876752868]
We propose a simple yet effective approach to enforce the network to pay attention to other parts of an object.
Specifically, we perform clustering on image features to generate pseudo sub-categories labels within each annotated parent class.
We conduct extensive analysis to validate the proposed method and show that our approach performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2020-08-03T20:48:31Z) - Hallucinating Saliency Maps for Fine-Grained Image Classification for
Limited Data Domains [27.91871214060683]
We propose an approach which does not require explicit saliency maps to improve image classification.
We show that our approach obtains similar results as the case when the saliency maps are provided explicitely.
In addition, we show that our saliency estimation method, which is trained without any saliency groundtruth data, obtains competitive results on real image saliency benchmark (Toronto)
arXiv Detail & Related papers (2020-07-24T15:08:55Z) - Learning to Compose Hypercolumns for Visual Correspondence [57.93635236871264]
We introduce a novel approach to visual correspondence that dynamically composes effective features by leveraging relevant layers conditioned on the images to match.
The proposed method, dubbed Dynamic Hyperpixel Flow, learns to compose hypercolumn features on the fly by selecting a small number of relevant layers from a deep convolutional neural network.
arXiv Detail & Related papers (2020-07-21T04:03:22Z) - Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation [128.03739769844736]
Two neural co-attentions are incorporated into the classifier to capture cross-image semantic similarities and differences.
In addition to boosting object pattern learning, the co-attention can leverage context from other related images to improve localization map inference.
Our algorithm sets a new state of the art in all these settings, demonstrating its efficacy and generalizability.
arXiv Detail & Related papers (2020-07-03T21:53:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.