Related papers: Active Semantic Localization with Graph Neural Embedding

URL: http://arxiv.org/abs/2305.06141v5
Date: Tue, 26 Dec 2023 05:11:58 GMT
Title: Active Semantic Localization with Graph Neural Embedding
Authors: Mitsuki Yoshida, Kanji Tanaka, Ryogo Yamamoto, and Daiki Iwata
Abstract summary: In this work, we explore a lightweight, entirely CPU-based, domain-adaptive semantic localization framework, called graph neural localizer. Our approach is inspired by two recently emerging technologies: (1) Scene graph, which combines the viewpoint- and appearance- invariance of local and global features; (2) Graph neural network, which enables direct learning/recognition of graph data. Experiments on two scenarios, self-supervised learning and unsupervised domain adaptation, using a photo-realistic Habitat simulator validate the effectiveness of the proposed method.
Score: 1.3499500088995464
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Semantic localization, i.e., robot self-localization with semantic image modality, is critical in recently emerging embodied AI applications (e.g., point-goal navigation, object-goal navigation, vision language navigation) and topological mapping applications (e.g., graph neural SLAM, ego-centric topological map). However, most existing works on semantic localization focus on passive vision tasks without viewpoint planning, or rely on additional rich modalities (e.g., depth measurements). Thus, the problem is largely unsolved. In this work, we explore a lightweight, entirely CPU-based, domain-adaptive semantic localization framework, called graph neural localizer. Our approach is inspired by two recently emerging technologies: (1) Scene graph, which combines the viewpoint- and appearance- invariance of local and global features; (2) Graph neural network, which enables direct learning/recognition of graph data (i.e., non-vector data). Specifically, a graph convolutional neural network is first trained as a scene graph classifier for passive vision, and then its knowledge is transferred to a reinforcement-learning planner for active vision. Experiments on two scenarios, self-supervised learning and unsupervised domain adaptation, using a photo-realistic Habitat simulator validate the effectiveness of the proposed method.

Related papers

Brain Network Classification Based on Graph Contrastive Learning and Graph Transformer [0.6906005491572401]
This paper proposes a novel model named PHGCL-DDGformer that integrates graph contrastive learning with graph transformers. Experimental results on real-world datasets demonstrate that the PHGCL-DDGformer model outperforms existing state-of-the-art approaches in brain network classification tasks.
arXiv Detail & Related papers (2025-04-01T13:26:03Z)
Towards Two-Stream Foveation-based Active Vision Learning [7.14325008286629]
The "two-stream hypothesis" from neuroscience explains neural processing in the human visual cortex as an active vision system. We propose a machine learning framework inspired by the "two-stream hypothesis" and explore the potential benefits that it offers. We show that two-stream foveation-based learning is applicable to the challenging task of weakly-supervised object localization.
arXiv Detail & Related papers (2024-03-24T01:20:08Z)
GNN-LoFI: a Novel Graph Neural Network through Localized Feature-based Histogram Intersection [51.608147732998994]
Graph neural networks are increasingly becoming the framework of choice for graph-based machine learning. We propose a new graph neural network architecture that substitutes classical message passing with an analysis of the local distribution of node features.
arXiv Detail & Related papers (2024-01-17T13:04:23Z)
Domain Adaptive Graph Classification [0.0]
We introduce Dual Adversarial Graph Representation Learning (DAGRL), which explores the graph topology from dual branches and mitigates domain discrepancies via dual adversarial learning. Our approach incorporates adaptive perturbations into the dual branches, which align the source and target distributions to address domain discrepancies.
arXiv Detail & Related papers (2023-12-21T02:37:56Z)
The Map Equation Goes Neural: Mapping Network Flows with Graph Neural Networks [0.716879432974126]
Community detection is an essential tool for unsupervised data exploration and revealing the organisational structure of networked systems. We consider the map equation, a popular information-theoretic objective function for unsupervised community detection, and express it in differentiable tensor form for optimization through gradient descent. Our formulation makes the map equation compatible with any neural network architecture, enables end-to-end learning, incorporates node features, and chooses the optimal number of clusters automatically.
arXiv Detail & Related papers (2023-10-02T12:32:18Z)
A Survey on Graph Neural Networks and Graph Transformers in Computer Vision: A Task-Oriented Perspective [71.03621840455754]
Graph Neural Networks (GNNs) have gained momentum in graph representation learning. Graph Transformers embed a graph structure into the Transformer architecture to overcome the limitations of local neighborhood aggregation. This paper presents a comprehensive review of GNNs and graph Transformers in computer vision from a task-oriented perspective.
arXiv Detail & Related papers (2022-09-27T08:10:14Z)
FuNNscope: Visual microscope for interactively exploring the loss landscape of fully connected neural networks [77.34726150561087]
We show how to explore high-dimensional landscape characteristics of neural networks. We generalize observations on small neural networks to more complex systems. An interactive dashboard opens up a number of possible application networks.
arXiv Detail & Related papers (2022-04-09T16:41:53Z)
Self-Supervised Graph Representation Learning for Neuronal Morphologies [75.38832711445421]
We present GraphDINO, a data-driven approach to learn low-dimensional representations of 3D neuronal morphologies from unlabeled datasets. We show, in two different species and across multiple brain areas, that this method yields morphological cell type clusterings on par with manual feature-based classification by experts. Our method could potentially enable data-driven discovery of novel morphological features and cell types in large-scale datasets.
arXiv Detail & Related papers (2021-12-23T12:17:47Z)
Map-Based Temporally Consistent Geolocalization through Learning Motion Trajectories [0.5076419064097732]
We propose a novel trajectory learning method that exploits motion trajectories on a topological map using a recurrent neural network. Inspired by humans' ability to be aware of both the distance and direction of self-motion during navigation, our trajectory learning method learns a pattern representation of trajectories encoded as a sequence of distances and turning angles to assist self-localization.
arXiv Detail & Related papers (2020-10-13T02:08:45Z)
GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training [62.73470368851127]
Graph representation learning has emerged as a powerful technique for addressing real-world problems. We design Graph Contrastive Coding -- a self-supervised graph neural network pre-training framework. We conduct experiments on three graph learning tasks and ten graph datasets.
arXiv Detail & Related papers (2020-06-17T16:18:35Z)
Machine Learning on Graphs: A Model and Comprehensive Taxonomy [22.73365477040205]
We bridge the gap between graph neural networks, network embedding and graph regularization models. Specifically, we propose a Graph Decoder Model (GRAPHEDM), which generalizes popular algorithms for semi-supervised learning on graphs.
arXiv Detail & Related papers (2020-05-07T18:00:02Z)
Structured Landmark Detection via Topology-Adapting Deep Graph Learning [75.20602712947016]
We present a new topology-adapting deep graph learning approach for accurate anatomical facial and medical landmark detection. The proposed method constructs graph signals leveraging both local image features and global shape features. Experiments are conducted on three public facial image datasets (WFLW, 300W, and COFW-68) as well as three real-world X-ray medical datasets (Cephalometric (public), Hand and Pelvis).
arXiv Detail & Related papers (2020-04-17T11:55:03Z)

This list is automatically generated from the titles and abstracts of the papers on this site.