OG-SGG: Ontology-Guided Scene Graph Generation. A Case Study in Transfer
Learning for Telepresence Robotics
- URL: http://arxiv.org/abs/2202.10201v1
- Date: Mon, 21 Feb 2022 13:23:15 GMT
- Title: OG-SGG: Ontology-Guided Scene Graph Generation. A Case Study in Transfer
Learning for Telepresence Robotics
- Authors: Fernando Amodeo, Fernando Caballero, Natalia Díaz-Rodríguez, Luis
Merino
- Abstract summary: Scene graph generation from images is a task of great interest to applications such as robotics.
We propose an initial approximation to a framework called Ontology-Guided Scene Graph Generation (OG-SGG).
- Score: 124.08684545010664
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Scene graph generation from images is a task of great interest to
applications such as robotics, because graphs are the main way to represent
knowledge about the world and to regulate human-robot interactions in tasks
such as Visual Question Answering (VQA). Unfortunately, the corresponding area
of machine learning is still relatively young, and the solutions it currently
offers do not specialize well to concrete usage scenarios. Specifically, they
do not take existing "expert" knowledge about the domain world into account,
even though such knowledge may be necessary to reach the level of reliability
demanded by the use case. In this paper, we propose an initial approximation
to a framework called Ontology-Guided Scene Graph Generation (OG-SGG), which
improves the performance of an existing machine-learning-based scene graph
generator using prior knowledge supplied in the form of an ontology, and we
present results evaluated on a specific scenario based on telepresence
robotics.
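The core idea in the abstract, constraining a learned scene graph generator with prior domain knowledge, can be sketched as a post-hoc filter that discards predicted triples whose (subject class, predicate, object class) combination the ontology does not allow. The toy ontology, class names, and predicate names below are illustrative assumptions, not the paper's actual axioms.

```python
# Minimal sketch of ontology-guided filtering of scene graph predictions.
# The ONTOLOGY table below is a made-up example mapping each predicate to
# the (subject class, object class) pairs it may relate; a real ontology
# would express this with domain/range axioms and a class hierarchy.
ONTOLOGY = {
    "on": {("cup", "table"), ("laptop", "table"), ("person", "chair")},
    "holding": {("person", "cup"), ("person", "laptop")},
}

def filter_triples(predictions):
    """Keep only predicted triples consistent with the ontology.

    predictions: list of (subject_class, predicate, object_class, score).
    Returns the surviving triples sorted by descending confidence.
    """
    valid = [
        (s, p, o, score)
        for s, p, o, score in predictions
        if (s, o) in ONTOLOGY.get(p, set())
    ]
    return sorted(valid, key=lambda t: t[3], reverse=True)

preds = [
    ("cup", "on", "table", 0.91),
    ("table", "on", "cup", 0.40),   # ontologically impossible: filtered out
    ("person", "holding", "cup", 0.77),
]
print(filter_triples(preds))
```

A filter like this leaves the underlying generator untouched, which is why prior knowledge can be bolted onto an existing model without retraining.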
Related papers
- Generative Visual Commonsense Answering and Explaining with Generative Scene Graph Constructing [46.701439459096235]
We propose a novel visual commonsense reasoning generation method named G2.
It first utilizes image patches and LLMs to construct a location-free scene graph, and then answers and explains based on the scene graph's information.
We also propose automatic scene graph filtering and selection strategies to absorb valuable scene graph information during training.
arXiv Detail & Related papers (2025-01-15T04:00:36Z)
- Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction [69.57190742976091]
We introduce Aguvis, a unified vision-based framework for autonomous GUI agents.
Our approach leverages image-based observations and grounds natural-language instructions to visual elements.
To address the limitations of previous work, we integrate explicit planning and reasoning within the model.
arXiv Detail & Related papers (2024-12-05T18:58:26Z)
- Learning Where to Look: Self-supervised Viewpoint Selection for Active Localization using Geometrical Information [68.10033984296247]
This paper explores the domain of active localization, emphasizing the importance of viewpoint selection to enhance localization accuracy.
Our contributions involve using a data-driven approach with a simple architecture designed for real-time operation, a self-supervised data training method, and the capability to consistently integrate our map into a planning framework tailored for real-world robotics applications.
arXiv Detail & Related papers (2024-07-22T12:32:09Z)
- Graph learning in robotics: a survey [2.5726566614123874]
The paper covers the fundamentals of graph-based models, including their architecture, training procedures, and applications.
It also discusses recent advancements and challenges that arise in applied settings.
The paper provides an extensive review of various robotic applications that benefit from learning on graph structures.
arXiv Detail & Related papers (2023-10-06T14:52:25Z)
- SG-Bot: Object Rearrangement via Coarse-to-Fine Robotic Imagination on Scene Graphs [81.15889805560333]
We present SG-Bot, a novel rearrangement framework.
SG-Bot is lightweight, real-time, and user-controllable.
Experimental results demonstrate that SG-Bot outperforms competitors by a large margin.
arXiv Detail & Related papers (2023-09-21T15:54:33Z)
- A Survey on Graph Neural Networks and Graph Transformers in Computer Vision: A Task-Oriented Perspective [71.03621840455754]
Graph Neural Networks (GNNs) have gained momentum in graph representation learning.
Graph Transformers embed a graph structure into the Transformer architecture to overcome the limitations of local neighborhood aggregation.
This paper presents a comprehensive review of GNNs and graph Transformers in computer vision from a task-oriented perspective.
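The "local neighborhood aggregation" that graph Transformers aim to go beyond is the basic GNN update: each node's feature is recomputed from its own and its neighbors' features. A minimal illustrative sketch (a simplified mean-aggregation step, not code from any surveyed paper):

```python
# One mean-aggregation message-passing step, the core of local
# neighborhood aggregation in GNNs (simplified: no learned weights).
def aggregate(features, adjacency):
    """Update each node's feature vector to the mean of its own
    and its neighbors' feature vectors.

    features: {node: list[float]} feature vectors (all same length).
    adjacency: {node: list of neighbor node ids}.
    """
    out = {}
    for node, feat in features.items():
        neighborhood = [feat] + [features[n] for n in adjacency.get(node, [])]
        out[node] = [sum(vals) / len(neighborhood) for vals in zip(*neighborhood)]
    return out

feats = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(aggregate(feats, adj))
```

Because information propagates only one hop per step, capturing long-range structure requires stacking many such layers, which is the limitation that motivates graph Transformers' global attention.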
arXiv Detail & Related papers (2022-09-27T08:10:14Z)
- Situational Graphs for Robot Navigation in Structured Indoor Environments [9.13466172688693]
We present Situational Graphs (S-Graphs), built online in real time and composed of a single graph representing the environment.
Our method uses odometry readings and planar surfaces extracted from 3D LiDAR scans to construct and optimize a three-layered S-Graph in real time.
Our proposal not only demonstrates state-of-the-art results for robot pose estimation, but also contributes a metric-semantic-topological model of the environment.
arXiv Detail & Related papers (2022-02-24T16:59:06Z)
- Automated Graph Machine Learning: Approaches, Libraries, Benchmarks and Directions [58.220137936626315]
This paper extensively discusses automated graph machine learning approaches.
We introduce AutoGL, the first dedicated open-source library for automated graph machine learning.
Also, we describe a tailored benchmark that supports unified, reproducible, and efficient evaluations.
arXiv Detail & Related papers (2022-01-04T18:31:31Z)
- Graph Neural Networks: Methods, Applications, and Opportunities [1.2183405753834562]
This article provides a comprehensive survey of graph neural networks (GNNs) in each learning setting.
The approaches for each learning task are analyzed from both theoretical and empirical standpoints.
Various applications and benchmark datasets are also provided, along with open challenges still plaguing the general applicability of GNNs.
arXiv Detail & Related papers (2021-08-24T13:46:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.