Enhancing Graph Representation of the Environment through Local and Cloud Computation
- URL: http://arxiv.org/abs/2309.12692v1
- Date: Fri, 22 Sep 2023 08:05:32 GMT
- Title: Enhancing Graph Representation of the Environment through Local and Cloud Computation
- Authors: Francesco Argenziano, Vincenzo Suriani and Daniele Nardi
- Abstract summary: We propose a graph-based representation that provides a semantic representation of robot environments from multiple sources.
To acquire information from the environment, the framework combines classical computer vision tools with modern computer vision cloud services.
The proposed approach also allows us to handle small objects and integrate them into the semantic representation of the environment.
- Score: 2.9465623430708905
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Enriching the robot representation of the operational environment is a
challenging task that aims at bridging the gap between low-level sensor
readings and high-level semantic understanding. Having a rich representation
often requires computationally demanding architectures and purely
point-cloud-based detection systems that struggle with everyday objects that
have to be handled by the robot. To overcome these issues, we propose a
graph-based representation that addresses this gap by providing a semantic
representation of robot environments from multiple sources. To acquire
information from the environment, the framework combines classical computer
vision tools with modern computer vision cloud services, ensuring computational
feasibility on onboard hardware. By incorporating an ontology hierarchy with
over 800 object classes, the framework achieves cross-domain adaptability,
eliminating the need for environment-specific tools. The proposed approach
also allows us to handle small objects and integrate them into the semantic
representation of the environment. The approach is implemented in the Robot
Operating System (ROS) using the RViz visualizer for environment
representation. This work is a first step towards the development of a
general-purpose framework that facilitates intuitive interaction and navigation
across different domains.
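To make the pipeline concrete, below is a minimal Python sketch of such a multi-source semantic graph. It is an illustration under assumptions, not the authors' implementation: `detect_local`, `detect_cloud`, and the toy `ONTOLOGY` dictionary are hypothetical stand-ins for the onboard detector, the cloud vision service, and the 800+ class ontology hierarchy described in the abstract.

```python
# Illustrative sketch (not the authors' code): fuse local and cloud
# detections into one semantic graph of the environment with networkx.
import networkx as nx

# Toy ontology: each class points to its parent category (None = root).
ONTOLOGY = {"mug": "container", "table": "furniture",
            "container": "object", "furniture": "object", "object": None}

def detect_local(image):
    """Stand-in for a lightweight onboard detector (large objects)."""
    return [{"label": "table", "position": (1.0, 0.5), "source": "local"}]

def detect_cloud(image):
    """Stand-in for a cloud vision service (small, fine-grained objects)."""
    return [{"label": "mug", "position": (1.1, 0.6), "source": "cloud"}]

def ancestors(label):
    """Walk up the ontology from a detected class to the root."""
    chain = []
    while label is not None:
        chain.append(label)
        label = ONTOLOGY.get(label)
    return chain

def build_scene_graph(image):
    g = nx.DiGraph()
    for det in detect_local(image) + detect_cloud(image):
        g.add_node(det["label"], position=det["position"],
                   source=det["source"])
        # Link each detection into the class hierarchy with "is_a" edges.
        chain = ancestors(det["label"])
        for child, parent in zip(chain, chain[1:]):
            g.add_edge(child, parent, relation="is_a")
    return g

if __name__ == "__main__":
    graph = build_scene_graph(image=None)
    print(graph.nodes(data=True))
    print(graph.edges(data=True))
```

The design point being illustrated: the graph is agnostic to where a detection came from, so cheap onboard perception and slower cloud calls can feed the same representation.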
Related papers
- Time is on my sight: scene graph filtering for dynamic environment perception in an LLM-driven robot [0.8515309662618664]
This paper presents a robot control architecture that addresses key challenges in human-robot interaction.
The architecture uses Large Language Models to integrate diverse information sources, including natural language commands.
The architecture enhances adaptability, task efficiency, and human-robot collaboration in dynamic environments.
arXiv Detail & Related papers (2024-11-22T15:58:26Z)
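A minimal sketch of the scene-graph filtering idea named in the title above. This is an assumption-laden illustration (time-stamped nodes pruned by a freshness window), not the paper's method:

```python
# Hypothetical sketch of time-based scene graph filtering: drop nodes
# whose last observation is older than a freshness window, so the graph
# tracks a dynamic environment. Not code from the paper above.
import time
import networkx as nx

def filter_stale_nodes(graph: nx.Graph, max_age_s: float = 30.0) -> nx.Graph:
    """Return a copy of the graph keeping only recently observed nodes."""
    now = time.time()
    fresh = [n for n, data in graph.nodes(data=True)
             if now - data.get("last_seen", 0.0) <= max_age_s]
    return graph.subgraph(fresh).copy()

g = nx.Graph()
g.add_node("person_1", last_seen=time.time())          # just observed
g.add_node("chair_3", last_seen=time.time() - 120.0)   # stale observation
g.add_edge("person_1", "chair_3", relation="near")
print(filter_stale_nodes(g).nodes)  # only person_1 survives
```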
- ReALFRED: An Embodied Instruction Following Benchmark in Photo-Realistic Environments [13.988804095409133]
We propose the ReALFRED benchmark that employs real-world scenes, objects, and room layouts to train agents to complete household tasks.
Specifically, we extend the ALFRED benchmark with updates for larger environmental spaces with smaller visual domain gaps.
With ReALFRED, we analyze previously crafted methods for the ALFRED benchmark and observe that they consistently yield lower performance in all metrics.
arXiv Detail & Related papers (2024-07-26T07:00:27Z)
- Cognitive Planning for Object Goal Navigation using Generative AI Models [0.979851640406258]
We present a novel framework for solving the object goal navigation problem that generates efficient exploration strategies.
Our approach enables a robot to navigate unfamiliar environments by leveraging Large Language Models (LLMs) and Large Vision-Language Models (LVLMs).
arXiv Detail & Related papers (2024-03-30T10:54:59Z)
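A toy sketch of how the LLM-driven exploration step above might look. Everything here is hypothetical: `build_exploration_prompt` and `query_llm` are invented stand-ins, not the paper's interface.

```python
# Hypothetical sketch of LLM-based object-goal exploration: turn the
# robot's current observations into a prompt and ask a language model
# which region to explore next. `query_llm` stands in for a real LLM
# client; the prompt format is invented for illustration.

def build_exploration_prompt(goal: str, visible_objects: list[str],
                             frontier_rooms: list[str]) -> str:
    return (
        f"You are guiding a robot searching for a {goal}.\n"
        f"Visible objects: {', '.join(visible_objects)}.\n"
        f"Unexplored rooms: {', '.join(frontier_rooms)}.\n"
        "Answer with the single most promising room to explore next."
    )

def query_llm(prompt: str) -> str:
    """Stand-in for an actual LLM call; returns a canned answer here."""
    return "kitchen"

prompt = build_exploration_prompt(
    goal="mug",
    visible_objects=["sofa", "tv"],
    frontier_rooms=["kitchen", "bathroom"],
)
print(query_llm(prompt))  # -> "kitchen"
```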
"Agent AI" is a class of interactive systems that can perceive visual stimuli, language inputs, and other environmentally-grounded data.
We envision a future where people can easily create any virtual reality or simulated scene and interact with agents embodied within the virtual environment.
arXiv Detail & Related papers (2024-01-07T19:11:18Z)
- Graphical Object-Centric Actor-Critic [55.2480439325792]
We propose a novel object-centric reinforcement learning algorithm combining actor-critic and model-based approaches.
We use a transformer encoder to extract object representations and graph neural networks to approximate the dynamics of an environment.
Our algorithm performs better in a visually complex 3D robotic environment and a 2D environment with compositional structure than the state-of-the-art model-free actor-critic algorithm.
arXiv Detail & Related papers (2023-10-26T06:05:12Z)
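To give the flavor of the architecture above, here is a hand-rolled sketch of one graph message-passing step over object features, standing in for the learned dynamics model (not the paper's code; the transformer encoder that would produce the object slots is omitted).

```python
# Illustrative sketch (not the paper's architecture): a single graph
# message-passing step that predicts next-step object states from
# per-object feature vectors, as a stand-in for a learned dynamics model.
import torch
import torch.nn as nn

class ObjectDynamicsGNN(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.message = nn.Linear(2 * dim, dim)  # edge message from (src, dst)
        self.update = nn.Linear(2 * dim, dim)   # node update from (state, agg)

    def forward(self, objs: torch.Tensor) -> torch.Tensor:
        # objs: (num_objects, dim), fully connected interaction graph.
        n, d = objs.shape
        src = objs.unsqueeze(1).expand(n, n, d)  # sender features
        dst = objs.unsqueeze(0).expand(n, n, d)  # receiver features
        msgs = torch.relu(self.message(torch.cat([src, dst], dim=-1)))
        agg = msgs.sum(dim=0)                    # aggregate incoming messages
        return self.update(torch.cat([objs, agg], dim=-1))  # next states

slots = torch.randn(4, 32)   # e.g. 4 object slots from a transformer encoder
model = ObjectDynamicsGNN(dim=32)
print(model(slots).shape)    # -> torch.Size([4, 32])
```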
- SCIM: Simultaneous Clustering, Inference, and Mapping for Open-World Semantic Scene Understanding [34.19666841489646]
We show how a robot can autonomously discover novel semantic classes and improve accuracy on known classes when exploring an unknown environment.
We develop a general framework for mapping and clustering that we then use to generate a self-supervised learning signal to update a semantic segmentation model.
In particular, we show how clustering parameters can be optimized during deployment and that fusion of multiple observation modalities improves novel object discovery compared to prior work.
arXiv Detail & Related papers (2022-06-21T18:41:51Z)
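A toy illustration of the clustering step above, sketched with scikit-learn rather than SCIM's actual pipeline: embeddings of segments the known-class model cannot explain are grouped into candidate novel classes that could serve as pseudo-labels.

```python
# Toy sketch of novel-class discovery by clustering (not SCIM itself):
# feature embeddings of unrecognized segments are grouped into candidate
# classes, which could then serve as pseudo-labels for self-supervision.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Stand-in for embeddings of segments the known-class model rejected:
# two synthetic blobs around different centers.
unknown_feats = np.vstack([
    rng.normal(loc=0.0, scale=0.1, size=(50, 16)),
    rng.normal(loc=1.0, scale=0.1, size=(50, 16)),
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
pseudo_labels = kmeans.fit_predict(unknown_feats)
print(np.bincount(pseudo_labels))  # two discovered clusters, ~50 each
```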
- GraphMapper: Efficient Visual Navigation by Scene Graph Generation [13.095640044666348]
We propose a method to train an autonomous agent to accumulate a 3D scene graph representation of its environment.
We show that our approach, GraphMapper, can act as a modular scene encoder that operates alongside existing learning-based solutions.
arXiv Detail & Related papers (2022-05-17T13:21:20Z)
- Optical flow-based branch segmentation for complex orchard environments [73.11023209243326]
We train a neural network system in simulation using only simulated RGB data and optical flow.
The resulting neural network performs foreground segmentation of branches in a busy orchard environment without additional real-world training or any special setup or equipment beyond a standard camera.
Our results show that our system is highly accurate and, when compared to a network using manually labeled RGBD data, achieves significantly more consistent and robust performance across environments that differ from the training set.
arXiv Detail & Related papers (2022-02-26T03:38:20Z)
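A sketch of the kind of input pipeline described above, assuming OpenCV's Farneback flow; the 5-channel stacking is illustrative, not the paper's exact formulation.

```python
# Hypothetical sketch of an RGB + optical-flow input pipeline (not the
# paper's code): compute dense Farneback flow between consecutive frames
# with OpenCV and stack it with the RGB image as segmentation input.
import cv2
import numpy as np

def rgb_flow_input(prev_bgr: np.ndarray, curr_bgr: np.ndarray) -> np.ndarray:
    prev_gray = cv2.cvtColor(prev_bgr, cv2.COLOR_BGR2GRAY)
    curr_gray = cv2.cvtColor(curr_bgr, cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(
        prev_gray, curr_gray, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
    # (H, W, 3) RGB + (H, W, 2) flow -> (H, W, 5) network input.
    return np.concatenate([curr_bgr.astype(np.float32), flow], axis=-1)

prev = np.random.randint(0, 255, (64, 64, 3), dtype=np.uint8)
curr = np.random.randint(0, 255, (64, 64, 3), dtype=np.uint8)
print(rgb_flow_input(prev, curr).shape)  # -> (64, 64, 5)
```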
- OG-SGG: Ontology-Guided Scene Graph Generation. A Case Study in Transfer Learning for Telepresence Robotics [124.08684545010664]
Scene graph generation from images is a task of great interest to applications such as robotics.
We propose an initial approximation to a framework called Ontology-Guided Scene Graph Generation (OG-SGG).
arXiv Detail & Related papers (2022-02-21T13:23:15Z)
- RICE: Refining Instance Masks in Cluttered Environments with Graph Neural Networks [53.15260967235835]
We propose a novel framework that refines the output of such methods by utilizing a graph-based representation of instance masks.
We train deep networks capable of sampling smart perturbations to the segmentations, and a graph neural network, which can encode relations between objects, to evaluate the segmentations.
We demonstrate an application that uses uncertainty estimates generated by our method to guide a manipulator, leading to efficient understanding of cluttered scenes.
arXiv Detail & Related papers (2021-06-29T20:29:29Z)
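An illustrative sketch of the graph-of-masks representation above (not RICE's code): instance masks become nodes, and masks that touch are connected, giving a graph neural network relational structure to score.

```python
# Illustrative sketch of the graph-of-masks representation (not RICE's
# code): instance masks become nodes, and overlapping masks are linked.
import numpy as np
import networkx as nx

def masks_to_graph(masks: list[np.ndarray]) -> nx.Graph:
    g = nx.Graph()
    for i, m in enumerate(masks):
        g.add_node(i, area=int(m.sum()))
    for i in range(len(masks)):
        for j in range(i + 1, len(masks)):
            # Connect masks whose pixels overlap; a real system might
            # dilate masks first to also catch near-adjacent pairs.
            if np.logical_and(masks[i], masks[j]).any():
                g.add_edge(i, j, relation="touching")
    return g

a = np.zeros((8, 8), dtype=bool); a[0:4, 0:4] = True
b = np.zeros((8, 8), dtype=bool); b[3:6, 3:6] = True   # overlaps a
c = np.zeros((8, 8), dtype=bool); c[6:8, 6:8] = True   # isolated
g = masks_to_graph([a, b, c])
print(g.edges)  # -> [(0, 1)]
```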
- SAPIEN: A SimulAted Part-based Interactive ENvironment [77.4739790629284]
SAPIEN is a realistic and physics-rich simulated environment that hosts a large-scale set of articulated objects.
We evaluate state-of-the-art vision algorithms for part detection and motion attribute recognition as well as demonstrate robotic interaction tasks.
arXiv Detail & Related papers (2020-03-19T00:11:34Z)