Modeling Dynamic Environments with Scene Graph Memory
- URL: http://arxiv.org/abs/2305.17537v4
- Date: Mon, 12 Jun 2023 17:25:06 GMT
- Title: Modeling Dynamic Environments with Scene Graph Memory
- Authors: Andrey Kurenkov, Michael Lingelbach, Tanmay Agarwal, Emily Jin,
Chengshu Li, Ruohan Zhang, Li Fei-Fei, Jiajun Wu, Silvio Savarese, Roberto
Martín-Martín
- Abstract summary: We present a new type of link prediction problem: link prediction on partially observable dynamic graphs.
Our graph is a representation of a scene in which rooms and objects are nodes, and their relationships are encoded in the edges.
We propose a novel state representation -- Scene Graph Memory (SGM) -- which captures the agent's accumulated set of observations.
We evaluate our method in the Dynamic House Simulator, a new benchmark that creates diverse dynamic graphs following the semantic patterns typically seen at homes.
- Score: 46.587536843634055
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Embodied AI agents that search for objects in large environments such as
households often need to make efficient decisions by predicting object
locations based on partial information. We pose this as a new type of link
prediction problem: link prediction on partially observable dynamic graphs. Our
graph is a representation of a scene in which rooms and objects are nodes, and
their relationships are encoded in the edges; only parts of the changing graph
are known to the agent at each timestep. This partial observability poses a
challenge to existing link prediction approaches, which we address. We propose
a novel state representation -- Scene Graph Memory (SGM) -- which captures the
agent's accumulated set of observations, as well as a neural net architecture
called a Node Edge Predictor (NEP) that extracts information from the SGM to
search efficiently. We evaluate our method in the Dynamic House Simulator, a
new benchmark that creates diverse dynamic graphs following the semantic
patterns typically seen at homes, and show that NEP can be trained to predict
the locations of objects in a variety of environments with diverse object
movement dynamics, outperforming baselines both in terms of new scene
adaptability and overall accuracy. The codebase and more can be found at
https://www.scenegraphmemory.com.
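The representation described in the abstract, a memory that accumulates partial observations of object-location edges, can be illustrated with a minimal sketch. This is not the authors' SGM or NEP implementation: the class name `SceneGraphMemorySketch`, its methods, and the example node names are all hypothetical, and the ranking step is a simple frequency heuristic standing in for a learned predictor.

```python
from collections import defaultdict


class SceneGraphMemorySketch:
    """Hypothetical accumulator of observed (object, location) edges.

    An illustrative assumption only, not the paper's data structure.
    """

    def __init__(self):
        # How many times each (object, location) edge has been observed.
        self.edge_counts = defaultdict(int)
        # Most recent (location, timestep) at which each object was seen.
        self.last_seen = {}

    def observe(self, timestep, location, visible_objects):
        """Record a partial observation: the objects seen at one location."""
        for obj in visible_objects:
            self.edge_counts[(obj, location)] += 1
            self.last_seen[obj] = (location, timestep)

    def rank_locations(self, obj):
        """Rank candidate locations by observation count (frequency heuristic).

        Ties are broken alphabetically so the ordering is deterministic.
        """
        counts = {loc: n for (o, loc), n in self.edge_counts.items() if o == obj}
        return sorted(counts, key=lambda loc: (-counts[loc], loc))


memory = SceneGraphMemorySketch()
memory.observe(0, "kitchen_counter", ["mug", "kettle"])
memory.observe(1, "living_room_table", ["mug"])
memory.observe(2, "kitchen_counter", ["mug"])
ranking = memory.rank_locations("mug")
print(ranking)
```

In this sketch the agent only ever updates the edges it actually observes, so the memory stays partial by construction; a learned model would replace `rank_locations` with a prediction over candidate edges.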
Related papers
- TESGNN: Temporal Equivariant Scene Graph Neural Networks for Efficient and Robust Multi-View 3D Scene Understanding [8.32401190051443]
We present the first implementation of an Equivariant Scene Graph Neural Network (ESGNN) to generate semantic scene graphs from 3D point clouds.
Our combined architecture, termed the Temporal Equivariant Scene Graph Neural Network (TESGNN), not only surpasses existing state-of-the-art methods in scene estimation accuracy but also achieves faster convergence.
arXiv Detail & Related papers (2024-11-15T15:39:04Z)
- Local-Global Information Interaction Debiasing for Dynamic Scene Graph Generation [51.92419880088668]
We propose a novel DynSGG model based on multi-task learning, DynSGG-MTL, which introduces the local interaction information and global human-action interaction information.
Long-temporal human actions supervise the model to generate multiple scene graphs that conform to the global constraints and avoid the model being unable to learn the tail predicates.
arXiv Detail & Related papers (2023-08-10T01:24:25Z)
- Dynamic Graph Message Passing Networks for Visual Recognition [112.49513303433606]
Modelling long-range dependencies is critical for scene understanding tasks in computer vision.
A fully-connected graph is beneficial for such modelling, but its computational overhead is prohibitive.
We propose a dynamic graph message passing network that significantly reduces the computational complexity.
arXiv Detail & Related papers (2022-09-20T14:41:37Z)
- Scene Graph Modification as Incremental Structure Expanding [61.84291817776118]
We focus on scene graph modification (SGM), where the system is required to learn how to update an existing scene graph based on a natural language query.
We frame SGM as a graph expansion task by introducing the incremental structure expanding (ISE).
We construct a challenging dataset that contains more complicated queries and larger scene graphs than existing datasets.
arXiv Detail & Related papers (2022-09-15T16:26:14Z)
- GEMS: Scene Expansion using Generative Models of Graphs [3.5998698847215165]
We focus on one such representation, scene graphs, and propose a novel scene expansion task.
We first predict a new node and then predict the set of relationships between the newly predicted node and previous nodes in the graph.
We conduct extensive experiments on Visual Genome and VRD datasets to evaluate the expanded scene graphs.
arXiv Detail & Related papers (2022-07-08T07:41:28Z)
- A Graph-based Interactive Reasoning for Human-Object Interaction Detection [71.50535113279551]
We present a novel graph-based interactive reasoning model called Interactive Graph (abbr. in-Graph) to infer HOIs.
We construct a new framework to assemble in-Graph models for detecting HOIs, namely in-GraphNet.
Our framework is end-to-end trainable and free from costly annotations like human pose.
arXiv Detail & Related papers (2020-07-14T09:29:03Z)
- Structural Temporal Graph Neural Networks for Anomaly Detection in Dynamic Graphs [54.13919050090926]
We propose an end-to-end structural temporal Graph Neural Network model for detecting anomalous edges in dynamic graphs.
In particular, we first extract the $h$-hop enclosing subgraph centered on the target edge and propose the node labeling function to identify the role of each node in the subgraph.
Based on the extracted features, we utilize Gated recurrent units (GRUs) to capture the temporal information for anomaly detection.
arXiv Detail & Related papers (2020-05-15T09:17:08Z)
- 3D Dynamic Scene Graphs: Actionable Spatial Perception with Places, Objects, and Humans [27.747241700017728]
We present a unified representation for actionable spatial perception: 3D Dynamic Scene Graphs.
3D Dynamic Scene Graphs can have a profound impact on planning and decision-making, human-robot interaction, long-term autonomy, and scene prediction.
arXiv Detail & Related papers (2020-02-15T00:46:32Z)