Iterative Scene Graph Generation
- URL: http://arxiv.org/abs/2207.13440v1
- Date: Wed, 27 Jul 2022 10:37:29 GMT
- Title: Iterative Scene Graph Generation
- Authors: Siddhesh Khandelwal and Leonid Sigal
- Abstract summary: Scene graph generation involves identifying object entities and their corresponding interaction predicates in a given image (or video).
Existing approaches to scene graph generation assume a certain factorization of the joint distribution to make the estimation feasible.
We propose a novel framework that addresses this limitation, as well as introduces dynamic conditioning on the image.
- Score: 55.893695946885174
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The task of scene graph generation entails identifying object entities and
their corresponding interaction predicates in a given image (or video). Due to
the combinatorially large solution space, existing approaches to scene graph
generation assume certain factorization of the joint distribution to make the
estimation feasible (e.g., assuming that objects are conditionally independent
of predicate predictions). However, this fixed factorization is not ideal under
all scenarios (e.g., for images where an object entailed in interaction is
small and not discernible on its own). In this work, we propose a novel
framework for scene graph generation that addresses this limitation, as well as
introduces dynamic conditioning on the image, using message passing in a Markov
Random Field. This is implemented as an iterative refinement procedure wherein
each modification is conditioned on the graph generated in the previous
iteration. This conditioning across refinement steps allows joint reasoning
over entities and relations. This framework is realized via a novel and
end-to-end trainable transformer-based architecture. In addition, the proposed
framework can improve existing approach performance. Through extensive
experiments on Visual Genome and Action Genome benchmark datasets we show
improved performance on the scene graph generation.
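The iterative refinement idea in the abstract can be illustrated with a toy example. The sketch below is not the paper's transformer architecture. It assumes a single (object, predicate) pair, hand-picked scores, and a synchronous mean-field-style update in a pairwise Markov Random Field. The names `refine`, `unary_obj`, `unary_pred`, and `compat` are hypothetical and chosen only for illustration. It shows how an object that is not discernible on its own can be disambiguated once its estimate is conditioned on the predicate estimate from the previous step.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def refine(unary_obj, unary_pred, compat, steps=3):
    """Toy iterative refinement of one (object, predicate) pair.

    unary_obj[i]  : image evidence score for object label i (assumed input)
    unary_pred[j] : image evidence score for predicate label j (assumed input)
    compat[i][j]  : compatibility score of object i with predicate j (assumed)
    """
    n_obj, n_pred = len(unary_obj), len(unary_pred)
    # Step 0: initial estimates from image evidence alone.
    p_obj = softmax(unary_obj)
    p_pred = softmax(unary_pred)
    for _ in range(steps):
        # Both messages are computed from the PREVIOUS step's estimates,
        # so each update is conditioned on the previously generated graph.
        msg_to_obj = [
            sum(compat[i][j] * p_pred[j] for j in range(n_pred))
            for i in range(n_obj)
        ]
        msg_to_pred = [
            sum(compat[i][j] * p_obj[i] for i in range(n_obj))
            for j in range(n_pred)
        ]
        p_obj = softmax([u + m for u, m in zip(unary_obj, msg_to_obj)])
        p_pred = softmax([u + m for u, m in zip(unary_pred, msg_to_pred)])
    return p_obj, p_pred


# The object is ambiguous by itself (equal scores), while the predicate is clear.
unary_obj = [0.0, 0.0]
unary_pred = [2.0, 0.0]
compat = [[2.0, 0.0], [0.0, 2.0]]  # object i is compatible with predicate i

p_obj, p_pred = refine(unary_obj, unary_pred, compat)
# After refinement the object estimate shifts toward label 0.
```

With `steps=0` the object estimate stays uniform, which corresponds to a fixed factorization in which objects are predicted independently of predicates.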
Related papers
- Self-Supervised Relation Alignment for Scene Graph Generation [44.3983804479146]
We introduce a self-supervised relational alignment regularization to improve scene graph generation performance.
The proposed alignment is general and can be combined with any existing scene graph generation framework.
We illustrate the effectiveness of this self-supervised relational alignment in conjunction with two scene graph generation architectures.
arXiv Detail & Related papers (2023-02-02T20:34:13Z)
- Iterative Scene Graph Generation with Generative Transformers [6.243995448840211]
Scene graphs provide a rich, structured representation of a scene by encoding the entities (objects) and their spatial relationships in a graphical format.
Current approaches take a generation-by-classification approach where the scene graph is generated through labeling of all possible edges between objects in a scene.
This work introduces a generative transformer-based approach to generating scene graphs beyond link prediction.
arXiv Detail & Related papers (2022-11-30T00:05:44Z)
- DisPositioNet: Disentangled Pose and Identity in Semantic Image Manipulation [83.51882381294357]
DisPositioNet is a model that learns a disentangled representation for each object for the task of image manipulation using scene graphs.
Our framework enables the disentanglement of the variational latent embeddings as well as the feature representation in the graph.
arXiv Detail & Related papers (2022-11-10T11:47:37Z)
- Scene Graph Modification as Incremental Structure Expanding [61.84291817776118]
We focus on scene graph modification (SGM), where the system is required to learn how to update an existing scene graph based on a natural language query.
We frame SGM as a graph expansion task by introducing incremental structure expanding (ISE).
We construct a challenging dataset that contains more complicated queries and larger scene graphs than existing datasets.
arXiv Detail & Related papers (2022-09-15T16:26:14Z)
- SGTR: End-to-end Scene Graph Generation with Transformer [41.606381084893194]
Scene Graph Generation (SGG) remains a challenging visual understanding task due to its complex compositional property.
We propose a novel SGG method to address the aforementioned issues, which formulates the task as a bipartite graph construction problem.
arXiv Detail & Related papers (2021-12-24T07:10:18Z)
- Unconditional Scene Graph Generation [72.53624470737712]
We develop a deep auto-regressive model called SceneGraphGen which can learn the probability distribution over labelled and directed graphs.
We show that the scene graphs generated by SceneGraphGen are diverse and follow the semantic patterns of real-world scenes.
arXiv Detail & Related papers (2021-08-12T17:57:16Z)
- Segmentation-grounded Scene Graph Generation [47.34166260639392]
We propose a framework for pixel-level segmentation-grounded scene graph generation.
Our framework is agnostic to the underlying scene graph generation method.
It is learned in a multi-task manner with both target and auxiliary datasets.
arXiv Detail & Related papers (2021-04-29T08:54:08Z)
- A Graph-based Interactive Reasoning for Human-Object Interaction Detection [71.50535113279551]
We present a novel graph-based interactive reasoning model called Interactive Graph (abbr. in-Graph) to infer HOIs.
We construct a new framework to assemble in-Graph models for detecting HOIs, namely in-GraphNet.
Our framework is end-to-end trainable and free from costly annotations like human pose.
arXiv Detail & Related papers (2020-07-14T09:29:03Z)