A Comprehensive Survey of Scene Graphs: Generation and Application
- URL: http://arxiv.org/abs/2104.01111v5
- Date: Fri, 7 Jan 2022 01:35:21 GMT
- Title: A Comprehensive Survey of Scene Graphs: Generation and Application
- Authors: Xiaojun Chang, Pengzhen Ren, Pengfei Xu, Zhihui Li, Xiaojiang Chen,
and Alex Hauptmann
- Abstract summary: A scene graph is a structured representation of a scene that can clearly express the objects, attributes, and relationships between objects in the scene.
No relatively systematic survey of scene graphs exists at present.
- Score: 42.07469181785126
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A scene graph is a structured representation of a scene that can clearly
express the objects, attributes, and relationships between objects in the
scene. As computer vision technology continues to develop, people are no longer
satisfied with simply detecting and recognizing objects in images; instead,
people look forward to a higher level of understanding and reasoning about
visual scenes. For example, given an image, we want to not only detect and
recognize objects in the image, but also know the relationship between objects
(visual relationship detection), and generate a text description (image
captioning) based on the image content. Alternatively, we might want the
machine to tell us what the little girl in the image is doing (Visual Question
Answering (VQA)), or even remove the dog from the image and find similar images
(image editing and retrieval). These tasks require a higher level of
understanding of, and reasoning about, visual content. The scene graph is a
powerful tool for exactly this kind of scene understanding. Therefore, scene graphs have
attracted the attention of a large number of researchers, and related research
is often cross-modal, complex, and rapidly developing. However, no relatively
systematic survey of scene graphs exists at present. To this end, this survey
conducts a comprehensive investigation of the current scene graph research.
More specifically, we first summarize the general definition of the scene
graph, then give a comprehensive and systematic discussion of scene graph
generation (SGG) methods and of SGG with the aid of prior
knowledge. We then investigate the main applications of scene graphs and
summarize the most commonly used datasets. Finally, we provide some insights
into the future development of scene graphs. We believe this will be a very
helpful foundation for future research on scene graphs.
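As a rough illustration of the definition in the abstract (objects, their attributes, and relationships between objects), a scene graph can be sketched as a small labelled directed graph. This is a minimal sketch, not the survey's formalism: the class name `SceneGraph`, its methods, and the example objects ("girl", "leash", "dog") are hypothetical and chosen only to echo the abstract's example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class SceneGraph:
    """Toy scene graph: objects are nodes, relationships are labelled directed edges."""

    # object name -> set of attribute strings (hypothetical layout)
    attributes: Dict[str, Set[str]] = field(default_factory=dict)
    # (subject, predicate, object) triples
    relations: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_object(self, name: str, *attrs: str) -> None:
        # Create the node if needed and attach any attributes to it.
        self.attributes.setdefault(name, set()).update(attrs)

    def add_relation(self, subj: str, pred: str, obj: str) -> None:
        # Make sure both endpoints exist as nodes before adding the edge.
        self.add_object(subj)
        self.add_object(obj)
        self.relations.append((subj, pred, obj))

    def relations_of(self, subj: str) -> List[Tuple[str, str]]:
        # All (predicate, object) pairs whose subject is `subj`.
        return [(p, o) for s, p, o in self.relations if s == subj]


# Illustrative scene: a little girl holding a leash attached to a brown dog.
g = SceneGraph()
g.add_object("girl", "little")
g.add_object("dog", "brown")
g.add_relation("girl", "holding", "leash")
g.add_relation("leash", "attached to", "dog")

# "What is the little girl doing?" becomes a lookup of her outgoing edges.
print(g.relations_of("girl"))
```

Storing relationships as subject-predicate-object triples keeps the sketch close to how the abstract describes the structure; real SGG systems would additionally carry bounding boxes and confidence scores, which are omitted here.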
Related papers
- Symbolic image detection using scene and knowledge graphs [39.49756199669471]
We use a scene graph, a graph representation of an image, to capture visual components.
We generate a knowledge graph using facts extracted from ConceptNet to reason about objects and attributes.
We extend the network further to use an attention mechanism which learns the importance of the graph on representations.
arXiv Detail & Related papers (2022-06-10T04:06:28Z)
- SGEITL: Scene Graph Enhanced Image-Text Learning for Visual Commonsense Reasoning [61.57887011165744]
Multimodal Transformers have made great progress in the task of Visual Commonsense Reasoning.
We propose a Scene Graph Enhanced Image-Text Learning framework to incorporate visual scene graphs in commonsense reasoning.
arXiv Detail & Related papers (2021-12-16T03:16:30Z)
- Unconditional Scene Graph Generation [72.53624470737712]
We develop a deep auto-regressive model called SceneGraphGen which can learn the probability distribution over labelled and directed graphs.
We show that the scene graphs generated by SceneGraphGen are diverse and follow the semantic patterns of real-world scenes.
arXiv Detail & Related papers (2021-08-12T17:57:16Z)
- Image Scene Graph Generation (SGG) Benchmark [58.33119409657256]
There is a surge of interest in image scene graph generation (object and relationship detection).
Due to the lack of a good benchmark, the reported results of different scene graph generation models are not directly comparable.
We have developed a much-needed scene graph generation benchmark based on the maskrcnn-benchmark and several popular models.
arXiv Detail & Related papers (2021-07-27T05:10:09Z)
- Graphhopper: Multi-Hop Scene Graph Reasoning for Visual Question Answering [13.886692497676659]
Graphhopper is a novel method that approaches the task by integrating knowledge graph reasoning, computer vision, and natural language processing techniques.
We derive a scene graph that describes the objects in the image, as well as their attributes and their mutual relationships.
A reinforcement learning agent is trained to autonomously navigate in a multi-hop manner over the extracted scene graph to generate reasoning paths.
arXiv Detail & Related papers (2021-07-13T18:33:04Z)
- Understanding the Role of Scene Graphs in Visual Question Answering [26.02889386248289]
We conduct experiments on the GQA dataset which presents a challenging set of questions requiring counting, compositionality and advanced reasoning capability.
We adopt image + question architectures for use with scene graphs, evaluate various scene graph generation techniques for unseen images, and propose a training curriculum to leverage human-annotated and auto-generated scene graphs.
We present a multi-faceted study into the use of scene graphs for Visual Question Answering, making this work the first of its kind.
arXiv Detail & Related papers (2021-01-14T07:27:37Z)
- Sketching Image Gist: Human-Mimetic Hierarchical Scene Graph Generation [98.34909905511061]
We argue that a desirable scene graph should be hierarchically constructed, and introduce a new scheme for modeling scene graphs.
To generate a scene graph based on a Hierarchical Entity Tree (HET), we parse the HET with a Hybrid Long Short-Term Memory (Hybrid-LSTM) which specifically encodes hierarchy and sibling context.
To further prioritize key relations in the scene graph, we devise a Relation Ranking Module (RRM) to dynamically adjust their rankings.
arXiv Detail & Related papers (2020-07-17T05:12:13Z)
- Scene Graph Reasoning for Visual Question Answering [23.57543808056452]
We propose a novel method that approaches the task by performing context-driven, sequential reasoning based on the objects and their semantic and spatial relationships present in the scene.
A reinforcement agent then learns to autonomously navigate over the extracted scene graph to generate paths, which are then the basis for deriving answers.
arXiv Detail & Related papers (2020-07-02T13:02:54Z)
- Visual Relationship Detection using Scene Graphs: A Survey [1.3505077405741583]
A Scene Graph is a technique to better represent a scene and the various relationships present in it.
We present a detailed survey of the various techniques for scene graph generation, their efficacy in representing visual relationships, and how they have been used to solve various downstream tasks.
arXiv Detail & Related papers (2020-05-16T17:06:06Z)
- Learning 3D Semantic Scene Graphs from 3D Indoor Reconstructions [94.17683799712397]
We focus on scene graphs, a data structure that organizes the entities of a scene in a graph.
We propose a learned method that regresses a scene graph from the point cloud of a scene.
We show the application of our method in a domain-agnostic retrieval task, where graphs serve as an intermediate representation for 3D-3D and 2D-3D matching.
arXiv Detail & Related papers (2020-04-08T12:25:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.