Image Scene Graph Generation (SGG) Benchmark
- URL: http://arxiv.org/abs/2107.12604v1
- Date: Tue, 27 Jul 2021 05:10:09 GMT
- Title: Image Scene Graph Generation (SGG) Benchmark
- Authors: Xiaotian Han, Jianwei Yang, Houdong Hu, Lei Zhang, Jianfeng Gao,
Pengchuan Zhang
- Abstract summary: There is a surge of interest in image scene graph generation (object, attribute, and relationship detection).
Due to the lack of a good benchmark, the reported results of different scene graph generation models are not directly comparable.
We have developed a much-needed scene graph generation benchmark based on the maskrcnn-benchmark and several popular models.
- Score: 58.33119409657256
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There is a surge of interest in image scene graph generation (object,
attribute and relationship detection) due to the need of building fine-grained
image understanding models that go beyond object detection. Due to the lack of
a good benchmark, the reported results of different scene graph generation
models are not directly comparable, impeding the research progress. We have
developed a much-needed scene graph generation benchmark based on the
maskrcnn-benchmark and several popular models. This paper presents main
features of our benchmark and a comprehensive ablation study of scene graph
generation models using the Visual Genome and OpenImages Visual relationship
detection datasets. Our codebase is made publicly available at
https://github.com/microsoft/scene_graph_benchmark.
Related papers
- FACTUAL: A Benchmark for Faithful and Consistent Textual Scene Graph Parsing [66.70054075041487]
Existing scene graph parsers that convert image captions into scene graphs often suffer from two types of errors.
First, the generated scene graphs fail to capture the true semantics of the captions or the corresponding images, resulting in a lack of faithfulness.
Second, the generated scene graphs have high inconsistency, with the same semantics represented by different annotations.
arXiv Detail & Related papers (2023-05-27T15:38:31Z)
- SPAN: Learning Similarity between Scene Graphs and Images with Transformers [29.582313604112336]
We propose a Scene graPh-imAge coNtrastive learning framework, SPAN, that can measure the similarity between scene graphs and images.
We introduce a novel graph serialization technique that transforms a scene graph into a sequence with structural encodings.
arXiv Detail & Related papers (2023-04-02T18:13:36Z)
- MIGS: Meta Image Generation from Scene Graphs [48.82382997154196]
We propose MIGS (Meta Image Generation from Scene Graphs), a meta-learning based approach for few-shot image generation from graphs.
By sampling the data in a task-driven fashion, we train the generator using meta-learning on different sets of tasks that are categorized based on the scene attributes.
Our results show that using this meta-learning approach for the generation of images from scene graphs achieves state-of-the-art performance in terms of image quality and capturing the semantic relationships in the scene.
arXiv Detail & Related papers (2021-10-22T17:02:44Z)
- Scene Graph Generation for Better Image Captioning? [48.411957217304]
We propose a model that leverages detected objects and auto-generated visual relationships to describe images in natural language.
We generate a scene graph from raw image pixels by identifying individual objects and visual relationships between them.
This scene graph then serves as input to our graph-to-text model, which generates the final caption.
arXiv Detail & Related papers (2021-09-23T14:35:11Z)
- Learning to Generate Scene Graph from Natural Language Supervision [52.18175340725455]
We propose one of the first methods that learn from image-sentence pairs to extract a graphical representation of localized objects and their relationships within an image, known as a scene graph.
We leverage an off-the-shelf object detector to identify and localize object instances, match labels of detected regions to concepts parsed from captions, and thus create "pseudo" labels for learning scene graphs.
arXiv Detail & Related papers (2021-09-06T03:38:52Z)
- Unconditional Scene Graph Generation [72.53624470737712]
We develop a deep auto-regressive model called SceneGraphGen which can learn the probability distribution over labelled and directed graphs.
We show that the scene graphs generated by SceneGraphGen are diverse and follow the semantic patterns of real-world scenes.
arXiv Detail & Related papers (2021-08-12T17:57:16Z)
- Are scene graphs good enough to improve Image Captioning? [19.36188161855731]
We investigate the use of scene graphs in image captioning.
We find no significant difference between models that use scene graph features and models that only use object detection features.
Although the quality of predicted scene graphs is very low in general, when using high quality scene graphs we obtain gains of up to 3.3 CIDEr.
arXiv Detail & Related papers (2020-09-25T16:09:08Z)