SGDraw: Scene Graph Drawing Interface Using Object-Oriented
Representation
- URL: http://arxiv.org/abs/2211.16697v2
- Date: Fri, 16 Jun 2023 09:02:16 GMT
- Title: SGDraw: Scene Graph Drawing Interface Using Object-Oriented
Representation
- Authors: Tianyu Zhang, Xusheng Du, Chia-Ming Chang, Xi Yang, Haoran Xie
- Abstract summary: We propose SGDraw, a scene graph drawing interface using object-oriented scene graph representation.
We show that SGDraw can help generate scene graphs with richer details and describe the images more accurately than traditional bounding box annotations.
- Score: 18.109884282338356
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Scene understanding is an essential and challenging task in computer vision.
To provide the visually fundamental graphical structure of an image, the scene
graph has received increased attention due to its powerful semantic
representation. However, it is difficult to draw a proper scene graph for image
retrieval, image generation, and multi-modal applications. The conventional
scene graph annotation interface is not easy to use in image annotations, and
the automatic scene graph generation approaches using deep neural networks are
prone to generate redundant content while disregarding details. In this work,
we propose SGDraw, a scene graph drawing interface using object-oriented scene
graph representation to help users draw and edit scene graphs interactively.
For the proposed object-oriented representation, we consider the objects,
attributes, and relationships of objects as a structural unit. SGDraw provides
a web-based scene graph annotation and generation tool for scene understanding
applications. To verify the effectiveness of the proposed interface, we
conducted a comparison study with a conventional tool and a user experience
study. The results show that SGDraw can help generate scene graphs with richer
details and describe the images more accurately than traditional bounding box
annotations. We believe the proposed SGDraw can be useful in various vision
tasks, such as image retrieval and generation.
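The object-oriented representation described in the abstract treats each object, together with its attributes and relationships, as one structural unit. A minimal sketch of such a unit is shown below; the class and field names are illustrative assumptions, not SGDraw's actual data model:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Relationship:
    """A directed relationship from the owning object to a target object."""
    predicate: str   # e.g. "holding"
    target: str      # name of the related object

@dataclass
class SceneObject:
    """One structural unit: an object bundled with its attributes and relationships."""
    name: str
    attributes: List[str] = field(default_factory=list)
    relationships: List[Relationship] = field(default_factory=list)

# A tiny scene graph for "a standing woman holding a red umbrella"
woman = SceneObject(
    name="woman",
    attributes=["standing"],
    relationships=[Relationship(predicate="holding", target="umbrella")],
)
umbrella = SceneObject(name="umbrella", attributes=["red"])
scene_graph = [woman, umbrella]
```

Grouping attributes and relationships under the object they belong to (rather than keeping flat, separate node and edge lists) is what makes the unit easy to draw and edit interactively as a whole.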
Related papers
- SelfGraphVQA: A Self-Supervised Graph Neural Network for Scene-based
Question Answering [0.0]
Scene graphs have emerged as a useful tool for multimodal image analysis.
Current methods that utilize idealized annotated scene graphs struggle to generalize when using predicted scene graphs extracted from images.
Our approach extracts a scene graph from an input image using a pre-trained scene graph generator.
arXiv Detail & Related papers (2023-10-03T07:14:53Z)
- Diffusion-Based Scene Graph to Image Generation with Masked Contrastive
Pre-Training [112.94542676251133]
We propose to learn scene graph embeddings by directly optimizing their alignment with images.
Specifically, we pre-train an encoder to extract both global and local information from scene graphs.
The resulting method, called SGDiff, allows for the semantic manipulation of generated images by modifying scene graph nodes and connections.
arXiv Detail & Related papers (2022-11-21T01:11:19Z)
- Image Semantic Relation Generation [0.76146285961466]
Scene graphs can distil complex image information and correct the bias of visual models using semantic-level relations.
In this work, we introduce image semantic relation generation (ISRG), a simple but effective image-to-text model.
arXiv Detail & Related papers (2022-10-19T16:15:19Z)
- Scene Graph Modification as Incremental Structure Expanding [61.84291817776118]
We focus on scene graph modification (SGM), where the system is required to learn how to update an existing scene graph based on a natural language query.
We frame SGM as a graph expansion task by introducing incremental structure expanding (ISE).
We construct a challenging dataset that contains more complicated queries and larger scene graphs than existing datasets.
arXiv Detail & Related papers (2022-09-15T16:26:14Z)
- Symbolic image detection using scene and knowledge graphs [39.49756199669471]
We use a scene graph, a graph representation of an image, to capture visual components.
We generate a knowledge graph using facts extracted from ConceptNet to reason about objects and attributes.
We further extend the network with an attention mechanism that learns the importance of the graph representations.
arXiv Detail & Related papers (2022-06-10T04:06:28Z)
- Scene Graph Expansion for Semantics-Guided Image Outpainting [27.249757777855176]
We propose a novel network of Scene Graph Transformer (SGT), which is designed to take node and edge features as inputs for modeling the associated structural information.
To better understand and process graph-based inputs, our SGT uniquely performs feature attention at both node and edge levels.
We demonstrate that, given a partial input image with its layout and scene graph, our SGT can be applied for scene graph expansion and its conversion to a complete layout.
arXiv Detail & Related papers (2022-05-05T23:13:43Z)
- SGEITL: Scene Graph Enhanced Image-Text Learning for Visual Commonsense
Reasoning [61.57887011165744]
Multimodal Transformers have made great progress in the task of Visual Commonsense Reasoning.
We propose a Scene Graph Enhanced Image-Text Learning framework to incorporate visual scene graphs in commonsense reasoning.
arXiv Detail & Related papers (2021-12-16T03:16:30Z)
- Graph-to-3D: End-to-End Generation and Manipulation of 3D Scenes Using
Scene Graphs [85.54212143154986]
Controllable scene synthesis consists of generating 3D information that satisfies underlying specifications.
Scene graphs are representations of a scene composed of objects (nodes) and inter-object relationships (edges).
We propose the first work that directly generates shapes from a scene graph in an end-to-end manner.
arXiv Detail & Related papers (2021-08-19T17:59:07Z)
- Unconditional Scene Graph Generation [72.53624470737712]
We develop a deep auto-regressive model called SceneGraphGen which can learn the probability distribution over labelled and directed graphs.
We show that the scene graphs generated by SceneGraphGen are diverse and follow the semantic patterns of real-world scenes.
arXiv Detail & Related papers (2021-08-12T17:57:16Z)
- Learning Physical Graph Representations from Visual Scenes [56.7938395379406]
Physical Scene Graphs (PSGs) represent scenes as hierarchical graphs with nodes corresponding intuitively to object parts at different scales, and edges to physical connections between parts.
PSGNet augments standard CNNs by including: recurrent feedback connections to combine low and high-level image information; graph pooling and vectorization operations that convert spatially-uniform feature maps into object-centric graph structures.
We show that PSGNet outperforms alternative self-supervised scene representation algorithms at scene segmentation tasks.
arXiv Detail & Related papers (2020-06-22T16:10:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.