SGTR+: End-to-end Scene Graph Generation with Transformer
- URL: http://arxiv.org/abs/2401.12835v1
- Date: Tue, 23 Jan 2024 15:18:20 GMT
- Title: SGTR+: End-to-end Scene Graph Generation with Transformer
- Authors: Rongjie Li, Songyang Zhang, Xuming He
- Abstract summary: Scene Graph Generation (SGG) remains a challenging visual understanding task due to its compositional property.
Most previous works adopt a bottom-up, two-stage or point-based, one-stage approach, which often suffers from high time complexity or suboptimal designs.
We propose a novel SGG method to address the aforementioned issues, formulating the task as a bipartite graph construction problem.
- Score: 42.396971149458324
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Scene Graph Generation (SGG) remains a challenging visual understanding task
due to its compositional property. Most previous works adopt a bottom-up,
two-stage or point-based, one-stage approach, which often suffers from high
time complexity or suboptimal designs. In this work, we propose a novel SGG
method to address the aforementioned issues, formulating the task as a
bipartite graph construction problem. Specifically, we create a
transformer-based end-to-end framework to generate the entity and entity-aware
predicate proposal set, and infer directed edges to form relation triplets.
Moreover, we design a graph assembling module to infer the connectivity of the
bipartite scene graph based on our entity-aware structure, enabling us to
generate the scene graph in an end-to-end manner. Building on the bipartite
graph assembling paradigm, we further propose a new technical design that
improves the efficacy of entity-aware modeling and the optimization stability
of graph assembling. Equipped with the enhanced entity-aware design, our method
achieves strong performance with low time complexity. Extensive experimental results show
that our design is able to achieve the state-of-the-art or comparable
performance on three challenging benchmarks, surpassing most of the existing
approaches and offering higher inference efficiency. Code is available at:
https://github.com/Scarecrow0/SGTR
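The bipartite-graph view of a scene graph can be illustrated with a minimal sketch: entity proposals and entity-aware predicate proposals are generated separately, and a graph assembling step links each predicate's subject/object representation to its closest entity to form (subject, predicate, object) triplets. The cosine-similarity nearest-neighbor matching below is an illustrative stand-in for the paper's learned graph assembling module, and all names are hypothetical.

```python
import numpy as np

def assemble_bipartite_scene_graph(entity_feats, pred_subj_feats, pred_obj_feats):
    """Toy bipartite graph assembling: link each predicate proposal's
    entity-aware subject/object representation to its nearest entity proposal
    (by cosine similarity), yielding (subject, predicate, object) index triplets.
    This matching rule is an illustrative assumption, not the paper's exact design.
    """
    def cosine(a, b):
        # Row-normalize both sets of features, then take pairwise dot products.
        a = a / np.linalg.norm(a, axis=-1, keepdims=True)
        b = b / np.linalg.norm(b, axis=-1, keepdims=True)
        return a @ b.T

    subj_idx = cosine(pred_subj_feats, entity_feats).argmax(axis=1)
    obj_idx = cosine(pred_obj_feats, entity_feats).argmax(axis=1)
    # One triplet per predicate proposal: (subject entity, predicate, object entity).
    return list(zip(subj_idx.tolist(),
                    range(len(pred_subj_feats)),
                    obj_idx.tolist()))
```

Because every predicate proposal carries its own subject and object representations, the assembling step is a simple per-predicate lookup over entities rather than a quadratic enumeration of all entity pairs, which is where the efficiency of the bipartite formulation comes from.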
Related papers
- VectorGraphNET: Graph Attention Networks for Accurate Segmentation of Complex Technical Drawings [0.40964539027092917]
This paper introduces a new approach to extract and analyze vector data from technical drawings in PDF format.
Our method involves converting PDF files into SVG format and creating a feature-rich graph representation.
We then apply a graph attention transformer with hierarchical label definition to achieve accurate line-level segmentation.
arXiv Detail & Related papers (2024-10-02T08:53:20Z)
- A Pure Transformer Pretraining Framework on Text-attributed Graphs [50.833130854272774]
We introduce a feature-centric pretraining perspective by treating graph structure as a prior.
Our framework, Graph Sequence Pretraining with Transformer (GSPT), samples node contexts through random walks.
GSPT can be easily adapted to both node classification and link prediction, demonstrating promising empirical success on various datasets.
arXiv Detail & Related papers (2024-06-19T22:30:08Z)
- S^2Former-OR: Single-Stage Bi-Modal Transformer for Scene Graph Generation in OR [50.435592120607815]
Scene graph generation (SGG) of surgical procedures is crucial for enhancing holistic cognitive intelligence in the operating room (OR).
Previous works have primarily relied on multi-stage learning, where the generated semantic scene graphs depend on intermediate processes with pose estimation and object detection.
In this study, we introduce a novel single-stage bi-modal transformer framework for SGG in the OR, termed S2Former-OR.
arXiv Detail & Related papers (2024-02-22T11:40:49Z)
- Graph Transformer GANs with Graph Masked Modeling for Architectural Layout Generation [153.92387500677023]
We present a novel graph Transformer generative adversarial network (GTGAN) to learn effective graph node relations.
The proposed graph Transformer encoder combines graph convolutions and self-attentions in a Transformer to model both local and global interactions.
We also propose a novel self-guided pre-training method for graph representation learning.
arXiv Detail & Related papers (2024-01-15T14:36:38Z)
- Explore Contextual Information for 3D Scene Graph Generation [43.66442227874461]
3D scene graph generation (SGG) has been of high interest in computer vision.
We propose a framework fully exploring contextual information for the 3D SGG task.
Our approach achieves superior or competitive performance over previous methods on the 3DSSG dataset.
arXiv Detail & Related papers (2022-10-12T14:26:17Z)
- Iterative Scene Graph Generation [55.893695946885174]
Scene graph generation involves identifying object entities and their corresponding interaction predicates in a given image (or video).
Existing approaches to scene graph generation assume a certain factorization of the joint distribution to make the estimation feasible.
We propose a novel framework that addresses this limitation, as well as introduces dynamic conditioning on the image.
arXiv Detail & Related papers (2022-07-27T10:37:29Z)
- SGTR: End-to-end Scene Graph Generation with Transformer [41.606381084893194]
Scene Graph Generation (SGG) remains a challenging visual understanding task due to its complex compositional property.
We propose a novel SGG method that formulates the task as a bipartite graph construction problem.
arXiv Detail & Related papers (2021-12-24T07:10:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.