Hyper-relationship Learning Network for Scene Graph Generation
- URL: http://arxiv.org/abs/2202.07271v1
- Date: Tue, 15 Feb 2022 09:26:16 GMT
- Title: Hyper-relationship Learning Network for Scene Graph Generation
- Authors: Yibing Zhan, Zhi Chen, Jun Yu, BaoSheng Yu, Dacheng Tao, Yong Luo
- Abstract summary: We propose a hyper-relationship learning network, termed HLN, for scene graph generation.
We evaluate HLN on the most popular SGG dataset, i.e., the Visual Genome dataset.
- For example, the proposed HLN improves the recall per relationship from 11.3% to 13.1% and the recall per image from 19.8% to 34.9%.
- Score: 95.6796681398668
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating informative scene graphs from images requires integrating and
reasoning from various graph components, i.e., objects and relationships.
However, current scene graph generation (SGG) methods, including the unbiased
SGG methods, still struggle to predict informative relationships due to the
lack of 1) high-level inference such as transitive inference between
relationships and 2) efficient mechanisms that can incorporate all interactions
of graph components. To address the issues mentioned above, we devise a
hyper-relationship learning network, termed HLN, for SGG. Specifically, the
proposed HLN stems from hypergraphs, and two graph attention networks (GATs) are
designed to infer relationships: 1) the object-relationship GAT or OR-GAT to
explore interactions between objects and relationships, and 2) the
hyper-relationship GAT or HR-GAT to integrate transitive inference of
hyper-relationships, i.e., the sequential relationships between three objects
for transitive reasoning. As a result, HLN significantly improves the
performance of scene graph generation by integrating and reasoning from object
interactions, relationship interactions, and transitive inference of
hyper-relationships. We evaluate HLN on the most popular SGG dataset, i.e., the
Visual Genome dataset, and the experimental results demonstrate its great
superiority over recent state-of-the-art methods. For example, the proposed HLN
improves the recall per relationship from 11.3% to 13.1% and the recall per
image from 19.8% to 34.9%. We will release the source code and
pretrained models on GitHub.
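To make the two mechanisms described in the abstract more concrete, the following is a minimal PyTorch sketch (not the authors' released code) of (1) an attention step in which each relationship attends over its two endpoint objects, the idea behind OR-GAT, and (2) the enumeration of hyper-relationships as ordered object triples (i, j, k), the structure HR-GAT would reason over. Module names, tensor shapes, and the residual update are assumptions made for illustration.
```python
# Illustrative sketch only; all names and shapes are assumptions, not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ObjectRelationshipAttention(nn.Module):
    """Each relationship feature attends over its two endpoint object features."""
    def __init__(self, dim):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, obj_feats, rel_feats, rel_pairs):
        # obj_feats: [N, dim], rel_feats: [M, dim], rel_pairs: [M, 2] (subject, object) indices
        q = self.q(rel_feats)                                          # [M, dim]
        endpoints = torch.stack([obj_feats[rel_pairs[:, 0]],
                                 obj_feats[rel_pairs[:, 1]]], dim=1)   # [M, 2, dim]
        att = torch.einsum('md,mtd->mt', q, self.k(endpoints))         # [M, 2]
        att = F.softmax(att / q.size(-1) ** 0.5, dim=-1)               # scaled dot-product attention
        context = torch.einsum('mt,mtd->md', att, self.v(endpoints))   # [M, dim]
        return rel_feats + context                                     # residual update

def enumerate_hyper_relationships(rel_pairs):
    """List ordered triples (i, j, k) such that (i, j) and (j, k) are both
    candidate relationships; each triple is one hyper-relationship that
    supports transitive reasoning about the pair (i, k)."""
    by_subject = {}
    for idx, (subj, obj) in enumerate(rel_pairs.tolist()):
        by_subject.setdefault(subj, []).append((idx, obj))
    triples = []
    for idx_ij, (i, j) in enumerate(rel_pairs.tolist()):
        for idx_jk, k in by_subject.get(j, []):
            if k != i:
                triples.append((i, j, k, idx_ij, idx_jk))
    return triples
```
A full model would stack such attention layers and feed the triple-indexed relationship features through a second attention block; the sketch only shows the data flow.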
Related papers
- Unbiased Scene Graph Generation by Type-Aware Message Passing on Heterogeneous and Dual Graphs [1.0609815608017066]
An unbiased scene graph generation method (TA-HDG) is proposed to address these issues.
For modeling interactive and non-interactive relations, the Interactive Graph Construction is proposed.
The Type-Aware Message Passing enhances the understanding of complex interactions.
arXiv Detail & Related papers (2024-11-20T12:54:47Z)
- Semantic Scene Graph Generation Based on an Edge Dual Scene Graph and Message Passing Neural Network [3.9280441311534653]
Scene graph generation (SGG) captures the relationships between objects in an image and creates a structured graph-based representation.
Existing SGG methods have a limited ability to accurately predict detailed relationships.
A new approach to modeling multi-object relationships, called edge dual scene graph generation (EdgeSGG), is proposed.
arXiv Detail & Related papers (2023-11-02T12:36:52Z)
- Unbiased Heterogeneous Scene Graph Generation with Relation-aware Message Passing Neural Network [9.779600950401315]
We propose an unbiased heterogeneous scene graph generation (HetSGG) framework that captures relation-aware context.
We devise a novel message passing layer, called relation-aware message passing neural network (RMP), that aggregates the contextual information of an image.
arXiv Detail & Related papers (2022-12-01T11:25:36Z)
- HL-Net: Heterophily Learning Network for Scene Graph Generation [90.2766568914452]
We propose a novel Heterophily Learning Network (HL-Net) to explore the homophily and heterophily between objects/relationships in scene graphs.
HL-Net comprises, among other components, an adaptive reweighting transformer module that adaptively integrates information from different layers to exploit both the heterophily and homophily among objects.
We conducted extensive experiments on two public datasets: Visual Genome (VG) and Open Images (OI).
arXiv Detail & Related papers (2022-05-03T06:00:29Z)
- Relation Regularized Scene Graph Generation [206.76762860019065]
Scene graph generation (SGG) is built on top of detected objects to predict object pairwise visual relations.
We propose a relation regularized network (R2-Net) which can predict whether there is a relationship between two objects.
Our R2-Net can effectively refine object labels and generate scene graphs.
arXiv Detail & Related papers (2022-02-22T11:36:49Z)
- Not All Relations are Equal: Mining Informative Labels for Scene Graph Generation [48.21846438269506]
Scene graph generation (SGG) aims to capture a wide variety of interactions between pairs of objects.
Existing SGG methods fail to acquire complex reasoning about visual and textual correlations due to various biases in training data.
We propose a novel framework for SGG training that exploits relation labels based on their informativeness.
arXiv Detail & Related papers (2021-11-26T14:34:12Z)
- ConsNet: Learning Consistency Graph for Zero-Shot Human-Object Interaction Detection [101.56529337489417]
We consider the problem of Human-Object Interaction (HOI) Detection, which aims to locate and recognize HOI instances in the form of <human, action, object> in images.
We argue that multi-level consistencies among objects, actions and interactions are strong cues for generating semantic representations of rare or previously unseen HOIs.
Our model takes visual features of candidate human-object pairs and word embeddings of HOI labels as inputs, maps them into a visual-semantic joint embedding space, and obtains detection results by measuring their similarities (a schematic sketch of this matching step follows the related-papers list below).
arXiv Detail & Related papers (2020-08-14T09:11:18Z)
- Tensor Graph Convolutional Networks for Multi-relational and Robust Learning [74.05478502080658]
This paper introduces a tensor-graph convolutional network (TGCN) for scalable semi-supervised learning (SSL) from data associated with a collection of graphs that are represented by a tensor.
The proposed architecture achieves markedly improved performance relative to standard GCNs, copes with state-of-the-art adversarial attacks, and leads to remarkable SSL performance over protein-to-protein interaction networks.
arXiv Detail & Related papers (2020-03-15T02:33:21Z)
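As a rough illustration of the visual-semantic matching step summarized in the ConsNet entry above, the sketch below projects candidate human-object pair features and HOI label word embeddings into a shared space and scores them by cosine similarity. The class name, projection layers, and dimensions are hypothetical placeholders, not ConsNet's actual architecture.
```python
# Hypothetical sketch of joint visual-semantic scoring for HOI labels.
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointEmbeddingScorer(nn.Module):
    def __init__(self, visual_dim, word_dim, joint_dim):
        super().__init__()
        self.visual_proj = nn.Linear(visual_dim, joint_dim)  # projects pair features
        self.label_proj = nn.Linear(word_dim, joint_dim)     # projects HOI label embeddings

    def forward(self, pair_feats, label_embs):
        # pair_feats: [P, visual_dim] features of candidate human-object pairs
        # label_embs: [C, word_dim] word embeddings of the HOI labels
        v = F.normalize(self.visual_proj(pair_feats), dim=-1)  # [P, joint_dim]
        t = F.normalize(self.label_proj(label_embs), dim=-1)   # [C, joint_dim]
        return v @ t.t()  # [P, C] cosine similarities used as detection scores

# Usage with placeholder shapes:
scorer = JointEmbeddingScorer(visual_dim=1024, word_dim=300, joint_dim=512)
scores = scorer(torch.randn(8, 1024), torch.randn(600, 300))  # [8, 600]
```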
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.