Dual ResGCN for Balanced Scene Graph Generation
- URL: http://arxiv.org/abs/2011.04234v1
- Date: Mon, 9 Nov 2020 07:44:17 GMT
- Title: Dual ResGCN for Balanced Scene Graph Generation
- Authors: Jingyi Zhang, Yong Zhang, Baoyuan Wu, Yanbo Fan, Fumin Shen and Heng Tao Shen
- Abstract summary: We propose a novel model, dubbed dual ResGCN, which consists of an object residual graph convolutional network and a relation residual graph convolutional network.
The two networks are complementary to each other. The former captures object-level context information, i.e., the connections among objects.
The latter is carefully designed to explicitly capture relation-level context information, i.e., the connections among relations.
- Score: 106.7828712878278
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual scene graph generation is a challenging task. Previous works have
achieved great progress, but most of them do not explicitly consider the class
imbalance issue in scene graph generation. Models learned without considering
the class imbalance tend to predict the majority classes, which leads to good
performance on trivial frequent predicates but poor performance on informative
infrequent predicates. However, predicates of minority classes often carry more
semantic and precise information (e.g., 'on' vs. 'parked on'). To alleviate the
influence of the class imbalance, we propose a novel model, dubbed dual ResGCN,
which consists of an object residual graph convolutional network and a relation
residual graph convolutional network. The two networks are complementary to
each other. The former captures object-level context information, i.e., the
connections among objects. We propose a novel ResGCN that enhances object
features in a cross-attention manner. Besides, we stack multiple contextual
coefficients to alleviate the imbalance issue and enrich the prediction
diversity. The latter is carefully designed to explicitly capture
relation-level context information, i.e., the connections among relations. We
propose to incorporate a prior about the co-occurrence of relation pairs into
the graph to further help alleviate the class imbalance issue. Extensive
evaluations on three tasks are performed on the large-scale Visual Genome (VG)
dataset to demonstrate the superiority of the proposed method.
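The residual graph-convolution idea at the core of the abstract can be illustrated with a minimal sketch. The layer form H' = H + ReLU(Â H W), the toy co-occurrence matrix standing in for the relation-pair prior, and all shapes are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

def residual_gcn_layer(H, A, W):
    """One residual GCN layer: H' = H + ReLU(A_hat @ H @ W).

    H: (n, d) node features, A: (n, n) adjacency, W: (d, d) weights.
    A_hat adds self-loops and row-normalizes, so each node mixes in its
    neighbors' features; the residual term preserves the original input.
    """
    A_hat = A + np.eye(A.shape[0])                      # self-loops
    A_hat = A_hat / A_hat.sum(axis=1, keepdims=True)    # row-normalize
    return H + np.maximum(A_hat @ H @ W, 0.0)           # residual + ReLU

# Toy relation graph: edge weights derived from hypothetical
# co-occurrence counts of relation pairs, standing in for the prior.
cooc = np.array([[0., 3., 1.],
                 [3., 0., 2.],
                 [1., 2., 0.]])
A = cooc / cooc.max()

rng = np.random.default_rng(0)
H = rng.normal(size=(3, 4))   # features for 3 relations, dim 4
W = np.zeros((4, 4))          # zero weights: layer reduces to identity
assert np.allclose(residual_gcn_layer(H, A, W), H)
```

The zero-weight check makes the residual property explicit: when the learned transform contributes nothing, the layer passes its input through unchanged, which is what lets many such layers be stacked without degrading features.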
Related papers
- Redundancy-Free Self-Supervised Relational Learning for Graph Clustering [13.176413653235311]
We propose a novel self-supervised deep graph clustering method named Redundancy-Free Graph Clustering (R$2$FGC)
It extracts the attribute- and structure-level relational information from both global and local views based on an autoencoder and a graph autoencoder.
Our experiments are performed on widely used benchmark datasets to validate the superiority of our R$2$FGC over state-of-the-art baselines.
arXiv Detail & Related papers (2023-09-09T06:18:50Z) - Multi-Label Meta Weighting for Long-Tailed Dynamic Scene Graph Generation [55.429541407920304]
Recognizing the predicate between subject and object pairs is imbalanced and multi-label in nature.
Recent state-of-the-art methods predominantly focus on the most frequently occurring predicate classes.
We introduce a multi-label meta-learning framework to deal with the biased predicate distribution.
arXiv Detail & Related papers (2023-06-16T18:14:23Z) - Learnable Graph Matching: A Practical Paradigm for Data Association [74.28753343714858]
We propose a general learnable graph matching method to address these issues.
Our method achieves state-of-the-art performance on several MOT datasets.
For image matching, our method outperforms state-of-the-art methods on a popular indoor dataset, ScanNet.
arXiv Detail & Related papers (2023-03-27T17:39:00Z) - Unbiased Heterogeneous Scene Graph Generation with Relation-aware Message Passing Neural Network [9.779600950401315]
We propose an unbiased heterogeneous scene graph generation (HetSGG) framework that captures relation-aware context.
We devise a novel message passing layer, called relation-aware message passing neural network (RMP), that aggregates the contextual information of an image.
arXiv Detail & Related papers (2022-12-01T11:25:36Z) - Explanation Graph Generation via Pre-trained Language Models: An Empirical Study with Contrastive Learning [84.35102534158621]
We study pre-trained language models that generate explanation graphs in an end-to-end manner.
We propose simple yet effective ways of graph perturbations via node and edge edit operations.
Our methods lead to significant improvements in both structural and semantic accuracy of explanation graphs.
arXiv Detail & Related papers (2022-04-11T00:58:27Z) - Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have been shown powerful capacity at modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z) - Semi-Supervised Graph-to-Graph Translation [31.47555366566109]
Graph translation is a promising research direction and has a wide range of potential real-world applications.
One important obstacle is the lack of high-quality paired datasets.
We propose to construct a dual representation space, where transformation is performed explicitly to model the semantic transitions.
arXiv Detail & Related papers (2021-03-16T03:24:20Z) - Generative Compositional Augmentations for Scene Graph Prediction [27.535630110794855]
Inferring objects and their relationships from an image in the form of a scene graph is useful in many applications at the intersection of vision and language.
We consider a challenging problem of compositional generalization that emerges in this task due to a long tail data distribution.
We propose and empirically study a model based on conditional generative adversarial networks (GANs) that allows us to generate visual features of perturbed scene graphs.
arXiv Detail & Related papers (2020-07-11T12:11:53Z) - Graph Density-Aware Losses for Novel Compositions in Scene Graph Generation [27.535630110794855]
Scene graph generation aims to predict graph-structured descriptions of input images.
It is important - yet challenging - to perform well on novel (zero-shot) or rare (few-shot) compositions of objects and relationships.
We show that the standard loss used in this task is unintentionally a function of scene graph density.
We introduce a density-normalized edge loss, which provides more than a two-fold improvement in certain generalization metrics.
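The density-normalization idea above can be sketched briefly. The summary does not give the exact loss, so treating it as an average of per-edge terms (rather than a sum that grows with graph density) is an assumption here:

```python
def density_normalized_edge_loss(per_edge_losses):
    """Sketch: normalize the summed edge loss by the number of edges,
    so denser scene graphs do not dominate the training objective.
    (Assumed form; the paper's actual loss may differ.)"""
    n = len(per_edge_losses)
    return sum(per_edge_losses) / max(n, 1)

# A dense graph (many edges) and a sparse one with the same per-edge
# error contribute equally to the objective after normalization:
dense = [0.5] * 20
sparse = [0.5] * 2
assert density_normalized_edge_loss(dense) == density_normalized_edge_loss(sparse)
```

Without the normalization, the dense graph's summed loss would be ten times larger, biasing the model toward images with many annotated relationships.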
arXiv Detail & Related papers (2020-05-17T11:45:29Z) - Iterative Context-Aware Graph Inference for Visual Dialog [126.016187323249]
We propose a novel Context-Aware Graph (CAG) neural network.
Each node in the graph corresponds to a joint semantic feature, including both object-based (visual) and history-related (textual) context representations.
arXiv Detail & Related papers (2020-04-05T13:09:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.