Addressing Class Imbalance in Scene Graph Parsing by Learning to
Contrast and Score
- URL: http://arxiv.org/abs/2009.13331v2
- Date: Mon, 5 Oct 2020 13:06:45 GMT
- Title: Addressing Class Imbalance in Scene Graph Parsing by Learning to
Contrast and Score
- Authors: He Huang, Shunta Saito, Yuta Kikuchi, Eiichi Matsumoto, Wei Tang,
Philip S. Yu
- Abstract summary: Scene graph parsing aims to detect objects in an image scene and recognize their relations.
Recent approaches have achieved high average scores on some popular benchmarks, but fail in detecting rare relations.
This paper introduces a novel integrated framework of classification and ranking to resolve the class imbalance problem.
- Score: 65.18522219013786
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Scene graph parsing aims to detect objects in an image scene and recognize
their relations. Recent approaches have achieved high average scores on some
popular benchmarks, but fail in detecting rare relations, as the highly
long-tailed distribution of data biases the learning towards frequent labels.
Motivated by the fact that detecting these rare relations can be critical in
real-world applications, this paper introduces a novel integrated framework of
classification and ranking to resolve the class imbalance problem in scene
graph parsing. Specifically, we design a new Contrasting Cross-Entropy loss,
which promotes the detection of rare relations by suppressing incorrect
frequent ones. Furthermore, we propose a novel scoring module, termed as
Scorer, which learns to rank the relations based on the image features and
relation features to improve the recall of predictions. Our framework is simple
and effective, and can be incorporated into current scene graph models.
Experimental results show that the proposed approach improves the current
state-of-the-art methods, with a clear advantage of detecting rare relations.
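The abstract describes the Contrasting Cross-Entropy loss only at a high level: it promotes rare relations by suppressing incorrect frequent ones. The sketch below shows one plausible reading of that idea and is not the paper's formulation. It adds a penalty on the most confident wrong class to ordinary cross-entropy. The function name `contrasting_loss_sketch`, the `weight` parameter, and the exact form of the penalty term are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def contrasting_loss_sketch(logits, target, weight=1.0, eps=1e-12):
    """Hypothetical loss: cross-entropy plus suppression of the hardest wrong class.

    This is an illustrative assumption, not the loss defined in the paper.
    """
    probs = softmax(logits)
    # Standard cross-entropy term on the ground-truth relation.
    ce = -math.log(max(probs[target], eps))
    # Probability of the most confident incorrect relation
    # (often a frequent label when the true label is rare).
    hardest_wrong = max(p for i, p in enumerate(probs) if i != target)
    # Penalty that grows as the incorrect relation becomes more confident.
    suppress = -math.log(max(1.0 - hardest_wrong, eps))
    return ce + weight * suppress


# Class 0 plays the role of a frequent relation that dominates the logits.
logits = [3.0, 0.0, 0.0]
loss_when_rare_is_true = contrasting_loss_sketch(logits, target=1)
loss_when_frequent_is_true = contrasting_loss_sketch(logits, target=0)
```

With these logits, the loss is much larger when the true label is the rare class 1, because both the cross-entropy term and the suppression term penalize the over-confident frequent class 0. Setting `weight=0.0` reduces the sketch to plain cross-entropy.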
Related papers
- Robust Contrastive Learning against Noisy Views [79.71880076439297]
We propose a new contrastive loss function that is robust against noisy views.
We show that our approach provides consistent improvements over the state of the art on image, video, and graph contrastive learning benchmarks.
arXiv Detail & Related papers (2022-01-12T05:24:29Z)
- Joint Graph Learning and Matching for Semantic Feature Correspondence [69.71998282148762]
We propose a joint graph learning and matching network, named GLAM, to explore reliable graph structures for boosting graph matching.
The proposed method is evaluated on three popular visual matching benchmarks (Pascal VOC, Willow Object and SPair-71k).
It outperforms previous state-of-the-art graph matching methods by significant margins on all benchmarks.
arXiv Detail & Related papers (2021-09-01T08:24:02Z)
- Instance-Level Relative Saliency Ranking with Graph Reasoning [126.09138829920627]
We present a novel unified model to segment salient instances and infer relative saliency rank order.
A novel loss function is also proposed to effectively train the saliency ranking branch.
Experimental results demonstrate that our proposed model is more effective than previous methods.
arXiv Detail & Related papers (2021-07-08T13:10:42Z)
- Model-Agnostic Graph Regularization for Few-Shot Learning [60.64531995451357]
We present a comprehensive study on graph embedded few-shot learning.
We introduce a graph regularization approach that allows a deeper understanding of the impact of incorporating graph information between labels.
Our approach improves the performance of strong base learners by up to 2% on Mini-ImageNet and 6.7% on ImageNet-FS.
arXiv Detail & Related papers (2021-02-14T05:28:13Z)
- One-shot Learning for Temporal Knowledge Graphs [49.41854171118697]
We propose a one-shot learning framework for link prediction in temporal knowledge graphs.
Our proposed method employs a self-attention mechanism to effectively encode temporal interactions between entities.
Our experiments show that the proposed algorithm outperforms state-of-the-art baselines on two well-studied benchmarks.
arXiv Detail & Related papers (2020-10-23T03:24:44Z)
- Generative Compositional Augmentations for Scene Graph Prediction [27.535630110794855]
Inferring objects and their relationships from an image in the form of a scene graph is useful in many applications at the intersection of vision and language.
We consider a challenging problem of compositional generalization that emerges in this task due to a long tail data distribution.
We propose and empirically study a model based on conditional generative adversarial networks (GANs) that allows us to generate visual features of perturbed scene graphs.
arXiv Detail & Related papers (2020-07-11T12:11:53Z)
- Explanation-based Weakly-supervised Learning of Visual Relations with Graph Networks [7.199745314783952]
This paper introduces a novel weakly-supervised method for visual relationship detection that relies on minimal image-level predicate labels.
A graph neural network is trained to classify predicates in images from a graph representation of detected objects, implicitly encoding an inductive bias for pairwise relations.
We present results comparable to recent fully- and weakly-supervised methods on three diverse and challenging datasets.
arXiv Detail & Related papers (2020-06-16T23:14:52Z)