Peer Learning for Unbiased Scene Graph Generation
- URL: http://arxiv.org/abs/2301.00146v1
- Date: Sat, 31 Dec 2022 07:56:35 GMT
- Title: Peer Learning for Unbiased Scene Graph Generation
- Authors: Liguang Zhou, Junjie Hu, Yuhongze Zhou, Tin Lun Lam, Yangsheng Xu
- Abstract summary: We propose a novel framework dubbed peer learning to deal with the problem of biased scene graph generation (SGG).
This framework uses predicate sampling and consensus voting (PSCV) to encourage different peers to learn from each other.
We have established a new state-of-the-art (SOTA) on the SGCls task by achieving a mean of 31.6.
- Score: 16.69329808479805
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this paper, we propose a novel framework dubbed peer learning to deal with
the problem of biased scene graph generation (SGG). This framework uses
predicate sampling and consensus voting (PSCV) to encourage different peers to
learn from each other, improving model diversity and mitigating bias in SGG. To
address the heavily long-tailed distribution of predicate classes, we propose
to use predicate sampling to divide and conquer this issue. As a result, the
model is less biased and makes more balanced predicate predictions.
Specifically, one peer may not be sufficiently diverse to discriminate between
different levels of predicate distributions. Therefore, we sample the data
distribution based on frequency of predicates into sub-distributions, selecting
head, body, and tail classes to combine and feed to different peers as
complementary predicate knowledge during the training process. The
complementary predicate knowledge of these peers is then ensembled utilizing a
consensus voting strategy, which simulates a civilized voting process in our
society that emphasizes the majority opinion and diminishes the minority
opinion. This approach ensures that the learned representations of each peer
are optimally adapted to the various data distributions. Extensive experiments
on the Visual Genome dataset demonstrate that PSCV outperforms previous
methods. We have established a new state-of-the-art (SOTA) on the SGCls task by
achieving a mean of 31.6.
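The two steps described in the abstract, frequency-based predicate sampling and consensus voting, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `split_predicates` and `consensus_vote`, the equal-thirds cut-offs for head, body, and tail, and the rule that down-weights peers who disagree with the majority by a fixed `minority_weight`.

```python
from collections import Counter


def split_predicates(labels, head_frac=1 / 3, body_frac=1 / 3):
    """Rank predicate classes by frequency and cut them into three groups.

    The fractions are hypothetical; the abstract does not give thresholds.
    """
    ranked = [pred for pred, _ in Counter(labels).most_common()]
    n_head = max(1, round(len(ranked) * head_frac))
    n_body = max(1, round(len(ranked) * body_frac))
    return {
        "head": ranked[:n_head],
        "body": ranked[n_head:n_head + n_body],
        "tail": ranked[n_head + n_body:],
    }


def consensus_vote(peer_probs, minority_weight=0.5):
    """Fuse per-peer class probabilities, favouring the majority opinion.

    Each peer votes for its top class. Peers that agree with the most
    common vote keep weight 1.0; the others are scaled by minority_weight
    (an assumed rule). The result is the weighted average distribution.
    """
    votes = [max(range(len(p)), key=p.__getitem__) for p in peer_probs]
    majority = Counter(votes).most_common(1)[0][0]
    weights = [1.0 if v == majority else minority_weight for v in votes]
    total = sum(weights)
    n_classes = len(peer_probs[0])
    return [
        sum(w * p[c] for w, p in zip(weights, peer_probs)) / total
        for c in range(n_classes)
    ]


if __name__ == "__main__":
    # Toy predicate labels with a long-tailed frequency profile.
    labels = ["on"] * 6 + ["has"] * 5 + ["wearing"] * 4 + ["riding"] * 3
    labels += ["eating"] * 2 + ["flying in"]
    print(split_predicates(labels))

    # Three peers scoring the same subject-object pair over three classes.
    print(consensus_vote([[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.1, 0.8, 0.1]]))
```

In this sketch each peer would be trained on a different combination of the returned groups, and `consensus_vote` would be applied at inference time; how the paper actually combines groups per peer and weights the votes may differ.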
Related papers
- Ensemble Predicate Decoding for Unbiased Scene Graph Generation [40.01591739856469]
Scene Graph Generation (SGG) aims to generate a comprehensive graphical representation that captures semantic information of a given scenario.
The model's performance in predicting more fine-grained predicates is hindered by a significant predicate bias.
This paper proposes Ensemble Predicate Decoding (EPD), which employs multiple decoders to attain unbiased scene graph generation.
arXiv Detail & Related papers (2024-08-26T11:24:13Z)
- Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers.
We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes.
We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
arXiv Detail & Related papers (2023-08-28T18:48:34Z)
- Panoptic Scene Graph Generation with Semantics-Prototype Learning [23.759498629378772]
Panoptic Scene Graph Generation (PSG) parses objects and predicts their relationships (predicate) to connect human language and visual scenes.
Different language preferences of annotators and semantic overlaps between predicates lead to biased predicate annotations.
We propose a novel framework named ADTrans to adaptively transfer biased predicate annotations to informative and unified ones.
arXiv Detail & Related papers (2023-07-28T14:04:06Z)
- Unbiased Scene Graph Generation using Predicate Similarities [7.9112365100345965]
Scene Graphs are widely applied in computer vision as a graphical representation of relationships between objects shown in images.
These applications have not yet reached a practical stage of development owing to biased training caused by long-tailed predicate distributions.
We propose a new classification scheme that branches the process to several fine-grained classifiers for similar predicate groups.
The results of extensive experiments on the Visual Genome dataset show that the combination of our method and an existing debiasing approach greatly improves performance on tail predicates in challenging SGCls/SGDet tasks.
arXiv Detail & Related papers (2022-10-03T13:28:01Z)
- CAME: Context-aware Mixture-of-Experts for Unbiased Scene Graph Generation [10.724516317292926]
We present a simple yet effective method called Context-Aware Mixture-of-Experts (CAME) to improve model diversity and alleviate bias in scene graph generation.
We have conducted extensive experiments on three tasks on the Visual Genome dataset to show that CAME achieves superior performance over previous methods.
arXiv Detail & Related papers (2022-08-15T10:39:55Z)
- Stacked Hybrid-Attention and Group Collaborative Learning for Unbiased Scene Graph Generation [62.96628432641806]
Scene Graph Generation aims to first encode the visual contents within the given image and then parse them into a compact summary graph.
We first present a novel Stacked Hybrid-Attention network, which facilitates the intra-modal refinement as well as the inter-modal interaction.
We then devise an innovative Group Collaborative Learning strategy to optimize the decoder.
arXiv Detail & Related papers (2022-03-18T09:14:13Z)
- Relieving Long-tailed Instance Segmentation via Pairwise Class Balance [85.53585498649252]
Long-tailed instance segmentation is a challenging task due to the extreme imbalance of training samples among classes.
It causes severe biases of the head classes (with majority samples) against the tailed ones.
We propose a novel Pairwise Class Balance (PCB) method, built upon a confusion matrix which is updated during training to accumulate the ongoing prediction preferences.
arXiv Detail & Related papers (2022-01-08T07:48:36Z)
- Learning from Heterogeneous Data Based on Social Interactions over Graphs [58.34060409467834]
This work proposes a decentralized architecture, where individual agents aim at solving a classification problem while observing streaming features of different dimensions.
We show that the strategy enables the agents to learn consistently under this highly-heterogeneous setting.
arXiv Detail & Related papers (2021-12-17T12:47:18Z)
- Dense Contrastive Visual-Linguistic Pretraining [53.61233531733243]
Several multimodal representation learning approaches have been proposed that jointly represent image and text.
These approaches achieve superior performance by capturing high-level semantic information from large-scale multimodal pretraining.
We propose unbiased Dense Contrastive Visual-Linguistic Pretraining to replace the region regression and classification with cross-modality region contrastive learning.
arXiv Detail & Related papers (2021-09-24T07:20:13Z)
- PCPL: Predicate-Correlation Perception Learning for Unbiased Scene Graph Generation [58.98802062945709]
We propose a novel Predicate-Correlation Perception Learning scheme to adaptively seek out appropriate loss weights.
Our PCPL framework is further equipped with a graph encoder module to better extract context features.
arXiv Detail & Related papers (2020-09-02T08:30:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.