Informative Scene Graph Generation via Debiasing
- URL: http://arxiv.org/abs/2308.05286v2
- Date: Wed, 20 Nov 2024 06:35:55 GMT
- Title: Informative Scene Graph Generation via Debiasing
- Authors: Lianli Gao, Xinyu Lyu, Yuyu Guo, Yuxuan Hu, Yuan-Fang Li, Lu Xu, Heng Tao Shen, Jingkuan Song,
- Abstract summary: Scene graph generation aims to detect visual relationship triplets, (subject, predicate, object)
Due to biases in data, current models tend to predict common predicates.
We propose DB-SGG, an effective framework based on debiasing but not the conventional distribution fitting.
- Score: 124.71164256146342
- License:
- Abstract: Scene graph generation aims to detect visual relationship triplets, (subject, predicate, object). Due to biases in data, current models tend to predict common predicates, e.g. "on" and "at", instead of informative ones, e.g. "standing on" and "looking at". This tendency results in the loss of precise information and overall performance. If a model only uses "stone on road" rather than "stone blocking road" to describe an image, it may be a grave misunderstanding. We argue that this phenomenon is caused by two imbalances: semantic space level imbalance and training sample level imbalance. For this problem, we propose DB-SGG, an effective framework based on debiasing but not the conventional distribution fitting. It integrates two components: Semantic Debiasing (SD) and Balanced Predicate Learning (BPL), for these imbalances. SD utilizes a confusion matrix and a bipartite graph to construct predicate relationships. BPL adopts a random undersampling strategy and an ambiguity removing strategy to focus on informative predicates. Benefiting from the model-agnostic process, our method can be easily applied to SGG models and outperforms Transformer by 136.3%, 119.5%, and 122.6% on mR@20 at three SGG sub-tasks on the SGG-VG dataset. Our method is further verified on another complex SGG dataset (SGG-GQA) and two downstream tasks (sentence-to-graph retrieval and image captioning).
Related papers
- Fine-Grained Scene Graph Generation via Sample-Level Bias Prediction [12.319354506916547]
We propose a novel Sample-Level Bias Prediction (SBP) method for fine-grained Scene Graph Generation (SGG)
Firstly, we train a classic SGG model and construct a correction bias set.
Then, we devise a Bias-Oriented Generative Adversarial Network (BGAN) that learns to predict the constructed correction biases.
arXiv Detail & Related papers (2024-07-27T13:49:06Z) - Improving Scene Graph Generation with Relation Words' Debiasing in Vision-Language Models [6.8754535229258975]
Scene Graph Generation (SGG) provides basic language representation of visual scenes.
Part of test triplets are rare or even unseen during training, resulting in predictions.
We propose using the SGG models with pretrained vision-language models (VLMs) to enhance representation.
arXiv Detail & Related papers (2024-03-24T15:02:24Z) - TD^2-Net: Toward Denoising and Debiasing for Dynamic Scene Graph
Generation [76.24766055944554]
We introduce a network named TD$2$-Net that aims at denoising and debiasing for dynamic SGG.
TD$2$-Net outperforms the second-best competitors by 12.7 % on mean-Recall@10 for predicate classification.
arXiv Detail & Related papers (2024-01-23T04:17:42Z) - Towards Unseen Triples: Effective Text-Image-joint Learning for Scene
Graph Generation [30.79358827005448]
Scene Graph Generation (SGG) aims to structurally and comprehensively represent objects and their connections in images.
Existing SGG models often struggle to solve the long-tailed problem caused by biased datasets.
We propose a Text-Image-joint Scene Graph Generation (TISGG) model to resolve the unseen triples and improve the generalisation capability of the SGG models.
arXiv Detail & Related papers (2023-06-23T10:17:56Z) - Fine-Grained Predicates Learning for Scene Graph Generation [155.48614435437355]
Fine-Grained Predicates Learning aims at differentiating among hard-to-distinguish predicates for Scene Graph Generation task.
We introduce a Predicate Lattice that helps SGG models to figure out fine-grained predicate pairs.
We then propose a Category Discriminating Loss and an Entity Discriminating Loss, which both contribute to distinguishing fine-grained predicates.
arXiv Detail & Related papers (2022-04-06T06:20:09Z) - Fine-Grained Scene Graph Generation with Data Transfer [127.17675443137064]
Scene graph generation (SGG) aims to extract (subject, predicate, object) triplets in images.
Recent works have made a steady progress on SGG, and provide useful tools for high-level vision and language understanding.
We propose a novel Internal and External Data Transfer (IETrans) method, which can be applied in a play-and-plug fashion and expanded to large SGG with 1,807 predicate classes.
arXiv Detail & Related papers (2022-03-22T12:26:56Z) - From General to Specific: Informative Scene Graph Generation via Balance
Adjustment [113.04103371481067]
Current models are stuck in common predicates, e.g., "on" and "at", rather than informative ones.
We propose BA-SGG, a framework based on balance adjustment but not the conventional distribution fitting.
Our method achieves 14.3%, 8.0%, and 6.1% higher Mean Recall (mR) than that of the Transformer model at three scene graph generation sub-tasks on Visual Genome.
arXiv Detail & Related papers (2021-08-30T11:39:43Z) - Unbiased Scene Graph Generation from Biased Training [99.88125954889937]
We present a novel SGG framework based on causal inference but not the conventional likelihood.
We propose to draw the counterfactual causality from the trained graph to infer the effect from the bad bias.
In particular, we use Total Direct Effect (TDE) as the proposed final predicate score for unbiased SGG.
arXiv Detail & Related papers (2020-02-27T07:29:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.