Tackling the Unannotated: Scene Graph Generation with Bias-Reduced
Models
- URL: http://arxiv.org/abs/2008.07832v1
- Date: Tue, 18 Aug 2020 10:04:51 GMT
- Title: Tackling the Unannotated: Scene Graph Generation with Bias-Reduced
Models
- Authors: Tzu-Jui Julius Wang, Selen Pehlivan, Jorma Laaksonen
- Abstract summary: State-of-the-art results are still far from satisfactory, e.g. models can obtain 31% in overall recall R@100.
We propose a novel SGG training scheme that capitalizes on self-learned knowledge.
- Score: 8.904910414410855
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Predicting a scene graph that captures visual entities and their interactions
in an image has been considered a crucial step towards full scene
comprehension. Recent scene graph generation (SGG) models have shown their
capability of capturing the most frequent relations among visual entities.
However, the state-of-the-art results are still far from satisfactory, e.g.
models can obtain 31% in overall recall R@100, whereas the likewise important
mean class-wise recall mR@100 is only around 8% on Visual Genome (VG). The
discrepancy between the R and mR results calls for shifting the focus from pursuing a
high R to pursuing a high mR with a still competitive R. We suspect that the observed
discrepancy stems from both the annotation bias and sparse annotations in VG,
in which many visual entity pairs are either not annotated at all or only with
a single relation when multiple ones could be valid. To address this particular
issue, we propose a novel SGG training scheme that capitalizes on self-learned
knowledge. It involves two relation classifiers, one offering a less biased
setting for the other to build on. The proposed scheme can be applied to most of
the existing SGG models and is straightforward to implement. We observe
significant relative improvements in mR (between +6.6% and +20.4%) and
competitive or better R (between -2.4% and 0.3%) across all standard SGG tasks.
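The gap between R and mR described in the abstract can be illustrated with a small sketch. It assumes a simplified evaluation in which hits are pooled over all images; the function name `recall_at_k`, the triple format, and the toy data are hypothetical and are not taken from the paper or from any SGG benchmark code, which differs in detail.

```python
from collections import Counter


def recall_at_k(images, k):
    """Return (R@K, mR@K), pooled over all images (a simplification).

    images: list of (gt, ranked) pairs, where gt is a set of
    (subject, predicate, object) triples and ranked is a list of
    predicted triples sorted by descending confidence.
    """
    hits, totals = Counter(), Counter()
    for gt, ranked in images:
        top_k = set(ranked[:k])
        for triple in gt:
            predicate = triple[1]
            totals[predicate] += 1
            if triple in top_k:
                hits[predicate] += 1
    # Overall recall: every ground-truth triple counts equally.
    overall = sum(hits.values()) / sum(totals.values())
    # Mean recall: every predicate class counts equally.
    per_class = [hits[p] / totals[p] for p in totals]
    mean = sum(per_class) / len(per_class)
    return overall, mean


# Toy data: "on" is frequent and always recovered; "riding" is rare and missed.
images = [
    ({("cup", "on", "table"), ("book", "on", "shelf")},
     [("cup", "on", "table"), ("book", "on", "shelf")]),
    ({("hat", "on", "man"), ("man", "riding", "horse")},
     [("hat", "on", "man"), ("man", "on", "horse")]),
]
r, mr = recall_at_k(images, k=2)
print(f"R@2 = {r:.2f}, mR@2 = {mr:.2f}")
```

In this toy case the model recovers every frequent "on" relation but misses the single "riding" relation, so overall recall stays high while the class-averaged recall drops. This is the kind of discrepancy the abstract attributes to biased and sparse annotations.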
Related papers
- Hydra-SGG: Hybrid Relation Assignment for One-stage Scene Graph Generation [57.69385990442078]
Hydra-SGG achieves state-of-the-art performance with 10.6 mR@20 and 16.0 mR@50 on VG150, while only requiring 12 training epochs.
It also sets a new state-of-the-art on Open Images V6 and GQA.
arXiv Detail & Related papers (2024-09-16T13:13:06Z)
- Fine-Grained Scene Graph Generation via Sample-Level Bias Prediction [12.319354506916547]
We propose a novel Sample-Level Bias Prediction (SBP) method for fine-grained Scene Graph Generation (SGG).
Firstly, we train a classic SGG model and construct a correction bias set.
Then, we devise a Bias-Oriented Generative Adversarial Network (BGAN) that learns to predict the constructed correction biases.
arXiv Detail & Related papers (2024-07-27T13:49:06Z)
- Identity-Seeking Self-Supervised Representation Learning for
Generalizable Person Re-identification [55.1738496692892]
Prior DG ReID methods employ limited labeled data for training due to the high cost of annotation.
We propose an Identity-seeking Self-supervised Representation learning (ISR) method.
ISR constructs positive pairs from inter-frame images by modeling the instance association as a maximum-weight bipartite matching problem.
ISR achieves 87.0% Rank-1 on Market-1501 and 56.4% Rank-1 on MSMT17, outperforming the best supervised domain-generalizable method by 5.0% and 19.5%, respectively.
arXiv Detail & Related papers (2023-08-17T09:46:27Z)
- Towards Unseen Triples: Effective Text-Image-joint Learning for Scene
Graph Generation [30.79358827005448]
Scene Graph Generation (SGG) aims to structurally and comprehensively represent objects and their connections in images.
Existing SGG models often struggle to solve the long-tailed problem caused by biased datasets.
We propose a Text-Image-joint Scene Graph Generation (TISGG) model to handle unseen triples and improve the generalisation capability of SGG models.
arXiv Detail & Related papers (2023-06-23T10:17:56Z)
- Rethinking the Evaluation of Unbiased Scene Graph Generation [31.041074897404236]
Scene Graph Generation (SGG) methods tend to predict frequent predicate categories and fail to recognize rare ones.
Recent research has focused on unbiased SGG and adopted mean Recall@K as the main evaluation metric.
We propose two complementary evaluation metrics for unbiased SGG: Independent Mean Recall (IMR) and weighted IMR (wIMR).
arXiv Detail & Related papers (2022-08-03T08:23:51Z)
- Hyper-relationship Learning Network for Scene Graph Generation [95.6796681398668]
We propose a hyper-relationship learning network, termed HLN, for scene graph generation.
We evaluate HLN on the most popular SGG dataset, i.e., the Visual Genome dataset.
For example, the proposed HLN improves the recall per relationship from 11.3% to 13.1%, and maintains the recall per image from 19.8% to 34.9%.
arXiv Detail & Related papers (2022-02-15T09:26:16Z)
- Not All Relations are Equal: Mining Informative Labels for Scene Graph
Generation [48.21846438269506]
Scene graph generation (SGG) aims to capture a wide variety of interactions between pairs of objects.
Existing SGG methods fail to acquire complex reasoning about visual and textual correlations due to various biases in training data.
We propose a novel framework for SGG training that exploits relation labels based on their informativeness.
arXiv Detail & Related papers (2021-11-26T14:34:12Z)
- From General to Specific: Informative Scene Graph Generation via Balance
Adjustment [113.04103371481067]
Current models are stuck in common predicates, e.g., "on" and "at", rather than informative ones.
We propose BA-SGG, a framework based on balance adjustment rather than conventional distribution fitting.
Our method achieves 14.3%, 8.0%, and 6.1% higher Mean Recall (mR) than that of the Transformer model at three scene graph generation sub-tasks on Visual Genome.
arXiv Detail & Related papers (2021-08-30T11:39:43Z)
- Semantic Compositional Learning for Low-shot Scene Graph Generation [122.51930904132685]
Many scene graph generation (SGG) models solely use the limited annotated relation triples for training.
We propose a novel semantic compositional learning strategy that makes it possible to construct additional, realistic relation triples.
For three recent SGG models, adding our strategy improves their performance by close to 50%, and all of them substantially exceed the current state-of-the-art.
arXiv Detail & Related papers (2021-08-19T10:13:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.