RA-SGG: Retrieval-Augmented Scene Graph Generation Framework via Multi-Prototype Learning
- URL: http://arxiv.org/abs/2412.12788v1
- Date: Tue, 17 Dec 2024 10:47:13 GMT
- Title: RA-SGG: Retrieval-Augmented Scene Graph Generation Framework via Multi-Prototype Learning
- Authors: Kanghoon Yoon, Kibum Kim, Jaehyung Jeon, Yeonjun In, Donghyun Kim, Chanyoung Park,
- Abstract summary: Scene Graph Generation (SGG) research has suffered from two fundamental challenges: the long-tailed predicate distribution and semantic ambiguity between predicates.
We propose Retrieval-Augmented Scene Graph Generation (RA-SGG), which identifies potential instances to be multi-labeled and enriches the single-label with multi-labels that are semantically similar to the original label.
RA-SGG effectively alleviates the issue of biased prediction caused by the long-tailed distribution and semantic ambiguity of predicates.
- Score: 24.52282123604646
- License:
- Abstract: Scene Graph Generation (SGG) research has suffered from two fundamental challenges: the long-tailed predicate distribution and semantic ambiguity between predicates. These challenges lead to a bias towards head predicates in SGG models, favoring dominant general predicates while overlooking fine-grained predicates. In this paper, we address the challenges of SGG by framing it as multi-label classification problem with partial annotation, where relevant labels of fine-grained predicates are missing. Under the new frame, we propose Retrieval-Augmented Scene Graph Generation (RA-SGG), which identifies potential instances to be multi-labeled and enriches the single-label with multi-labels that are semantically similar to the original label by retrieving relevant samples from our established memory bank. Based on augmented relations (i.e., discovered multi-labels), we apply multi-prototype learning to train our SGG model. Several comprehensive experiments have demonstrated that RA-SGG outperforms state-of-the-art baselines by up to 3.6% on VG and 5.9% on GQA, particularly in terms of F@K, showing that RA-SGG effectively alleviates the issue of biased prediction caused by the long-tailed distribution and semantic ambiguity of predicates.
Related papers
- Fine-Grained Scene Graph Generation via Sample-Level Bias Prediction [12.319354506916547]
We propose a novel Sample-Level Bias Prediction (SBP) method for fine-grained Scene Graph Generation (SGG)
Firstly, we train a classic SGG model and construct a correction bias set.
Then, we devise a Bias-Oriented Generative Adversarial Network (BGAN) that learns to predict the constructed correction biases.
arXiv Detail & Related papers (2024-07-27T13:49:06Z) - Leveraging Predicate and Triplet Learning for Scene Graph Generation [31.09787444957997]
Scene Graph Generation (SGG) aims to identify entities and predict the relationship triplets.
We propose a Dual-granularity Relation Modeling (DRM) network to leverage fine-grained triplet cues besides the coarse-grained predicate ones.
Our method establishes new state-of-the-art performance on Visual Genome, Open Image, and GQA datasets.
arXiv Detail & Related papers (2024-06-04T07:23:41Z) - Adaptive Self-training Framework for Fine-grained Scene Graph Generation [29.37568710952893]
Scene graph generation (SGG) models have suffered from inherent problems regarding the benchmark datasets.
We introduce a Self-Training framework for SGG (ST-SGG) that assigns pseudo-labels for unannotated triplets.
Our experiments verify the effectiveness of ST-SGG on various SGG models.
arXiv Detail & Related papers (2024-01-18T08:10:34Z) - ALF: Adaptive Label Finetuning for Scene Graph Generation [116.59868289196157]
Scene Graph Generation endeavors to predict the relationships between subjects and objects in a given image.
Long-tail distribution of relations often leads to biased prediction on coarse labels, presenting a substantial hurdle in SGG.
We introduce one-stage data transfer pipeline in SGG, termed Adaptive Label Finetuning (ALF), which eliminates the need for extra retraining sessions.
ALF achieves a 16% improvement in mR@100 compared to the typical SGG method Motif, with only a 6% increase in calculation costs compared to the state-of-the-art method IETrans.
arXiv Detail & Related papers (2023-12-29T01:37:27Z) - Compositional Feature Augmentation for Unbiased Scene Graph Generation [28.905732042942066]
Scene Graph Generation (SGG) aims to detect all the visual relation triplets sub, pred, obj> in a given image.
Due to the ubiquitous long-tailed predicate distributions, today's SGG models are still easily biased to the head predicates.
We propose a novel Compositional Feature Augmentation (CFA) strategy, which is the first unbiased SGG work to mitigate the bias issue.
arXiv Detail & Related papers (2023-08-13T08:02:14Z) - Label Semantic Knowledge Distillation for Unbiased Scene Graph
Generation [34.20922091969159]
We propose a novel model-agnostic Label Semantic Knowledge Distillation (LS-KD) for unbiased Scene Graph Generation (SGG)
LS-KD dynamically generates a soft label for each subject-object instance by fusing a predicted Label Semantic Distribution (LSD) with its original one-hot target label.
arXiv Detail & Related papers (2022-08-07T16:19:19Z) - NICEST: Noisy Label Correction and Training for Robust Scene Graph Generation [65.78472854070316]
We propose a novel NoIsy label CorrEction and Sample Training strategy for SGG: NICEST.
NICE first detects noisy samples and then reassigns them more high-quality soft predicate labels.
NICEST can be seamlessly incorporated into any SGG architecture to boost its performance on different predicate categories.
arXiv Detail & Related papers (2022-07-27T06:25:47Z) - Adaptive Fine-Grained Predicates Learning for Scene Graph Generation [122.4588401267544]
General Scene Graph Generation (SGG) models tend to predict head predicates and re-balancing strategies prefer tail categories.
We propose an Adaptive Fine-Grained Predicates Learning (FGPL-A) which aims at differentiating hard-to-distinguish predicates for SGG.
Our proposed model-agnostic strategy significantly boosts performance of benchmark models on VG-SGG and GQA-SGG datasets by up to 175% and 76% on Mean Recall@100, achieving new state-of-the-art performance.
arXiv Detail & Related papers (2022-07-11T03:37:57Z) - Fine-Grained Predicates Learning for Scene Graph Generation [155.48614435437355]
Fine-Grained Predicates Learning aims at differentiating among hard-to-distinguish predicates for Scene Graph Generation task.
We introduce a Predicate Lattice that helps SGG models to figure out fine-grained predicate pairs.
We then propose a Category Discriminating Loss and an Entity Discriminating Loss, which both contribute to distinguishing fine-grained predicates.
arXiv Detail & Related papers (2022-04-06T06:20:09Z) - Hierarchical Memory Learning for Fine-Grained Scene Graph Generation [49.39355372599507]
This paper proposes a novel Hierarchical Memory Learning (HML) framework to learn the model from simple to complex.
After the autonomous partition of coarse and fine predicates, the model is first trained on the coarse predicates and then learns the fine predicates.
arXiv Detail & Related papers (2022-03-14T08:01:14Z) - PCPL: Predicate-Correlation Perception Learning for Unbiased Scene Graph
Generation [58.98802062945709]
We propose a novel Predicate-Correlation Perception Learning scheme to adaptively seek out appropriate loss weights.
Our PCPL framework is further equipped with a graph encoder module to better extract context features.
arXiv Detail & Related papers (2020-09-02T08:30:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.