Unbiased Scene Graph Generation using Predicate Similarities
- URL: http://arxiv.org/abs/2210.00920v1
- Date: Mon, 3 Oct 2022 13:28:01 GMT
- Title: Unbiased Scene Graph Generation using Predicate Similarities
- Authors: Misaki Ohashi, Yusuke Matsui
- Abstract summary: Scene Graphs are widely applied in computer vision as a graphical representation of relationships between objects shown in images.
These applications have not yet reached a practical stage of development owing to biased training caused by long-tailed predicate distributions.
We propose a new classification scheme that branches the process to several fine-grained classifiers for similar predicate groups.
The results of extensive experiments on the Visual Genome dataset show that the combination of our method and an existing debiasing approach greatly improves performance on tail predicates in challenging SGCls/SGDet tasks.
- Score: 7.9112365100345965
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scene Graphs are widely applied in computer vision as a graphical
representation of relationships between objects shown in images. However, these
applications have not yet reached a practical stage of development owing to
biased training caused by long-tailed predicate distributions. In recent years,
many studies have tackled this problem. However, relatively few works have considered predicate similarities as a distinct dataset characteristic that also leads to biased predictions. Because of these similarities, infrequent predicates (e.g.,
parked on, covered in) are easily misclassified as closely-related frequent
predicates (e.g., on, in). Utilizing predicate similarities, we propose a new
classification scheme that branches the process to several fine-grained
classifiers for similar predicate groups. The classifiers aim to capture the
differences among similar predicates in detail. We also introduce the idea of transfer learning to enhance the features of predicates that lack sufficient training samples for learning descriptive representations. The
results of extensive experiments on the Visual Genome dataset show that the
combination of our method and an existing debiasing approach greatly improves
performance on tail predicates in challenging SGCls/SGDet tasks. Nonetheless,
the overall performance of the proposed approach does not reach that of the
current state of the art, so further analysis remains necessary as future work.
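The branched classification scheme described in the abstract can be illustrated with a short sketch: a coarse classifier scores groups of similar predicates, and a per-group fine-grained classifier separates the predicates within each group. The module name, the precomputed grouping, and the way coarse and fine scores are combined below are illustrative assumptions, not the authors' implementation; the transfer-learning component for under-sampled predicates is omitted.

```python
import torch
import torch.nn as nn


class GroupedPredicateHead(nn.Module):
    """Sketch of a branched predicate classifier (hypothetical, for illustration)."""

    def __init__(self, in_dim, groups):
        # groups: list of lists of global predicate indices considered similar,
        # e.g. [[idx_on, idx_parked_on], [idx_in, idx_covered_in], ...]
        super().__init__()
        self.groups = groups
        self.num_predicates = sum(len(g) for g in groups)
        # coarse classifier over groups of similar predicates
        self.group_classifier = nn.Linear(in_dim, len(groups))
        # one fine-grained classifier per group
        self.fine_classifiers = nn.ModuleList(nn.Linear(in_dim, len(g)) for g in groups)

    def forward(self, feats):
        # feats: (batch, in_dim) relation features from an SGG backbone
        group_logp = torch.log_softmax(self.group_classifier(feats), dim=-1)
        out = feats.new_full((feats.size(0), self.num_predicates), float("-inf"))
        for gi, (head, members) in enumerate(zip(self.fine_classifiers, self.groups)):
            fine_logp = torch.log_softmax(head(feats), dim=-1)
            # predicate score = coarse score of its group + fine score within the group
            out[:, members] = group_logp[:, gi:gi + 1] + fine_logp
        return out  # log-probabilities over all predicate classes
```

In a design like this, each fine-grained head only has to separate a handful of similar predicates (e.g., on vs. parked on), which is exactly the distinction the paper targets for tail classes.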
Related papers
- Ensemble Predicate Decoding for Unbiased Scene Graph Generation [40.01591739856469]
Scene Graph Generation (SGG) aims to generate a comprehensive graphical representation that captures semantic information of a given scenario.
The model's performance in predicting more fine-grained predicates is hindered by a significant predicate bias.
This paper proposes Ensemble Predicate Decoding (EPD), which employs multiple decoders to attain unbiased scene graph generation.
arXiv Detail & Related papers (2024-08-26T11:24:13Z)
- Panoptic Scene Graph Generation with Semantics-Prototype Learning [23.759498629378772]
Panoptic Scene Graph Generation (PSG) parses objects and predicts their relationships (predicate) to connect human language and visual scenes.
Different language preferences of annotators and semantic overlaps between predicates lead to biased predicate annotations.
We propose a novel framework named ADTrans to adaptively transfer biased predicate annotations to informative and unified ones.
arXiv Detail & Related papers (2023-07-28T14:04:06Z)
- Multi-Label Meta Weighting for Long-Tailed Dynamic Scene Graph Generation [55.429541407920304]
Recognizing the predicate between subject and object pairs is imbalanced and multi-label in nature.
Recent state-of-the-art methods predominantly focus on the most frequently occurring predicate classes.
We introduce a multi-label meta-learning framework to deal with the biased predicate distribution.
arXiv Detail & Related papers (2023-06-16T18:14:23Z)
- Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World [67.03968403301143]
Scene Graph Generation (SGG) aims to extract <subject, predicate, object> relationships in images for vision understanding.
Existing re-balancing strategies try to handle it via prior rules but are still confined to pre-defined conditions.
We propose a Cross-modal prediCate boosting (CaCao) framework, where a visually-prompted language model is learned to generate diverse fine-grained predicates.
arXiv Detail & Related papers (2023-03-23T13:06:38Z)
- Decomposed Prototype Learning for Few-Shot Scene Graph Generation [28.796734816086065]
We focus on a new promising task of scene graph generation (SGG): few-shot SGG (FSSGG).
FSSGG encourages models to be able to quickly transfer previous knowledge and recognize novel predicates with only a few examples.
We propose a novel Decomposed Prototype Learning (DPL) method.
arXiv Detail & Related papers (2023-03-20T04:54:26Z)
- Peer Learning for Unbiased Scene Graph Generation [16.69329808479805]
We propose a novel framework dubbed peer learning to deal with the problem of biased scene graph generation (SGG).
This framework uses predicate sampling and consensus voting (PSCV) to encourage different peers to learn from each other.
We have established a new state-of-the-art (SOTA) on the SGCls task by achieving a mean of 31.6.
arXiv Detail & Related papers (2022-12-31T07:56:35Z)
- CAME: Context-aware Mixture-of-Experts for Unbiased Scene Graph Generation [10.724516317292926]
We present a simple yet effective method called Context-Aware Mixture-of-Experts (CAME) to improve the model diversity and alleviate the biased scene graph generator.
We have conducted extensive experiments on three tasks on the Visual Genome dataset to show that CAME achieved superior performance over previous methods.
arXiv Detail & Related papers (2022-08-15T10:39:55Z)
- Adaptive Fine-Grained Predicates Learning for Scene Graph Generation [122.4588401267544]
General Scene Graph Generation (SGG) models tend to predict head predicates, while re-balancing strategies prefer tail categories.
We propose an Adaptive Fine-Grained Predicates Learning (FGPL-A) which aims at differentiating hard-to-distinguish predicates for SGG.
Our proposed model-agnostic strategy significantly boosts performance of benchmark models on VG-SGG and GQA-SGG datasets by up to 175% and 76% on Mean Recall@100, achieving new state-of-the-art performance.
arXiv Detail & Related papers (2022-07-11T03:37:57Z)
- Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations [78.12377360145078]
Contrastive self-supervised learning has outperformed supervised pretraining on many downstream tasks like segmentation and object detection.
In this paper, we first study how biases in the dataset affect existing methods.
We show that current contrastive approaches work surprisingly well across: (i) object- versus scene-centric, (ii) uniform versus long-tailed and (iii) general versus domain-specific datasets.
arXiv Detail & Related papers (2021-06-10T17:59:13Z)
- PCPL: Predicate-Correlation Perception Learning for Unbiased Scene Graph Generation [58.98802062945709]
We propose a novel Predicate-Correlation Perception Learning scheme to adaptively seek out appropriate loss weights.
Our PCPL framework is further equipped with a graph encoder module to better extract context features.
arXiv Detail & Related papers (2020-09-02T08:30:09Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)