Graph Density-Aware Losses for Novel Compositions in Scene Graph
Generation
- URL: http://arxiv.org/abs/2005.08230v2
- Date: Tue, 18 Aug 2020 00:47:17 GMT
- Title: Graph Density-Aware Losses for Novel Compositions in Scene Graph
Generation
- Authors: Boris Knyazev, Harm de Vries, Cătălina Cangea, Graham W. Taylor, Aaron Courville, Eugene Belilovsky
- Abstract summary: Scene graph generation aims to predict graph-structured descriptions of input images.
It is important - yet challenging - to perform well on novel (zero-shot) or rare (few-shot) compositions of objects and relationships.
We show that the standard loss used in this task is unintentionally a function of scene graph density.
We introduce a density-normalized edge loss, which provides more than a two-fold improvement in certain generalization metrics.
- Score: 27.535630110794855
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scene graph generation (SGG) aims to predict graph-structured descriptions of
input images, in the form of objects and relationships between them. This task
is becoming increasingly useful for progress at the interface of vision and
language. Here, it is important - yet challenging - to perform well on novel
(zero-shot) or rare (few-shot) compositions of objects and relationships. In
this paper, we identify two key issues that limit such generalization. Firstly,
we show that the standard loss used in this task is unintentionally a function
of scene graph density. This leads to the neglect of individual edges in large
sparse graphs during training, even though these contain diverse few-shot
examples that are important for generalization. Secondly, the frequency of
relationships can create a strong bias in this task, such that a blind model
predicting the most frequent relationship achieves good performance.
Consequently, some state-of-the-art models exploit this bias to improve
results. We show that such models can suffer the most in their ability to
generalize to rare compositions, evaluating two different models on the Visual
Genome dataset and its more recent, improved version, GQA. To address these
issues, we introduce a density-normalized edge loss, which provides more than a
two-fold improvement in certain generalization metrics. Compared to other works
in this direction, our enhancements require only a few lines of code and no
added computational cost. We also highlight the difficulty of accurately
evaluating models using existing metrics, especially on zero/few shots, and
introduce a novel weighted metric.
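The core idea in the abstract can be illustrated in a few lines: the standard edge loss averages per-edge terms over all possible node pairs, so annotated edges in large sparse graphs contribute little, while a density-normalized variant averages foreground (annotated) and background edges separately. The sketch below is illustrative only; the function names are hypothetical and the paper's actual implementation may normalize differently.

```python
# Illustrative sketch of the density issue in the edge loss (hypothetical
# names; not the paper's actual code). In a graph with n nodes there are
# n * (n - 1) directed node pairs, but only a few are annotated edges.

def standard_edge_loss(fg_losses, bg_losses, num_nodes):
    """Average per-edge losses over ALL directed node pairs.

    As num_nodes grows, the foreground terms are diluted by the
    O(n^2) denominator, so sparse graphs are effectively down-weighted.
    """
    num_pairs = num_nodes * (num_nodes - 1)
    return (sum(fg_losses) + sum(bg_losses)) / num_pairs

def density_normalized_edge_loss(fg_losses, bg_losses):
    """Average foreground and background losses separately.

    Each annotated edge keeps a constant contribution regardless of
    how sparse the graph is, which is the intuition behind a
    density-normalized loss.
    """
    fg = sum(fg_losses) / max(len(fg_losses), 1)
    bg = sum(bg_losses) / max(len(bg_losses), 1)
    return fg + bg
```

For example, a 10-node graph with only 2 annotated edges has 90 possible pairs: under the standard averaging, the two foreground terms are divided by 90, whereas the density-normalized version keeps their mean intact.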
Related papers
- Revisiting Graph Neural Networks on Graph-level Tasks: Comprehensive Experiments, Analysis, and Improvements [54.006506479865344]
We propose a unified evaluation framework for graph-level Graph Neural Networks (GNNs).
This framework provides a standardized setting to evaluate GNNs across diverse datasets.
We also propose a novel GNN model with enhanced expressivity and generalization capabilities.
arXiv Detail & Related papers (2025-01-01T08:48:53Z)
- Fine-Grained is Too Coarse: A Novel Data-Centric Approach for Efficient Scene Graph Generation [0.7851536646859476]
We introduce the task of Efficient Scene Graph Generation (SGG) that prioritizes the generation of relevant relations.
We present a new dataset, VG150-curated, based on the annotations of the popular Visual Genome dataset.
We show through a set of experiments that this dataset contains more high-quality and diverse annotations than the one usually used in SGG.
arXiv Detail & Related papers (2023-05-30T00:55:49Z)
- Rethinking Explaining Graph Neural Networks via Non-parametric Subgraph Matching [68.35685422301613]
We propose a novel non-parametric subgraph matching framework, dubbed MatchExplainer, to explore explanatory subgraphs.
It couples the target graph with other counterpart instances and identifies the most crucial joint substructure by minimizing the node corresponding-based distance.
Experiments on synthetic and real-world datasets show the effectiveness of MatchExplainer, which outperforms all state-of-the-art parametric baselines by significant margins.
arXiv Detail & Related papers (2023-01-07T05:14:45Z)
- DAGAD: Data Augmentation for Graph Anomaly Detection [57.92471847260541]
This paper devises a novel Data Augmentation-based Graph Anomaly Detection (DAGAD) framework for attributed graphs.
A series of experiments on three datasets shows that DAGAD outperforms ten state-of-the-art baseline detectors on various widely used metrics.
arXiv Detail & Related papers (2022-10-18T11:28:21Z)
- Investigating Neighborhood Modeling and Asymmetry Preservation in Digraph Representation Learning [12.406793386672208]
Digraph Hyperbolic Network (D-HYPR) learns node representations in hyperbolic space to avoid structural and semantic distortion of real-world digraphs.
Our code and data will be available.
arXiv Detail & Related papers (2021-12-22T08:50:55Z)
- Graph-LDA: Graph Structure Priors to Improve the Accuracy in Few-Shot Classification [6.037383467521294]
We introduce a generic model in which observed class signals are assumed to be degraded by two sources of noise.
We derive an optimal methodology to classify such signals.
This methodology includes a single parameter, making it particularly suitable for cases where available data is scarce.
arXiv Detail & Related papers (2021-08-23T21:55:45Z)
- A Robust and Generalized Framework for Adversarial Graph Embedding [73.37228022428663]
We propose a robust framework for adversarial graph embedding, named AGE.
AGE generates fake neighbor nodes from an implicit distribution as enhanced negative samples.
Based on this framework, we propose three models to handle three types of graph data.
arXiv Detail & Related papers (2021-05-22T07:05:48Z)
- Model-Agnostic Graph Regularization for Few-Shot Learning [60.64531995451357]
We present a comprehensive study on graph embedded few-shot learning.
We introduce a graph regularization approach that allows a deeper understanding of the impact of incorporating graph information between labels.
Our approach improves the performance of strong base learners by up to 2% on Mini-ImageNet and 6.7% on ImageNet-FS.
arXiv Detail & Related papers (2021-02-14T05:28:13Z)
- Structural Information Preserving for Graph-to-Text Generation [59.00642847499138]
The task of graph-to-text generation aims at producing sentences that preserve the meaning of input graphs.
We propose to tackle this problem by leveraging richer training signals that can guide our model for preserving input information.
Experiments on two benchmarks for graph-to-text generation show the effectiveness of our approach over a state-of-the-art baseline.
arXiv Detail & Related papers (2021-02-12T20:09:01Z)
- Generative Compositional Augmentations for Scene Graph Prediction [27.535630110794855]
Inferring objects and their relationships from an image in the form of a scene graph is useful in many applications at the intersection of vision and language.
We consider a challenging problem of compositional generalization that emerges in this task due to a long tail data distribution.
We propose and empirically study a model based on conditional generative adversarial networks (GANs) that allows us to generate visual features of perturbed scene graphs.
arXiv Detail & Related papers (2020-07-11T12:11:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.