Generative Compositional Augmentations for Scene Graph Prediction
- URL: http://arxiv.org/abs/2007.05756v3
- Date: Fri, 1 Oct 2021 15:33:30 GMT
- Title: Generative Compositional Augmentations for Scene Graph Prediction
- Authors: Boris Knyazev, Harm de Vries, Cătălina Cangea, Graham W. Taylor, Aaron Courville, Eugene Belilovsky
- Abstract summary: Inferring objects and their relationships from an image in the form of a scene graph is useful in many applications at the intersection of vision and language.
We consider a challenging problem of compositional generalization that emerges in this task due to a long tail data distribution.
We propose and empirically study a model based on conditional generative adversarial networks (GANs) that allows us to generate visual features of perturbed scene graphs.
- Score: 27.535630110794855
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Inferring objects and their relationships from an image in the form of a
scene graph is useful in many applications at the intersection of vision and
language. We consider a challenging problem of compositional generalization
that emerges in this task due to a long tail data distribution. Current scene
graph generation models are trained on a tiny fraction of the distribution
corresponding to the most frequent compositions, e.g. <cup, on, table>.
However, test images might contain zero- and few-shot compositions of objects
and relationships, e.g. <cup, on, surfboard>. Despite each of the object
categories and the predicate (e.g. 'on') being frequent in the training data,
the models often fail to properly understand such unseen or rare compositions.
To improve generalization, it is natural to attempt increasing the diversity of
the training distribution. However, in the graph domain this is non-trivial. To
that end, we propose a method to synthesize rare yet plausible scene graphs by
perturbing real ones. We then propose and empirically study a model based on
conditional generative adversarial networks (GANs) that allows us to generate
visual features of perturbed scene graphs and learn from them in a joint
fashion. When evaluated on the Visual Genome dataset, our approach yields
marginal, but consistent improvements in zero- and few-shot metrics. We analyze
the limitations of our approach indicating promising directions for future
research.
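The abstract describes two components: perturbing real scene graphs into rare but plausible ones, and a conditional GAN that synthesizes visual features for the perturbed graphs. The following is a minimal sketch of how these pieces could fit together, assuming a PyTorch setting; the names (`perturb_triplet`, `FeatureGenerator`) and dimensions are illustrative, not the authors' code, and the discriminator and joint training loop are omitted.

```python
# Minimal sketch (not the authors' implementation) of the two ideas in the
# abstract: (1) perturb a real triplet into a rare but plausible one, and
# (2) condition a generator on the perturbed graph to synthesize visual
# features that a scene graph model can then train on.
import random
import torch
import torch.nn as nn

def perturb_triplet(triplet, object_vocab):
    """Swap the subject or object for another category, e.g.
    <cup, on, table> -> <cup, on, surfboard>."""
    subj, pred, obj = triplet
    if random.random() < 0.5:
        subj = random.choice(object_vocab)
    else:
        obj = random.choice(object_vocab)
    return (subj, pred, obj)

class FeatureGenerator(nn.Module):
    """Conditional generator: (noise, triplet embedding) -> visual feature."""
    def __init__(self, vocab_size, embed_dim=128, noise_dim=64, feat_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.net = nn.Sequential(
            nn.Linear(3 * embed_dim + noise_dim, 512),
            nn.ReLU(),
            nn.Linear(512, feat_dim),
        )

    def forward(self, triplet_ids, noise):
        # triplet_ids: (B, 3) LongTensor of <subject, predicate, object> ids.
        cond = self.embed(triplet_ids).flatten(1)  # (B, 3 * embed_dim)
        return self.net(torch.cat([cond, noise], dim=1))
```

In the paper's full pipeline the generated features would be scored by a discriminator and fed to the scene graph classifier jointly with real features; the sketch above shows only the conditioning mechanism.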
Related papers
- Joint Generative Modeling of Scene Graphs and Images via Diffusion Models [37.788957749123725]
We present a novel generative task: joint scene graph - image generation.
We introduce a novel diffusion model, DiffuseSG, that jointly models the adjacency matrix along with heterogeneous node and edge attributes.
With a graph transformer being the denoiser, DiffuseSG successively denoises the scene graph representation in a continuous space and discretizes the final representation to generate the clean scene graph.
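The summary above describes denoising in a continuous space followed by discretization into a clean scene graph. Below is a hedged sketch of what such a discretization step could look like; the function name, tensor shapes, and zero threshold are illustrative assumptions, not taken from the paper.

```python
# Hedged sketch (not DiffuseSG's code) of discretizing a denoised
# continuous scene graph representation into a clean discrete graph.
import torch

def discretize_scene_graph(node_logits, adj_logits, edge_logits):
    """node_logits: (N, C_obj), adj_logits: (N, N), edge_logits: (N, N, C_rel)."""
    node_labels = node_logits.argmax(dim=-1)  # one object class per node
    adjacency = (adj_logits > 0.0).long()     # binarize the adjacency matrix
    edge_labels = edge_logits.argmax(dim=-1)  # one predicate per candidate edge
    # Predicate labels are only meaningful where adjacency == 1.
    return node_labels, adjacency, edge_labels
```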
arXiv Detail & Related papers (2024-01-02T10:10:29Z)
- Fine-Grained is Too Coarse: A Novel Data-Centric Approach for Efficient Scene Graph Generation [0.7851536646859476]
We introduce the task of Efficient Scene Graph Generation (SGG) that prioritizes the generation of relevant relations.
We present a new dataset, VG150-curated, based on the annotations of the popular Visual Genome dataset.
We show through a set of experiments that this dataset contains more high-quality and diverse annotations than the one usually used in SGG.
arXiv Detail & Related papers (2023-05-30T00:55:49Z)
- Learnable Graph Matching: A Practical Paradigm for Data Association [74.28753343714858]
We propose a general learnable graph matching method to address these issues.
Our method achieves state-of-the-art performance on several MOT datasets.
For image matching, our method outperforms state-of-the-art methods on a popular indoor dataset, ScanNet.
arXiv Detail & Related papers (2023-03-27T17:39:00Z)
- Unconditional Scene Graph Generation [72.53624470737712]
We develop a deep auto-regressive model called SceneGraphGen which can learn the probability distribution over labelled and directed graphs.
We show that the scene graphs generated by SceneGraphGen are diverse and follow the semantic patterns of real-world scenes.
arXiv Detail & Related papers (2021-08-12T17:57:16Z)
- A Robust and Generalized Framework for Adversarial Graph Embedding [73.37228022428663]
We propose a robust framework for adversarial graph embedding, named AGE.
AGE generates the fake neighbor nodes as the enhanced negative samples from the implicit distribution.
Based on this framework, we propose three models to handle three types of graph data.
arXiv Detail & Related papers (2021-05-22T07:05:48Z)
- Semi-Supervised Graph-to-Graph Translation [31.47555366566109]
Graph translation is a promising research direction and has a wide range of potential real-world applications.
One important reason is the lack of high-quality paired dataset.
We propose to construct a dual representation space, where transformation is performed explicitly to model the semantic transitions.
arXiv Detail & Related papers (2021-03-16T03:24:20Z)
- Dual ResGCN for Balanced Scene Graph Generation [106.7828712878278]
We propose a novel model, dubbed dual ResGCN, which consists of an object residual graph convolutional network and a relation residual graph convolutional network.
The two networks are complementary to each other. The former captures object-level context information, i.e., the connections among objects.
The latter is carefully designed to explicitly capture relation-level context information, i.e., the connections among relations.
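As a rough illustration of the two-branch design this summary describes (an assumption, not the authors' code; the mean-aggregation rule and all names are illustrative):

```python
# Sketch of two complementary residual GCN branches: one over object
# nodes, one over relation nodes. Adjacency matrices are dense floats.
import torch
import torch.nn as nn

class ResGCNLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(dim, dim)

    def forward(self, x, adj):
        # Mean aggregation over neighbors plus a residual connection.
        msg = (adj @ self.lin(x)) / adj.sum(-1, keepdim=True).clamp(min=1.0)
        return x + torch.relu(msg)

class DualResGCN(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.obj_branch = ResGCNLayer(dim)  # object-level context
        self.rel_branch = ResGCNLayer(dim)  # relation-level context

    def forward(self, obj_feats, obj_adj, rel_feats, rel_adj):
        return (self.obj_branch(obj_feats, obj_adj),
                self.rel_branch(rel_feats, rel_adj))
```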
arXiv Detail & Related papers (2020-11-09T07:44:17Z)
- Multilayer Clustered Graph Learning [66.94201299553336]
We use contrastive loss as a data fidelity term, in order to properly aggregate the observed layers into a representative graph.
Experiments show that our method leads to a representative graph with a clear cluster structure and is effective for solving clustering problems.
arXiv Detail & Related papers (2020-10-29T09:58:02Z)
- Graph Density-Aware Losses for Novel Compositions in Scene Graph Generation [27.535630110794855]
Scene graph generation aims to predict graph-structured descriptions of input images.
It is important - yet challenging - to perform well on novel (zero-shot) or rare (few-shot) compositions of objects and relationships.
We show that the standard loss used in this task is unintentionally a function of scene graph density.
We introduce a density-normalized edge loss, which provides more than a two-fold improvement in certain generalization metrics.
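The density-normalized edge loss is described here only at a high level. A rough sketch of one way such a normalization could work (the function and the choice of cross-entropy are assumptions, not the paper's exact formulation): rather than averaging the per-edge loss over all candidate edges, which makes the loss scale with how densely a scene graph is annotated, normalize by the number of annotated edges in each graph.

```python
# Hedged sketch of a density-normalized edge loss (not the paper's code).
import torch
import torch.nn.functional as F

def density_normalized_edge_loss(edge_logits, edge_targets):
    """edge_logits: (E, C) predicate scores over all candidate edges;
    edge_targets: (E,) LongTensor with 0 = background / no relation."""
    per_edge = F.cross_entropy(edge_logits, edge_targets, reduction="none")
    num_fg = (edge_targets > 0).sum().clamp(min=1)  # annotated (foreground) edges
    return per_edge.sum() / num_fg                  # normalize by graph density
```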
arXiv Detail & Related papers (2020-05-17T11:45:29Z)
- Bridging Knowledge Graphs to Generate Scene Graphs [49.69377653925448]
We propose a novel graph-based neural network that iteratively propagates information between the two graphs, as well as within each of them.
Our Graph Bridging Network, GB-Net, successively infers edges and nodes, allowing it to simultaneously exploit and refine the rich, heterogeneous structure of the interconnected scene and commonsense graphs.
arXiv Detail & Related papers (2020-01-07T23:35:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.