ViRel: Unsupervised Visual Relations Discovery with Graph-level Analogy
- URL: http://arxiv.org/abs/2207.00590v1
- Date: Mon, 4 Jul 2022 16:56:45 GMT
- Title: ViRel: Unsupervised Visual Relations Discovery with Graph-level Analogy
- Authors: Daniel Zeng, Tailin Wu, Jure Leskovec
- Abstract summary: ViRel is a method for unsupervised discovery and learning of Visual Relations with graph-level analogy.
We show that our method achieves above 95% accuracy in relation classification.
It further generalizes to unseen tasks with more complicated relational structures.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Visual relations form the basis of understanding our compositional world, as
relationships between visual objects capture key information in a scene. It is
then advantageous to learn relations automatically from the data, as learning
with predefined labels cannot capture all possible relations. However, current
relation learning methods typically require supervision, and are not designed
to generalize to scenes with more complicated relational structures than those
seen during training. Here, we introduce ViRel, a method for unsupervised
discovery and learning of Visual Relations with graph-level analogy. In a
setting where scenes within a task share the same underlying relational
subgraph structure, our learning method of contrasting isomorphic and
non-isomorphic graphs discovers the relations across tasks in an unsupervised
manner. Once the relations are learned, ViRel can then retrieve the shared
relational graph structure for each task by parsing the predicted relational
structure. Using a dataset based on grid-world and the Abstract Reasoning
Corpus, we show that our method achieves above 95% accuracy in relation
classification, discovers the relation graph structure for most tasks, and
further generalizes to unseen tasks with more complicated relational
structures.
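To make the task setting concrete, here is a minimal sketch (not from the paper) of the graph-level analogy ViRel relies on: two scenes belong to the same task when their relation graphs are isomorphic under edge labels. The relation names and toy graphs are invented for illustration; ViRel learns the relations themselves, whereas this snippet assumes them as given labels.

```python
import networkx as nx
from networkx.algorithms.isomorphism import categorical_edge_match

def relation_graph(edges):
    """Build a graph whose edges carry a discrete relation label."""
    g = nx.Graph()
    for a, b, rel in edges:
        g.add_edge(a, b, rel=rel)
    return g

# Invented relations: scenes in the same task share graph structure.
scene_a = relation_graph([(0, 1, "same_shape"), (1, 2, "same_color")])
scene_b = relation_graph([(0, 2, "same_shape"), (2, 1, "same_color")])  # same task
scene_c = relation_graph([(0, 1, "same_shape"), (1, 2, "same_shape")])  # different task

match = categorical_edge_match("rel", default=None)
print(nx.is_isomorphic(scene_a, scene_b, edge_match=match))  # True
print(nx.is_isomorphic(scene_a, scene_c, edge_match=match))  # False
```

In ViRel's training, such isomorphic pairs would act as positives and non-isomorphic pairs as negatives for the contrastive objective over graph representations described in the abstract.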
Related papers
- Relational Deep Learning: Graph Representation Learning on Relational Databases
We introduce an end-to-end representation approach to learn on data laid out across multiple tables.
Message Passing Graph Neural Networks can then automatically learn across the graph to extract representations that leverage all input data.
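As a hedged illustration of that pipeline, the sketch below turns rows into nodes and foreign-key links into edges, which is the graph a message passing GNN would then operate on; the table and column names are invented.

```python
import networkx as nx

# Invented toy tables; orders.user_id is a foreign key into users.
users = [{"user_id": 1, "age": 34}, {"user_id": 2, "age": 27}]
orders = [{"order_id": 10, "user_id": 1, "total": 99.0},
          {"order_id": 11, "user_id": 2, "total": 15.5}]

g = nx.Graph()
for row in users:
    g.add_node(("users", row["user_id"]), **row)
for row in orders:
    g.add_node(("orders", row["order_id"]), **row)
    # The foreign-key link becomes an edge for message passing.
    g.add_edge(("orders", row["order_id"]), ("users", row["user_id"]))

print(g.number_of_nodes(), g.number_of_edges())  # 4 nodes, 2 edges
```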
arXiv Detail & Related papers (2023-12-07T18:51:41Z)
- Learning Hierarchical Relational Representations through Relational Convolutions
We introduce "relational convolutional networks", a neural architecture equipped with computational mechanisms that capture progressively more complex relational features.
A key component of this framework is a novel operation that captures relational patterns in groups of objects by convolving graphlet filters.
We present the motivation and details of the architecture, together with a set of experiments to demonstrate how relational convolutional networks can provide an effective framework for modeling relational tasks that have hierarchical structure.
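A loose sketch of the idea, under assumptions: a graphlet filter can be viewed as a template over the pairwise relations within a small group of objects, and "convolving" it means scoring every group against that template. The shapes and scoring rule below are guesses for illustration, not the paper's exact operation.

```python
from itertools import combinations
import numpy as np

rng = np.random.default_rng(0)
n_objects, rel_dim, group = 5, 4, 3
relations = rng.normal(size=(n_objects, n_objects, rel_dim))  # pairwise relations
filt = rng.normal(size=(group, group, rel_dim))               # one graphlet filter

scores = {}
for idx in combinations(range(n_objects), group):
    sub = relations[np.ix_(idx, idx)]        # relations within the group
    scores[idx] = float(np.sum(sub * filt))  # match group against the template
print(max(scores, key=scores.get))           # best-matching group of objects
```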
arXiv Detail & Related papers (2023-10-05T01:22:50Z)
- One-shot Scene Graph Generation
We propose Multiple Structured Knowledge (Relational Knowledge and Commonsense Knowledge) for the one-shot scene graph generation task.
Our method outperforms existing state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2022-02-22T11:32:59Z)
- Learning to Compose Visual Relations
We propose to represent each relation as an unnormalized density (an energy-based model).
We show that such a factorized decomposition allows the model to both generate and edit scenes with multiple sets of relations more faithfully.
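A minimal sketch of that factorization, with invented toy energies: each relation is an unnormalized density scored by its own energy function, and a scene with several relations is scored by the sum, which is the standard way energy-based models compose.

```python
import torch

def energy_left_of(a, b):   # low energy when a is left of b (x grows rightward)
    return torch.relu(a[0] - b[0])

def energy_above(a, b):     # low energy when a is above b (y grows downward)
    return torch.relu(a[1] - b[1])

a = torch.tensor([0.1, 0.2])
b = torch.tensor([0.8, 0.9])
total = energy_left_of(a, b) + energy_above(a, b)  # compose by summing energies
print(total)  # near 0: both relations are satisfied
```

Generation or editing would then push object positions along the negative gradient of the composed energy.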
arXiv Detail & Related papers (2021-11-17T18:51:29Z)
- Scenes and Surroundings: Scene Graph Generation using Relation Transformer
This work proposes a novel local-context aware architecture named relation transformer.
Our hierarchical multi-head attention-based approach efficiently captures contextual dependencies between objects and predicts their relationships.
In comparison to state-of-the-art approaches, we achieve an overall mean improvement of 4.85%.
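A hedged sketch of the general recipe, using stock PyTorch attention rather than the paper's architecture: objects attend to each other for context, and a pair classifier predicts their predicate. All sizes are placeholders.

```python
import torch
import torch.nn as nn

d, n_objects, n_predicates = 64, 6, 10
attn = nn.MultiheadAttention(embed_dim=d, num_heads=4, batch_first=True)
classify_pair = nn.Linear(2 * d, n_predicates)

objects = torch.randn(1, n_objects, d)            # detected object features
ctx, _ = attn(objects, objects, objects)          # context-aware features
pair = torch.cat([ctx[:, 0], ctx[:, 1]], dim=-1)  # subject-object pair (0, 1)
print(classify_pair(pair).shape)                  # (1, n_predicates) predicate scores
```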
arXiv Detail & Related papers (2021-07-12T14:22:20Z)
- Learning Relation Prototype from Unlabeled Texts for Long-tail Relation Extraction
We propose a general approach to learn relation prototypes from unlabeled texts.
We learn relation prototypes as an implicit factor between entities.
We conduct experiments on two publicly available datasets: New York Times and Google Distant Supervision.
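The sketch below illustrates the general notion of a relation prototype as an implicit factor between entities, assuming a simple difference-vector pair encoding and nearest-prototype assignment; both choices are stand-ins, not the paper's formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_prototypes, dim = 8, 16
prototypes = rng.normal(size=(n_prototypes, dim))  # learned relation prototypes

head, tail = rng.normal(size=dim), rng.normal(size=dim)
pair = head - tail                                 # one simple pair encoding
nearest = np.argmin(np.linalg.norm(prototypes - pair, axis=1))
print(f"pair assigned to prototype {nearest}")
```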
arXiv Detail & Related papers (2020-11-27T06:21:12Z)
- Towards Interpretable Multi-Task Learning Using Bilevel Programming
Interpretable Multi-Task Learning can be expressed as learning a sparse graph of the task relationship based on the prediction performance of the learned models.
We show empirically how the induced sparse graph improves the interpretability of the learned models and their relationship on synthetic and real data, without sacrificing generalization performance.
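For intuition only, the toy snippet below derives a sparse task graph by thresholding an invented transfer matrix; the paper instead learns this graph through bilevel programming.

```python
import numpy as np

# Invented numbers. transfer[i, j]: validation gain on task j when
# its model shares information with task i.
transfer = np.array([[0.00, 0.30, 0.02],
                     [0.25, 0.00, 0.01],
                     [0.03, 0.02, 0.00]])
adjacency = (transfer > 0.1).astype(int)  # sparsify: keep only strong relations
print(adjacency)  # tasks 0 and 1 are related; task 2 is isolated
```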
arXiv Detail & Related papers (2020-09-11T15:04:27Z)
- Relation-Guided Representation Learning
We propose a new representation learning method that explicitly models and leverages sample relations.
Our framework well preserves the relations between samples.
By seeking to embed samples into a subspace, we show that our method can address the large-scale and out-of-sample problems.
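As a generic, spectral-style sketch of relation-preserving embedding (not the paper's method): build a sample-affinity matrix and embed into a low-dimensional subspace via the graph Laplacian, so related samples stay close.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(50, 10))                      # samples
dists = np.linalg.norm(x[:, None] - x[None, :], axis=-1)
w = np.exp(-(dists / dists.mean()) ** 2)           # sample-relation (affinity) matrix
lap = np.diag(w.sum(axis=1)) - w                   # graph Laplacian
vals, vecs = np.linalg.eigh(lap)
embedding = vecs[:, 1:3]                           # 2-D subspace, relations preserved
print(embedding.shape)                             # (50, 2)
```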
arXiv Detail & Related papers (2020-07-11T10:57:45Z)
- Structured Query-Based Image Retrieval Using Scene Graphs
We present a method that uses scene graph embeddings as the basis for image retrieval.
We are able to achieve high recall even on low to medium frequency objects found in the long-tailed COCO-Stuff dataset.
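A hedged sketch of the retrieval loop with a trivial stand-in embedding: embed the query scene graph, embed each image's scene graph, and rank by similarity. The hashing trick here is purely illustrative.

```python
import numpy as np

def embed_graph(triples, dim=32):
    """Toy graph embedding: hash each (subj, pred, obj) triple into a vector."""
    v = np.zeros(dim)
    for t in triples:
        rng = np.random.default_rng(abs(hash(t)) % (2**32))
        v += rng.normal(size=dim)
    return v / (np.linalg.norm(v) + 1e-9)

database = {
    "img1": [("man", "riding", "horse"), ("horse", "on", "beach")],
    "img2": [("dog", "on", "grass")],
}
query = embed_graph([("man", "riding", "horse")])
ranked = sorted(database, key=lambda k: -query @ embed_graph(database[k]))
print(ranked)  # img1 should rank first: it shares the query triple
```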
arXiv Detail & Related papers (2020-05-13T22:40:32Z)