Relational Prior Knowledge Graphs for Detection and Instance
Segmentation
- URL: http://arxiv.org/abs/2310.07573v1
- Date: Wed, 11 Oct 2023 15:15:05 GMT
- Title: Relational Prior Knowledge Graphs for Detection and Instance
Segmentation
- Authors: Osman Ülger, Yu Wang, Ysbrand Galama, Sezer Karaoglu, Theo Gevers,
Martin R. Oswald
- Abstract summary: We propose a graph transformer that enhances object proposal features using relational priors.
Experimental evaluations on COCO show that the utilization of scene graphs, augmented with relational priors, offers benefits for object detection and instance segmentation.
- Score: 24.360473253478112
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Humans have a remarkable ability to perceive and reason about the world
around them by understanding the relationships between objects. In this paper,
we investigate the effectiveness of using such relationships for object
detection and instance segmentation. To this end, we propose a Relational
Prior-based Feature Enhancement Model (RP-FEM), a graph transformer that
enhances object proposal features using relational priors. The proposed
architecture operates on top of scene graphs obtained from initial proposals
and aims to concurrently learn relational context modeling for object detection
and instance segmentation. Experimental evaluations on COCO show that the
utilization of scene graphs, augmented with relational priors, offers benefits
for object detection and instance segmentation. RP-FEM demonstrates its
capacity to suppress improbable class predictions within the image while also
preventing the model from generating duplicate predictions, leading to
improvements over the baseline model on which it is built.
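Since the abstract describes RP-FEM only at a high level, the following is a minimal sketch of the general idea rather than the authors' implementation: object proposals become nodes of a scene graph, and a relational prior (assumed here to be a class co-occurrence matrix, which is an illustrative choice, not necessarily the paper's prior) biases the pairwise attention of a graph-transformer layer that enhances the proposal features. All class and variable names are hypothetical.

```python
# Minimal sketch (not the authors' code) of relational-prior feature enhancement:
# object proposals become graph nodes, and a class co-occurrence prior biases the
# attention between node pairs. All names and shapes are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelationalPriorAttention(nn.Module):
    """One graph-transformer layer whose pairwise attention is biased by a prior."""

    def __init__(self, dim: int, num_classes: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.out = nn.Linear(dim, dim)
        # Relational prior: e.g. a class-pair co-occurrence matrix estimated from
        # training annotations (an assumption, not necessarily the paper's prior).
        self.register_buffer("prior", torch.ones(num_classes, num_classes))

    def forward(self, feats: torch.Tensor, class_probs: torch.Tensor) -> torch.Tensor:
        # feats: (N, dim) proposal features; class_probs: (N, C) initial class scores.
        q, k, v = self.q(feats), self.k(feats), self.v(feats)
        attn = q @ k.t() / feats.shape[-1] ** 0.5                 # (N, N) pairwise logits
        # Expected prior compatibility between every pair of proposals.
        pair_prior = class_probs @ self.prior @ class_probs.t()   # (N, N)
        attn = attn + torch.log(pair_prior.clamp_min(1e-6))       # bias attention by the prior
        attn = F.softmax(attn, dim=-1)
        return feats + self.out(attn @ v)                         # residual feature enhancement

# Usage on dummy proposals: 5 proposals, 256-d features, 80 COCO classes.
layer = RelationalPriorAttention(dim=256, num_classes=80)
feats = torch.randn(5, 256)
class_probs = torch.softmax(torch.randn(5, 80), dim=-1)
enhanced = layer(feats, class_probs)   # (5, 256) relationally enhanced features
```

The residual connection keeps the original proposal features intact when the prior is uninformative; a full detector would feed the enhanced features back into its classification and mask heads.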
Related papers
- A Plug-and-Play Method for Rare Human-Object Interactions Detection by Bridging Domain Gap [50.079224604394]
We present a novel model-agnostic framework called Context-Enhanced Feature Alignment (CEFA).
CEFA consists of a feature alignment module and a context enhancement module.
Our method can serve as a plug-and-play module to improve the detection performance of HOI models on rare categories.
arXiv Detail & Related papers (2024-07-31T08:42:48Z) - EGTR: Extracting Graph from Transformer for Scene Graph Generation [5.935927309154952]
Scene Graph Generation (SGG) is a challenging task of detecting objects and predicting relationships between objects.
We propose a lightweight one-stage SGG model that extracts the relation graph from the various relationships learned in the multi-head self-attention layers of the DETR decoder; a simplified sketch of this idea appears after this list.
We demonstrate the effectiveness and efficiency of our method for the Visual Genome and Open Image V6 datasets.
arXiv Detail & Related papers (2024-04-02T16:20:02Z) - Scene-Graph ViT: End-to-End Open-Vocabulary Visual Relationship Detection [14.22646492640906]
We propose a simple and highly efficient decoder-free architecture for open-vocabulary visual relationship detection.
Our model consists of a Transformer-based image encoder that represents objects as tokens and models their relationships implicitly.
Our approach achieves state-of-the-art relationship detection performance on Visual Genome and on the large-vocabulary GQA benchmark at real-time inference speeds.
arXiv Detail & Related papers (2024-03-21T10:15:57Z) - Unified Visual Relationship Detection with Vision and Language Models [89.77838890788638]
This work focuses on training a single visual relationship detector predicting over the union of label spaces from multiple datasets.
We propose UniVRD, a novel bottom-up method for Unified Visual Relationship Detection by leveraging vision and language models.
Empirical results on both human-object interaction detection and scene-graph generation demonstrate the competitive performance of our model.
arXiv Detail & Related papers (2023-03-16T00:06:28Z) - Detecting Objects with Context-Likelihood Graphs and Graph Refinement [45.70356990655389]
The goal of this paper is to detect objects by exploiting their interrelationships. Contrary to existing methods, which learn objects and relations separately, our key idea is to learn the object-relation distribution jointly.
We propose a novel way of creating a graphical representation of an image from inter-object relations and initial class predictions, which we call a context-likelihood graph.
We then learn the joint distribution with an energy-based modeling technique, which allows us to sample and refine the context-likelihood graph iteratively for a given image.
arXiv Detail & Related papers (2022-12-23T15:27:21Z) - Relation Regularized Scene Graph Generation [206.76762860019065]
Scene graph generation (SGG) is built on top of detected objects to predict object pairwise visual relations.
We propose a relation regularized network (R2-Net) which can predict whether there is a relationship between two objects.
Our R2-Net can effectively refine object labels and generate scene graphs.
arXiv Detail & Related papers (2022-02-22T11:36:49Z) - Instance-Level Relative Saliency Ranking with Graph Reasoning [126.09138829920627]
We present a novel unified model to segment salient instances and infer relative saliency rank order.
A novel loss function is also proposed to effectively train the saliency ranking branch.
Experimental results demonstrate that our proposed model is more effective than previous methods.
arXiv Detail & Related papers (2021-07-08T13:10:42Z) - Unified Graph Structured Models for Video Understanding [93.72081456202672]
We propose a message passing graph neural network that explicitly models relational-temporal relations.
We show how our method is able to more effectively model relationships between relevant entities in the scene.
arXiv Detail & Related papers (2021-03-29T14:37:35Z) - Tensor Composition Net for Visual Relationship Prediction [115.14829858763399]
We present a novel Tensor Composition Network (TCN) to predict visual relationships in images.
The key idea of our TCN is to exploit the low rank property of the visual relationship tensor.
We show our TCN's image-level visual relationship prediction provides a simple and efficient mechanism for relation-based image retrieval.
arXiv Detail & Related papers (2020-12-10T06:27:20Z) - Improving Relation Extraction by Leveraging Knowledge Graph Link
Prediction [5.820381428297218]
We propose a multi-task learning approach that improves the performance of relation extraction (RE) models by jointly training on RE and knowledge graph link prediction (KGLP) tasks.
We illustrate the generality of our approach by applying it on several existing RE models and empirically demonstrate how it helps them achieve consistent performance gains.
arXiv Detail & Related papers (2020-12-09T01:08:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.