Context-driven Visual Object Recognition based on Knowledge Graphs
- URL: http://arxiv.org/abs/2210.11233v1
- Date: Thu, 20 Oct 2022 13:09:00 GMT
- Title: Context-driven Visual Object Recognition based on Knowledge Graphs
- Authors: Sebastian Monka, Lavdim Halilaj, Achim Rettinger
- Abstract summary: We propose an approach that enhances deep learning methods by using external contextual knowledge encoded in a knowledge graph.
We conduct a series of experiments to investigate the impact of different contextual views on the learned object representations for the same image dataset.
- Score: 0.8701566919381223
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current deep learning methods for object recognition are purely data-driven and require a large number of training samples to achieve good results. Because they depend solely on image data, these methods tend to fail when confronted with new environments in which even small deviations occur. Human perception, however, has proven to be significantly more robust to such distribution shifts. It is assumed that humans' ability to deal with unknown scenarios is based on the extensive incorporation of contextual knowledge, which can be drawn either from object co-occurrences in a scene or from memory of prior experience. In accordance with the human visual cortex, which uses context to form different object representations of a seen image, we propose an approach that enhances deep learning methods with external contextual knowledge encoded in a knowledge graph. To this end, we extract different contextual views from a generic knowledge graph, transform each view into vector space, and infuse it into a DNN. We conduct a series of experiments to investigate the impact of different contextual views on the learned object representations for the same image dataset. The experimental results provide evidence that the contextual views influence the image representations in the DNN differently and therefore lead to different predictions for the same images. We also show that context helps to strengthen the robustness of object recognition models on out-of-distribution images, which commonly occur in transfer learning tasks and real-world scenarios.
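The pipeline named in the abstract (extract a contextual view from the knowledge graph, embed it into vector space, infuse it into a DNN) can be pictured with a minimal PyTorch sketch. Everything below is an illustrative assumption, not the authors' implementation: the class name `KnowledgeInfusedNet`, the projection head, the temperature, and the similarity-based loss are hypothetical, and `kg_emb` stands in for per-class vectors produced from one contextual view by some knowledge-graph embedding method.

```python
# Minimal sketch of knowledge infusion (illustrative, not the paper's code).
# Assumes `kg_emb` holds one context vector per class, obtained by embedding
# a contextual view of the knowledge graph with any KG embedding method.
import torch
import torch.nn.functional as F
from torch import nn

class KnowledgeInfusedNet(nn.Module):  # hypothetical name
    def __init__(self, backbone: nn.Module, feat_dim: int, kg_emb: torch.Tensor):
        super().__init__()
        self.backbone = backbone                       # image encoder -> feat_dim
        self.project = nn.Linear(feat_dim, kg_emb.shape[1])
        self.register_buffer("kg_emb", F.normalize(kg_emb, dim=1))

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        z = F.normalize(self.project(self.backbone(images)), dim=1)
        # Logits are similarities to each class's context vector, so the chosen
        # contextual view shapes which images the DNN represents as similar.
        return z @ self.kg_emb.t() / 0.1               # temperature 0.1 (assumed)

def train_step(model, images, labels, optimizer):
    loss = F.cross_entropy(model(images), labels)      # standard cross-entropy
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()
```

Under this sketch, swapping in a different contextual view only changes `kg_emb`, yet the image encoder is then pulled toward a different geometry, which is consistent with the paper's finding that different views lead to different predictions for the same images.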
Related papers
- Lost in Context: The Influence of Context on Feature Attribution Methods for Object Recognition [4.674826882670651]
This study investigates how context manipulation influences both model accuracy and feature attribution.
We employ a range of feature attribution techniques to decipher the reliance of deep neural networks on context in object recognition tasks.
arXiv Detail & Related papers (2024-11-05T06:13:01Z)
- Exploiting Contextual Uncertainty of Visual Data for Efficient Training of Deep Models [0.65268245109828]
We introduce the notion of contextual diversity for active learning (CDAL).
We propose a data repair algorithm to curate contextually fair data and reduce model bias.
We are developing an image retrieval system for wildlife camera-trap images and a reliable warning system for poor-quality rural roads.
arXiv Detail & Related papers (2024-11-04T09:43:33Z)
- Leveraging Open-Vocabulary Diffusion to Camouflaged Instance Segmentation [59.78520153338878]
Text-to-image diffusion techniques have shown exceptional capability of producing high-quality images from text descriptions.
We propose a method built upon a state-of-the-art diffusion model, empowered by open-vocabulary techniques to learn multi-scale textual-visual features for camouflaged object representations.
arXiv Detail & Related papers (2023-12-29T07:59:07Z)
- Differentiable Outlier Detection Enable Robust Deep Multimodal Analysis [20.316056261749946]
We propose an end-to-end vision and language model incorporating explicit knowledge graphs.
We also introduce an interactive out-of-distribution layer using an implicit network operator.
In practice, we apply our model on several vision and language downstream tasks including visual question answering, visual reasoning, and image-text retrieval.
arXiv Detail & Related papers (2023-02-11T05:46:21Z)
- Perceptual Grouping in Contrastive Vision-Language Models [59.1542019031645]
We show how vision-language models are able to understand where objects reside within an image and group together visually related parts of the imagery.
We propose a minimal set of modifications that results in models that uniquely learn both semantic and spatial information.
arXiv Detail & Related papers (2022-10-18T17:01:35Z)
- One-shot Scene Graph Generation [130.57405850346836]
We propose Multiple Structured Knowledge (Relational Knowledge and Commonsense Knowledge) for the one-shot scene graph generation task.
Our method significantly outperforms existing state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2022-02-22T11:32:59Z)
- Object-aware Contrastive Learning for Debiased Scene Representation [74.30741492814327]
We develop a novel object-aware contrastive learning framework that localizes objects in a self-supervised manner via a contrastive class activation map (ContraCAM).
We also introduce two data augmentations based on ContraCAM, object-aware random crop and background mixup (sketched below), which reduce contextual and background biases during contrastive self-supervised learning.
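As a rough illustration of the second augmentation, here is a minimal sketch of background mixup. The exact formulation is an assumption, and `masks` is a hypothetical stand-in for the soft object masks a method like ContraCAM would produce; the paper's implementation may differ.

```python
# Hypothetical sketch of background mixup (assumed form, not the paper's code).
import torch

def background_mixup(images: torch.Tensor, masks: torch.Tensor, alpha: float = 0.5):
    """images: (B, 3, H, W); masks: (B, 1, H, W) soft object masks in [0, 1]."""
    perm = torch.randperm(images.size(0))
    # Blend each background with a randomly paired image ...
    mixed_bg = alpha * images + (1.0 - alpha) * images[perm]
    # ... while keeping the object pixels themselves unchanged.
    return masks * images + (1.0 - masks) * mixed_bg
```

Training on such views should make the contrastive objective rely less on background cues, matching the stated goal of reducing contextual and background biases.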
arXiv Detail & Related papers (2021-07-30T19:24:07Z)
- Classifying Textual Data with Pre-trained Vision Models through Transfer Learning and Data Transformations [0.0]
We propose to use knowledge acquired by benchmark vision models trained on ImageNet to help a much smaller architecture learn to classify text.
An analysis of different domains and of the transfer learning method is carried out.
The main contribution of this work is a novel approach that links large pretrained models in both language and vision to achieve state-of-the-art results; one possible text-to-image transformation is sketched below.
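The summary does not specify the data transformation, so the following is purely an assumed illustration: rendering a text's byte values as a small grayscale image so that an ImageNet-pretrained encoder can consume it. The function name and the fixed 32x32 layout are hypothetical.

```python
# Hypothetical text-to-image transformation (assumption for illustration only).
import numpy as np

def text_to_image(text: str, side: int = 32) -> np.ndarray:
    """Pack UTF-8 byte values into a (side, side) grayscale array in [0, 1]."""
    data = np.frombuffer(text.encode("utf-8"), dtype=np.uint8)
    buf = np.zeros(side * side, dtype=np.float32)
    n = min(data.size, buf.size)
    buf[:n] = data[:n] / 255.0          # truncate or zero-pad to a fixed size
    return buf.reshape(side, side)

# Example: the resulting array can be stacked to 3 channels and fed to a
# pretrained vision backbone whose final layers are fine-tuned on text labels.
img = text_to_image("knowledge graphs provide contextual views")
```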
arXiv Detail & Related papers (2021-06-23T15:53:38Z)
- Factors of Influence for Transfer Learning across Diverse Appearance Domains and Task Types [50.1843146606122]
A simple form of transfer learning is common in current state-of-the-art computer vision models.
Previous systematic studies of transfer learning have been limited and the circumstances in which it is expected to work are not fully understood.
In this paper we carry out an extensive experimental exploration of transfer learning across vastly different image domains.
arXiv Detail & Related papers (2021-03-24T16:24:20Z)
- Distilling Localization for Self-Supervised Representation Learning [82.79808902674282]
Contrastive learning has revolutionized unsupervised representation learning.
Current contrastive models are ineffective at localizing the foreground object.
We propose a data-driven approach for learning invariance to backgrounds.
arXiv Detail & Related papers (2020-04-14T16:29:42Z)