Knowledge-Guided Object Discovery with Acquired Deep Impressions
- URL: http://arxiv.org/abs/2103.10611v1
- Date: Fri, 19 Mar 2021 03:17:57 GMT
- Title: Knowledge-Guided Object Discovery with Acquired Deep Impressions
- Authors: Jinyang Yuan, Bin Li, Xiangyang Xue
- Abstract summary: We present a framework called Acquired Deep Impressions (ADI) which continuously learns knowledge of objects as "impressions" for compositional scene understanding.
ADI first acquires knowledge from scene images containing a single object in a supervised manner.
It then learns from novel multi-object scene images which may contain objects that have not been seen before.
- Score: 41.07379505694274
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a framework called Acquired Deep Impressions (ADI) which
continuously learns knowledge of objects as "impressions" for compositional
scene understanding. In this framework, the model first acquires knowledge from
scene images containing a single object in a supervised manner, and then
continues to learn from novel multi-object scene images which may contain
objects that have not been seen before, without any further supervision, under
the guidance of the learned knowledge, as humans do. By memorizing impressions
of objects into parameters of neural networks and applying the generative
replay strategy, the learned knowledge can be reused to generate images with
pseudo-annotations and in turn assist the learning of novel scenes. The
proposed ADI framework focuses on the acquisition and utilization of knowledge,
and is complementary to existing deep generative models proposed for
compositional scene representation. We adapt a base model to make it fall
within the ADI framework and conduct experiments on two types of datasets.
Empirical results suggest that the proposed framework is able to effectively
utilize the acquired impressions and improve the scene decomposition
performance.
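To make the replay mechanism described above concrete, the following toy sketch (PyTorch) illustrates the general idea: a model trained with supervision on single-object scenes generates images together with pseudo-annotations, and these replayed samples provide a supervised loss term alongside the unsupervised objective on novel multi-object scenes. All class names, architectures, and losses here are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SingleObjectModel(nn.Module):
    """Toy stand-in for the model that memorizes single-object 'impressions' (phase 1)."""
    def __init__(self, latent_dim=16, image_dim=32 * 32):
        super().__init__()
        self.latent_dim = latent_dim
        self.decoder = nn.Linear(latent_dim, image_dim * 2)  # decodes image and mask logits

    @torch.no_grad()
    def replay(self, batch_size):
        """Generative replay: sample images together with pseudo-annotations (masks)."""
        z = torch.randn(batch_size, self.latent_dim)
        image_logits, mask_logits = self.decoder(z).chunk(2, dim=-1)
        return torch.sigmoid(image_logits), torch.sigmoid(mask_logits)

class SceneModel(nn.Module):
    """Toy compositional scene model trained on multi-object images (phase 2)."""
    def __init__(self, image_dim=32 * 32):
        super().__init__()
        self.net = nn.Linear(image_dim, image_dim)  # predicts a per-pixel foreground mask

    def forward(self, images):
        return torch.sigmoid(self.net(images))

impressions = SingleObjectModel()   # assumed to have been trained with supervision beforehand
scene_model = SceneModel()
optimizer = torch.optim.Adam(scene_model.parameters(), lr=1e-3)

# Phase 2: novel multi-object scenes arrive without any annotations.
unlabeled_scenes = torch.rand(8, 32 * 32)
replay_images, replay_masks = impressions.replay(batch_size=8)

# Unsupervised term on novel scenes (placeholder reconstruction-style objective) plus a
# supervised term on replayed samples, whose pseudo-annotations guide the decomposition.
unsup_loss = F.mse_loss(scene_model(unlabeled_scenes), unlabeled_scenes)
replay_loss = F.binary_cross_entropy(scene_model(replay_images), replay_masks)

(unsup_loss + replay_loss).backward()
optimizer.step()
```

The essential point, as stated in the abstract, is that the only supervision in the second phase comes from the replayed impressions, so no annotations of the multi-object images are required.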
Related papers
- Augmented Commonsense Knowledge for Remote Object Grounding [67.30864498454805]
We propose an augmented commonsense knowledge model (ACK) to leverage commonsense information as a temporal knowledge graph for improving agent navigation.
ACK consists of knowledge graph-aware cross-modal and concept aggregation modules to enhance visual representation and visual-textual data alignment.
We add a new pipeline for the commonsense-based decision-making process which leads to more accurate local action prediction.
arXiv Detail & Related papers (2024-06-03T12:12:33Z)
- Context-driven Visual Object Recognition based on Knowledge Graphs [0.8701566919381223]
We propose an approach that enhances deep learning methods by using external contextual knowledge encoded in a knowledge graph.
We conduct a series of experiments to investigate the impact of different contextual views on the learned object representations for the same image dataset.
arXiv Detail & Related papers (2022-10-20T13:09:00Z)
- Learning by Asking Questions for Knowledge-based Novel Object Recognition [64.55573343404572]
In real-world object recognition, there are numerous object classes to be recognized. Conventional image recognition based on supervised learning can only recognize object classes that exist in the training data, and thus has limited applicability in the real world.
Inspired by this, we study a framework for acquiring external knowledge through question generation that would help the model instantly recognize novel objects.
Our pipeline consists of two components: a knowledge-based object recognition module, and the Question Generator, which generates knowledge-aware questions to acquire novel knowledge.
arXiv Detail & Related papers (2022-10-12T02:51:58Z)
- Few-Shot Object Detection by Knowledge Distillation Using Bag-of-Visual-Words Representations [58.48995335728938]
We design a novel knowledge distillation framework to guide the learning of the object detector.
We first present a novel Position-Aware Bag-of-Visual-Words model for learning a representative bag of visual words.
We then perform knowledge distillation based on the fact that an image should have consistent BoVW representations in two different feature spaces; a minimal sketch of such a consistency loss appears after this list.
arXiv Detail & Related papers (2022-07-25T10:40:40Z)
- K-LITE: Learning Transferable Visual Models with External Knowledge [242.3887854728843]
K-LITE (Knowledge-augmented Language-Image Training and Evaluation) is a strategy to leverage external knowledge to build transferable visual systems.
In training, it enriches entities in natural language with WordNet and Wiktionary knowledge; a small WordNet-lookup sketch of this enrichment step appears after this list.
In evaluation, the natural language is also augmented with external knowledge and then used to reference learned visual concepts.
arXiv Detail & Related papers (2022-04-20T04:47:01Z)
- Compositional Scene Representation Learning via Reconstruction: A Survey [48.33349317481124]
Compositional scene representation learning is a task that aims to equip machines with the ability of compositional perception.
Deep neural networks have been proven to be advantageous in representation learning.
Learning via reconstruction is advantageous because it may utilize massive unlabeled data and avoid costly and laborious data annotation.
arXiv Detail & Related papers (2022-02-15T02:14:05Z)
- Ontology-based n-ball Concept Embeddings Informing Few-shot Image Classification [5.247029505708008]
ViOCE integrates symbolic knowledge in the form of $n$-ball concept embeddings into a neural network based vision architecture.
We evaluate ViOCE using the task of few-shot image classification, where it demonstrates superior performance on two standard benchmarks.
arXiv Detail & Related papers (2021-09-19T05:35:43Z)
- Learning semantic Image attributes using Image recognition and knowledge graph embeddings [0.3222802562733786]
We propose a shared learning approach to learn semantic attributes of images by combining a knowledge graph embedding model with the recognized attributes of images.
The proposed approach is a step towards bridging the gap between frameworks which learn from large amounts of data and frameworks which use a limited set of predicates to infer new knowledge.
arXiv Detail & Related papers (2020-09-12T15:18:48Z)
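For the few-shot detection entry above (Few-Shot Object Detection by Knowledge Distillation Using Bag-of-Visual-Words Representations), the consistency idea can be sketched as a loss between soft visual-word histograms computed in two feature spaces. This is a minimal sketch under simplifying assumptions (random codebooks, no position-aware weighting, arbitrary dimensions), not the paper's actual Position-Aware BoVW model.

```python
import torch
import torch.nn.functional as F

def soft_bovw(features, codebook, tau=0.1):
    """Soft Bag-of-Visual-Words histogram for one image.
    features: (N, D) local features; codebook: (K, D) visual words."""
    sims = F.normalize(features, dim=-1) @ F.normalize(codebook, dim=-1).t()
    assignments = F.softmax(sims / tau, dim=-1)   # (N, K) soft word assignments
    return assignments.mean(dim=0)                # (K,) normalized histogram

def bovw_consistency_loss(feats_a, feats_b, codebook_a, codebook_b):
    """Encourage an image to have consistent BoVW representations in two feature spaces."""
    hist_a = soft_bovw(feats_a, codebook_a)
    hist_b = soft_bovw(feats_b, codebook_b)
    return (hist_a - hist_b).abs().sum()          # L1 distance between histograms

# Example with random stand-ins for two different feature spaces (hypothetical sizes).
feats_teacher = torch.randn(100, 256)     # e.g. features from a pretrained backbone
feats_student = torch.randn(100, 128)     # e.g. features from the detector being trained
codebook_teacher = torch.randn(64, 256)   # 64 visual words per space
codebook_student = torch.randn(64, 128)
loss = bovw_consistency_loss(feats_teacher, feats_student, codebook_teacher, codebook_student)
```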
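Similarly, for the K-LITE entry, the knowledge-enrichment step can be approximated by appending a dictionary gloss to each class name before it is passed to a text encoder. The snippet below uses only NLTK's WordNet interface (the Wiktionary part is omitted), and the prompt template and class names are illustrative assumptions rather than K-LITE's actual configuration.

```python
import nltk
from nltk.corpus import wordnet as wn

# One-time download of the WordNet data used below.
nltk.download("wordnet", quiet=True)
nltk.download("omw-1.4", quiet=True)

def enrich_with_wordnet(label: str) -> str:
    """Append the first WordNet gloss of the class name to a simple prompt, if one exists."""
    synsets = wn.synsets(label.replace(" ", "_"))
    prompt = f"a photo of a {label}"
    if synsets:
        prompt += f", which is {synsets[0].definition()}"
    return prompt

for label in ["aircraft carrier", "tench", "daisy"]:
    print(enrich_with_wordnet(label))
```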