Ontology-based n-ball Concept Embeddings Informing Few-shot Image
Classification
- URL: http://arxiv.org/abs/2109.09063v1
- Date: Sun, 19 Sep 2021 05:35:43 GMT
- Title: Ontology-based n-ball Concept Embeddings Informing Few-shot Image
Classification
- Authors: Mirantha Jayathilaka, Tingting Mu, Uli Sattler
- Abstract summary: ViOCE integrates symbolic knowledge in the form of $n$-ball concept embeddings into a neural network based vision architecture.
We evaluate ViOCE using the task of few-shot image classification, where it demonstrates superior performance on two standard benchmarks.
- Score: 5.247029505708008
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a novel framework named ViOCE that integrates ontology-based
background knowledge in the form of $n$-ball concept embeddings into a neural
network based vision architecture. The approach consists of two components -
converting symbolic knowledge of an ontology into continuous space by learning
n-ball embeddings that capture properties of subsumption and disjointness, and
guiding the training and inference of a vision model using the learnt
embeddings. We evaluate ViOCE using the task of few-shot image classification,
where it demonstrates superior performance on two standard benchmarks.
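To make the two geometric constraints concrete, below is a minimal PyTorch sketch of hinge-style losses for subsumption and disjointness between $n$-balls, assuming each concept is parameterised by a centre vector and a radius; the exact loss formulation used in ViOCE is not reproduced here.

```python
import torch

def subsumption_loss(c_child, r_child, c_parent, r_parent, margin=0.0):
    """Hinge loss encouraging ball(child) to lie inside ball(parent):
    ||c_child - c_parent|| + r_child <= r_parent."""
    dist = torch.norm(c_child - c_parent, dim=-1)
    return torch.relu(dist + r_child - r_parent + margin).mean()

def disjointness_loss(c_a, r_a, c_b, r_b, margin=0.0):
    """Hinge loss encouraging ball(A) and ball(B) not to overlap:
    ||c_a - c_b|| >= r_a + r_b."""
    dist = torch.norm(c_a - c_b, dim=-1)
    return torch.relu(r_a + r_b - dist + margin).mean()

# Toy usage: centres in R^4, positive radii via softplus of learnable params.
centres = torch.nn.Parameter(torch.randn(3, 4))      # concepts: Animal, Dog, Car
radii = torch.nn.functional.softplus(torch.nn.Parameter(torch.zeros(3)))
loss = (subsumption_loss(centres[1], radii[1], centres[0], radii[0])      # Dog subsumed by Animal
        + disjointness_loss(centres[1], radii[1], centres[2], radii[2]))  # Dog disjoint with Car
loss.backward()
```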
Related papers
- Advancing Ante-Hoc Explainable Models through Generative Adversarial Networks [24.45212348373868]
This paper presents a novel concept learning framework for enhancing model interpretability and performance in visual classification tasks.
Our approach appends an unsupervised explanation generator to the primary classifier network and makes use of adversarial training.
This work presents a significant step towards building inherently interpretable deep vision models with task-aligned concept representations.
arXiv Detail & Related papers (2024-01-09T16:16:16Z)
- ConceptBed: Evaluating Concept Learning Abilities of Text-to-Image Diffusion Models [79.10890337599166]
We introduce ConceptBed, a large-scale dataset that consists of 284 unique visual concepts and 33K composite text prompts.
We evaluate visual concepts that are either objects, attributes, or styles, and also evaluate four dimensions of compositionality: counting, attributes, relations, and actions.
Our results point to a trade-off between learning the concepts and preserving the compositionality which existing approaches struggle to overcome.
arXiv Detail & Related papers (2023-06-07T18:00:38Z)
- Impact of a DCT-driven Loss in Attention-based Knowledge-Distillation for Scene Recognition [64.29650787243443]
We propose and analyse the use of a 2D frequency transform of the activation maps before transferring them.
This strategy enhances knowledge transferability in tasks such as scene recognition.
We publicly release the training and evaluation framework used along this paper at http://www.vpu.eps.uam.es/publications/DCTBasedKDForSceneRecognition.
arXiv Detail & Related papers (2022-05-04T11:05:18Z)
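As an illustration of the idea (not the paper's exact loss), the sketch below computes an orthonormal 2D DCT of student and teacher activation maps and penalises their distance in the frequency domain; all names and shapes are assumptions for the example.

```python
import math
import torch

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = torch.arange(n).unsqueeze(1).float()
    i = torch.arange(n).unsqueeze(0).float()
    d = math.sqrt(2.0 / n) * torch.cos(math.pi * (2 * i + 1) * k / (2 * n))
    d[0] /= math.sqrt(2.0)
    return d

def dct2(x):
    """2D DCT over the last two dimensions of x (..., H, W)."""
    dh = dct_matrix(x.shape[-2]).to(x)
    dw = dct_matrix(x.shape[-1]).to(x)
    return dh @ x @ dw.t()

def dct_distillation_loss(student_act, teacher_act):
    """MSE between frequency-domain representations of activation maps."""
    return torch.nn.functional.mse_loss(dct2(student_act), dct2(teacher_act))

# Toy usage: (batch, channels, H, W) activation maps of matching shape.
s = torch.randn(2, 8, 14, 14, requires_grad=True)
t = torch.randn(2, 8, 14, 14)
dct_distillation_loss(s, t).backward()
```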
- K-LITE: Learning Transferable Visual Models with External Knowledge [242.3887854728843]
K-LITE (Knowledge-augmented Language-Image Training and Evaluation) is a strategy to leverage external knowledge to build transferable visual systems.
In training, it enriches entities in natural language with WordNet and Wiktionary knowledge.
In evaluation, the natural language is also augmented with external knowledge and then used to reference learned visual concepts.
arXiv Detail & Related papers (2022-04-20T04:47:01Z)
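The knowledge-augmentation step can be illustrated by appending a WordNet gloss to a class name before it reaches the text encoder; the helper below is a hedged sketch using NLTK's WordNet interface, not K-LITE's actual prompt construction.

```python
# Requires: pip install nltk, then nltk.download("wordnet") once.
from nltk.corpus import wordnet as wn

def enrich_with_wordnet(class_name: str) -> str:
    """Append the first WordNet gloss of a class name to the prompt text."""
    synsets = wn.synsets(class_name.replace(" ", "_"))
    if not synsets:
        return f"a photo of a {class_name}"
    gloss = synsets[0].definition()
    return f"a photo of a {class_name}, which is {gloss}"

print(enrich_with_wordnet("beaver"))
# e.g. "a photo of a beaver, which is large semiaquatic rodent ..."
```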
- Zero-Shot Compositional Concept Learning [10.108857371774977]
We propose an episode-based cross-attention (EpiCA) network which combines the merits of a cross-attention mechanism and an episode-based training strategy.
EpiCA relies on cross-attention to correlate concept and visual information and uses a gated pooling layer to build contextualized representations for both images and concepts.
Experiments on two widely-used zero-shot compositional learning (ZSCL) benchmarks have demonstrated the effectiveness of the model.
arXiv Detail & Related papers (2021-07-12T03:31:56Z)
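A minimal sketch of cross-attention from concept tokens to image-region features followed by a gated pooling step is given below; the shapes and module choices are assumptions for illustration and differ from EpiCA's actual architecture.

```python
import torch
import torch.nn as nn

class CrossAttentionGatedPool(nn.Module):
    """Concept tokens attend over image region features, then a learned
    gate weights the attended tokens before pooling to one vector."""
    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.gate = nn.Linear(dim, 1)

    def forward(self, concept_tokens, image_regions):
        # concept_tokens: (B, Nc, D), image_regions: (B, Nr, D)
        attended, _ = self.cross_attn(concept_tokens, image_regions, image_regions)
        weights = torch.sigmoid(self.gate(attended))                 # (B, Nc, 1)
        pooled = (weights * attended).sum(1) / weights.sum(1).clamp_min(1e-6)
        return pooled                                                # (B, D)

# Toy usage with random features.
model = CrossAttentionGatedPool(dim=64)
print(model(torch.randn(2, 5, 64), torch.randn(2, 36, 64)).shape)    # torch.Size([2, 64])
```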
- Knowledge-Guided Object Discovery with Acquired Deep Impressions [41.07379505694274]
We present a framework called Acquired Deep Impressions (ADI) which continuously learns knowledge of objects as "impressions".
ADI first acquires knowledge from scene images containing a single object in a supervised manner.
It then learns from novel multi-object scene images which may contain objects that have not been seen before.
arXiv Detail & Related papers (2021-03-19T03:17:57Z)
- Universal Representation Learning of Knowledge Bases by Jointly Embedding Instances and Ontological Concepts [39.99087114075884]
We propose a novel two-view KG embedding model, JOIE, with the goal to produce better knowledge embedding.
JOIE employs cross-view and intra-view modeling that learn on multiple facets of the knowledge base.
Our model is trained on large-scale knowledge bases consisting of a massive number of instances and their corresponding ontological concepts, connected via a (small) set of cross-view links.
arXiv Detail & Related papers (2021-03-15T03:24:37Z)
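The two-view idea can be sketched as a TransE-style intra-view loss within the instance graph plus a cross-view loss that pulls a projected instance embedding toward its concept's embedding; JOIE's actual objectives are more elaborate than this toy version, and all names below are illustrative.

```python
import torch
import torch.nn as nn

class TwoViewKGE(nn.Module):
    """Toy two-view embedding: an instance view and an ontology view,
    linked by type assertions (instance -> concept)."""
    def __init__(self, n_inst, n_conc, n_rel, dim=64):
        super().__init__()
        self.inst = nn.Embedding(n_inst, dim)
        self.conc = nn.Embedding(n_conc, dim)
        self.rel = nn.Embedding(n_rel, dim)
        self.proj = nn.Linear(dim, dim)   # maps instance space -> concept space

    def intra_view_loss(self, h, r, t):
        """TransE-style loss within the instance view: h + r should be close to t."""
        return (self.inst(h) + self.rel(r) - self.inst(t)).norm(dim=-1).mean()

    def cross_view_loss(self, inst_ids, conc_ids):
        """Pull a projected instance embedding toward its concept embedding."""
        return (self.proj(self.inst(inst_ids)) - self.conc(conc_ids)).norm(dim=-1).mean()

# Toy usage.
model = TwoViewKGE(n_inst=100, n_conc=10, n_rel=5)
h, r, t = torch.tensor([0, 1]), torch.tensor([2, 3]), torch.tensor([4, 5])
loss = model.intra_view_loss(h, r, t) + model.cross_view_loss(torch.tensor([0]), torch.tensor([3]))
loss.backward()
```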
- Interpretable Visual Reasoning via Induced Symbolic Space [75.95241948390472]
We study the problem of concept induction in visual reasoning, i.e., identifying concepts and their hierarchical relationships from question-answer pairs associated with images.
We first design a new framework named object-centric compositional attention model (OCCAM) to perform the visual reasoning task with object-level visual features.
We then come up with a method to induce concepts of objects and relations using clues from the attention patterns between objects' visual features and question words.
arXiv Detail & Related papers (2020-11-23T18:21:49Z)
- Visual-Semantic Embedding Model Informed by Structured Knowledge [3.2734466030053175]
We propose a novel approach to improve a visual-semantic embedding model by incorporating concept representations captured from an external structured knowledge base.
We investigate its performance on image classification under both standard and zero-shot settings.
arXiv Detail & Related papers (2020-09-21T17:04:32Z)
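One simple way to realise this is to project visual features into the space of externally derived concept embeddings and classify by cosine similarity; the sketch below uses made-up shapes and is not the paper's exact training objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConceptAlignedClassifier(nn.Module):
    """Project visual features into the concept-embedding space and
    score classes by cosine similarity to (frozen) concept vectors."""
    def __init__(self, feat_dim, concept_embeddings, temperature=0.05):
        super().__init__()
        self.proj = nn.Linear(feat_dim, concept_embeddings.shape[1])
        self.register_buffer("concepts", F.normalize(concept_embeddings, dim=-1))
        self.temperature = temperature

    def forward(self, visual_feats):
        z = F.normalize(self.proj(visual_feats), dim=-1)     # (B, D)
        return z @ self.concepts.t() / self.temperature      # (B, num_classes) logits

# Toy usage: 10 classes with 32-d concept embeddings, 512-d visual features.
clf = ConceptAlignedClassifier(512, torch.randn(10, 32))
logits = clf(torch.randn(4, 512))
F.cross_entropy(logits, torch.tensor([0, 3, 9, 2])).backward()
```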
- Visual Concept Reasoning Networks [93.99840807973546]
A split-transform-merge strategy has been broadly used as an architectural constraint in convolutional neural networks for visual recognition tasks.
We propose to exploit this strategy and combine it with our Visual Concept Reasoning Networks (VCRNet) to enable reasoning between high-level visual concepts.
Our proposed model, VCRNet, consistently improves performance while increasing the number of parameters by less than 1%.
arXiv Detail & Related papers (2020-08-26T20:02:40Z)
- Concept Learners for Few-Shot Learning [76.08585517480807]
We propose COMET, a meta-learning method that improves generalization ability by learning to learn along human-interpretable concept dimensions.
We evaluate our model on few-shot tasks from diverse domains, including fine-grained image classification, document categorization and cell type annotation.
arXiv Detail & Related papers (2020-07-14T22:04:17Z)
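A rough sketch of concept-wise prototypes in a few-shot episode is given below, assuming each concept corresponds to a fixed mask over feature dimensions; COMET's learned concept dimensions and aggregation weights are omitted, and the helper names are hypothetical.

```python
import torch

def concept_prototype_logits(support, support_labels, query, concept_masks, n_way):
    """Per-concept prototypes from masked features; a query is scored by the
    summed negative distance to each class's concept prototypes."""
    logits = torch.zeros(query.shape[0], n_way)
    for mask in concept_masks:                      # mask: (D,) boolean
        s, q = support[:, mask], query[:, mask]
        protos = torch.stack([s[support_labels == c].mean(0) for c in range(n_way)])
        logits += -torch.cdist(q, protos)           # closer prototype -> higher score
    return logits

# Toy 3-way episode with 64-d features and 4 random concept masks.
support, query = torch.randn(9, 64), torch.randn(6, 64)
labels = torch.tensor([0, 0, 0, 1, 1, 1, 2, 2, 2])
masks = [torch.rand(64) > 0.5 for _ in range(4)]
print(concept_prototype_logits(support, labels, query, masks, n_way=3).shape)  # (6, 3)
```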
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the summarized content (including all information) and is not responsible for any consequences.