Knowledge Fused Recognition: Fusing Hierarchical Knowledge for Image Recognition through Quantitative Relativity Modeling and Deep Metric Learning
- URL: http://arxiv.org/abs/2407.20600v1
- Date: Tue, 30 Jul 2024 07:24:33 GMT
- Title: Knowledge Fused Recognition: Fusing Hierarchical Knowledge for Image Recognition through Quantitative Relativity Modeling and Deep Metric Learning
- Authors: Yunfeng Zhao, Huiyu Zhou, Fei Wu, Xifeng Wu,
- Abstract summary: We propose a novel deep metric learning based method to fuse hierarchical prior knowledge about image classes.
Existing deep metric learning incorporated image classification mainly exploits qualitative relativity between image classes.
A new triplet loss function term that exploits quantitative relativity and aligns distances in model latent space with those in knowledge space is also proposed and incorporated in the proposed dual-modality fusion method.
- Score: 18.534970504136254
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image recognition is an essential baseline for deep metric learning. Hierarchical knowledge about image classes depicts inter-class similarities or dissimilarities. Effective fusion of hierarchical knowledge about image classes to enhance image recognition remains a challenging topic to advance. In this paper, we propose a novel deep metric learning based method to effectively fuse hierarchical prior knowledge about image classes and enhance image recognition performances in an end-to-end supervised regression manner. Existing deep metric learning incorporated image classification mainly exploits qualitative relativity between image classes, i.e., whether sampled images are from the same class. A new triplet loss function term that exploits quantitative relativity and aligns distances in model latent space with those in knowledge space is also proposed and incorporated in the proposed dual-modality fusion method. Experimental results indicate that the proposed method enhanced image recognition performances and outperformed baseline and existing methods on CIFAR-10, CIFAR-100, Mini-ImageNet, and ImageNet-1K datasets.
Related papers
- Stitching Gaps: Fusing Situated Perceptual Knowledge with Vision
Transformers for High-Level Image Classification [0.1843404256219181]
We leverage situated perceptual knowledge of cultural images to enhance performance and interpretability in AC image classification.
This resource captures situated perceptual semantics gleaned from over 14,000 cultural images labeled with ACs.
We demonstrate the synergy and complementarity between KGE embeddings' situated perceptual knowledge and deep visual model's sensory-perceptual understanding for AC image classification.
arXiv Detail & Related papers (2024-02-29T16:46:48Z) - Introspective Deep Metric Learning [91.47907685364036]
We propose an introspective deep metric learning framework for uncertainty-aware comparisons of images.
The proposed IDML framework improves the performance of deep metric learning through uncertainty modeling.
arXiv Detail & Related papers (2023-09-11T16:21:13Z) - Label-Free Event-based Object Recognition via Joint Learning with Image
Reconstruction from Events [42.71383489578851]
We study label-free event-based object recognition where category labels and paired images are not available.
Our method first reconstructs images from events and performs object recognition through Contrastive Language-Image Pre-training (CLIP)
Since the category information is essential in reconstructing images, we propose category-guided attraction loss and category-agnostic repulsion loss.
arXiv Detail & Related papers (2023-08-18T08:28:17Z) - EAML: Ensemble Self-Attention-based Mutual Learning Network for Document
Image Classification [1.1470070927586016]
We design a self-attention-based fusion module that serves as a block in our ensemble trainable network.
It allows to simultaneously learn the discriminant features of image and text modalities throughout the training stage.
This is the first time to leverage a mutual learning approach along with a self-attention-based fusion module to perform document image classification.
arXiv Detail & Related papers (2023-05-11T16:05:03Z) - Introspective Deep Metric Learning for Image Retrieval [80.29866561553483]
We argue that a good similarity model should consider the semantic discrepancies with caution to better deal with ambiguous images for more robust training.
We propose to represent an image using not only a semantic embedding but also an accompanying uncertainty embedding, which describes the semantic characteristics and ambiguity of an image, respectively.
The proposed IDML framework improves the performance of deep metric learning through uncertainty modeling and attains state-of-the-art results on the widely used CUB-200-2011, Cars196, and Stanford Online Products datasets.
arXiv Detail & Related papers (2022-05-09T17:51:44Z) - Learning Contrastive Representation for Semantic Correspondence [150.29135856909477]
We propose a multi-level contrastive learning approach for semantic matching.
We show that image-level contrastive learning is a key component to encourage the convolutional features to find correspondence between similar objects.
arXiv Detail & Related papers (2021-09-22T18:34:14Z) - Deep Relational Metric Learning [84.95793654872399]
This paper presents a deep relational metric learning framework for image clustering and retrieval.
We learn an ensemble of features that characterizes an image from different aspects to model both interclass and intraclass distributions.
Experiments on the widely-used CUB-200-2011, Cars196, and Stanford Online Products datasets demonstrate that our framework improves existing deep metric learning methods and achieves very competitive results.
arXiv Detail & Related papers (2021-08-23T09:31:18Z) - Learning semantic Image attributes using Image recognition and knowledge
graph embeddings [0.3222802562733786]
We propose a shared learning approach to learn semantic attributes of images by combining a knowledge graph embedding model with the recognized attributes of images.
The proposed approach is a step towards bridging the gap between frameworks which learn from large amounts of data and frameworks which use a limited set of predicates to infer new knowledge.
arXiv Detail & Related papers (2020-09-12T15:18:48Z) - Distilling Localization for Self-Supervised Representation Learning [82.79808902674282]
Contrastive learning has revolutionized unsupervised representation learning.
Current contrastive models are ineffective at localizing the foreground object.
We propose a data-driven approach for learning in variance to backgrounds.
arXiv Detail & Related papers (2020-04-14T16:29:42Z) - DeepEMD: Differentiable Earth Mover's Distance for Few-Shot Learning [122.51237307910878]
We develop methods for few-shot image classification from a new perspective of optimal matching between image regions.
We employ the Earth Mover's Distance (EMD) as a metric to compute a structural distance between dense image representations.
To generate the important weights of elements in the formulation, we design a cross-reference mechanism.
arXiv Detail & Related papers (2020-03-15T08:13:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.