Deep Semantic Dictionary Learning for Multi-label Image Classification
- URL: http://arxiv.org/abs/2012.12509v2
- Date: Fri, 2 Apr 2021 12:22:09 GMT
- Title: Deep Semantic Dictionary Learning for Multi-label Image Classification
- Authors: Fengtao Zhou and Sheng Huang and Yun Xing
- Abstract summary: We present an innovative approach to multi-label image classification that treats it as a dictionary learning task.
A novel end-to-end model named Deep Semantic Dictionary Learning (DSDL) is designed.
Our code and models have been released.
- Score: 3.3989824361632337
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Compared with single-label image classification, multi-label image
classification is more practical and challenging. Some recent studies attempted
to leverage the semantic information of categories for improving multi-label
image classification performance. However, these semantic-based methods only
use semantic information as a complement to the visual representation,
without exploiting it further. In this paper, we present an innovative approach
to multi-label image classification that treats it
as a dictionary learning task. A novel end-to-end model named Deep Semantic
Dictionary Learning (DSDL) is designed. In DSDL, an auto-encoder is applied to
generate the semantic dictionary from class-level semantics, and this
dictionary is then used to represent the visual features extracted by a
Convolutional Neural Network (CNN) with label embeddings. DSDL provides a
simple but elegant way to exploit and reconcile the label, semantic, and visual
spaces simultaneously by conducting dictionary learning among them.
Moreover, inspired by iterative optimization of traditional dictionary
learning, we further devise a novel training strategy named Alternately
Parameters Update Strategy (APUS) for optimizing DSDL, which alternately
optimizes the representation coefficients and the semantic dictionary in
forward and backward propagation. Extensive experimental results on three
popular benchmarks demonstrate that our method achieves promising performance
in comparison with state-of-the-art methods. Our code and models have been
released at https://github.com/ZFT-CQU/DSDL.
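The abstract gives no formulas, so the following is only an illustrative sketch of the general idea it describes: a dictionary is generated from class-level semantic vectors, a visual feature is represented as a combination of dictionary atoms weighted by label-like coefficients, and the coefficients and the dictionary are updated in turn. Everything specific here is an assumption made for illustration and is not taken from the paper: the single-layer tanh encoder, the ridge-regularized least-squares solve, the plain gradient step, and all function names and dimensions. The released repository remains the reference for the actual DSDL model and APUS procedure.

```python
import numpy as np


def encode_dictionary(semantics, W):
    """Map class-level semantic vectors (C x ds) to dictionary atoms (C x dv).

    Assumption: a single tanh layer stands in for the encoder half of an
    auto-encoder.
    """
    return np.tanh(semantics @ W)


def reconstruction_loss(D, x, y):
    """0.5 * ||D^T y - x||^2 for one visual feature x and coefficients y."""
    residual = D.T @ y - x
    return 0.5 * float(residual @ residual)


def solve_coefficients(D, x, lam=1e-2):
    """Coefficient update with the dictionary fixed.

    Closed-form ridge solution of min_y ||x - D^T y||^2 + lam * ||y||^2.
    """
    num_classes = D.shape[0]
    return np.linalg.solve(D @ D.T + lam * np.eye(num_classes), D @ x)


def dictionary_step(semantics, W, x, y, lr=1e-3):
    """Dictionary update with the coefficients fixed.

    One gradient-descent step on the encoder weights W.
    """
    D = np.tanh(semantics @ W)
    residual = D.T @ y - x                      # shape (dv,)
    grad_D = np.outer(y, residual)              # dL/dD, shape (C, dv)
    grad_W = semantics.T @ (grad_D * (1.0 - D ** 2))  # chain rule through tanh
    return W - lr * grad_W


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    semantics = rng.normal(size=(4, 8))         # 4 classes, 8-dim word vectors
    W = 0.1 * rng.normal(size=(8, 16))          # encoder weights, 16-dim visual space
    x = rng.normal(size=16)                     # stand-in for a CNN feature
    for _ in range(50):                         # alternate the two updates
        y = solve_coefficients(encode_dictionary(semantics, W), x)
        W = dictionary_step(semantics, W, x, y)
    print(reconstruction_loss(encode_dictionary(semantics, W), x, y))
```

In this toy version the coefficient vector plays the role of a soft label prediction: larger entries indicate classes whose dictionary atoms contribute more to reconstructing the visual feature.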
Related papers
- Vocabulary-free Image Classification and Semantic Segmentation [71.78089106671581]
We introduce the Vocabulary-free Image Classification (VIC) task, which aims to assign a class from an un-constrained language-induced semantic space to an input image without needing a known vocabulary.
VIC is challenging due to the vastness of the semantic space, which contains millions of concepts, including fine-grained categories.
We propose Category Search from External Databases (CaSED), a training-free method that leverages a pre-trained vision-language model and an external database.
arXiv Detail & Related papers (2024-04-16T19:27:21Z)
- Description-Enhanced Label Embedding Contrastive Learning for Text Classification [65.01077813330559]
The work adopts Self-Supervised Learning (SSL) in the model learning process and designs a novel self-supervised Relation of Relation (R2) classification task.
It proposes a Relation of Relation Learning Network (R2-Net) for text classification, in which text classification and R2 classification are treated as optimization targets.
It also uses external knowledge from WordNet to obtain multi-aspect descriptions for label semantic learning.
arXiv Detail & Related papers (2023-06-15T02:19:34Z)
- Contextual Dictionary Lookup for Knowledge Graph Completion [32.493168863565465]
Knowledge graph completion (KGC) aims to solve the incompleteness of knowledge graphs (KGs) by predicting missing links from known triples.
Most existing embedding models map each relation into a unique vector, overlooking its specific fine-grained semantics under different entities.
We present a novel method utilizing contextual dictionary lookup, enabling conventional embedding models to learn fine-grained semantics of relations in an end-to-end manner.
arXiv Detail & Related papers (2023-06-13T12:13:41Z)
- Vocabulary-free Image Classification [75.38039557783414]
We formalize a novel task, termed Vocabulary-free Image Classification (VIC).
VIC aims to assign to an input image a class that resides in an unconstrained language-induced semantic space, without the prerequisite of a known vocabulary.
CaSED is a method that exploits a pre-trained vision-language model and an external vision-language database to address VIC in a training-free manner.
arXiv Detail & Related papers (2023-06-01T17:19:43Z)
- Deep Dictionary Learning with An Intra-class Constraint [23.679645826983503]
We propose a novel deep dictionary learning model with an intra-class constraint (DDLIC) for visual classification.
Specifically, we design the intra-class compactness constraint on the intermediate representation at different levels to encourage the intra-class representations to be closer to each other.
Unlike the traditional DDL methods, during the classification stage, our DDLIC performs a layer-wise greedy optimization in a similar way to the training stage.
arXiv Detail & Related papers (2022-07-14T11:54:58Z)
- Open-Vocabulary Multi-Label Classification via Multi-modal Knowledge Transfer [55.885555581039895]
Multi-label zero-shot learning (ML-ZSL) focuses on transferring knowledge by a pre-trained textual label embedding.
We propose a novel open-vocabulary framework, named multimodal knowledge transfer (MKT) for multi-label classification.
arXiv Detail & Related papers (2022-07-05T08:32:18Z)
- Multi-Label Image Classification with Contrastive Learning [57.47567461616912]
We show that a direct application of contrastive learning hardly improves performance in multi-label cases.
We propose a novel framework for multi-label classification with contrastive learning in a fully supervised setting.
arXiv Detail & Related papers (2021-07-24T15:00:47Z)
- Multi-layered Semantic Representation Network for Multi-label Image Classification [8.17894017454724]
Multi-label image classification (MLIC) is a fundamental and practical task, which aims to assign multiple possible labels to an image.
In recent years, many deep convolutional neural network (CNN) based approaches have been proposed which model label correlations.
This paper advances this research direction by improving the modeling of label correlations and the learning of semantic representations.
arXiv Detail & Related papers (2021-06-22T08:04:22Z)
- DLDL: Dynamic Label Dictionary Learning via Hypergraph Regularization [17.34373273007931]
We propose a Dynamic Label Dictionary Learning (DLDL) algorithm to generate the soft label matrix for unlabeled data.
Specifically, we employ hypergraph manifold regularization to keep the relations among original data, transformed data, and soft labels consistent.
arXiv Detail & Related papers (2020-10-23T14:07:07Z)
- Hierarchical Image Classification using Entailment Cone Embeddings [68.82490011036263]
We first inject label-hierarchy knowledge into an arbitrary CNN-based classifier.
We empirically show that availability of such external semantic information in conjunction with the visual semantics from images boosts overall performance.
arXiv Detail & Related papers (2020-04-02T10:22:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.