Graph Attention Transformer Network for Multi-Label Image Classification
- URL: http://arxiv.org/abs/2203.04049v2
- Date: Mon, 15 Jan 2024 10:44:57 GMT
- Title: Graph Attention Transformer Network for Multi-Label Image Classification
- Authors: Jin Yuan, Shikai Chen, Yao Zhang, Zhongchao Shi, Xin Geng, Jianping Fan, Yong Rui
- Abstract summary: We propose a general framework for multi-label image classification that can effectively mine complex inter-label relationships.
Our proposed methods can achieve state-of-the-art performance on three datasets.
- Score: 50.0297353509294
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-label classification aims to recognize multiple objects or attributes
from images. However, it is challenging to learn proper label graphs that
effectively characterize inter-label correlations or dependencies. Current
methods often use the co-occurrence probability of labels based on the training
set as the adjacency matrix to model this correlation, which is greatly limited
by the dataset and affects the model's generalization ability. In this paper,
we propose a Graph Attention Transformer Network (GATN), a general framework
for multi-label image classification that can effectively mine complex
inter-label relationships. First, we use the cosine similarity based on the
label word embedding as the initial correlation matrix, which can represent
rich semantic information. Subsequently, we design a graph attention
transformer layer that transfers this adjacency matrix to adapt it to the current
domain. Our extensive experiments have demonstrated that our proposed methods
can achieve state-of-the-art performance on three datasets.
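
The abstract contrasts two ways of building the label correlation matrix: co-occurrence probabilities estimated from the training set, and cosine similarity between label word embeddings. The following is a minimal sketch of both, not the authors' implementation. The function names, the toy three-dimensional "embeddings", and the toy annotations are all hypothetical and chosen only for illustration; the graph attention transformer layer that adapts the matrix to the target domain is not covered here.

```python
import math


def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarity between label embedding vectors.

    Assumes every vector is non-zero (a zero vector has no direction).
    """
    norms = [math.sqrt(sum(x * x for x in vec)) for vec in embeddings]
    n = len(embeddings)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(embeddings[i], embeddings[j]))
            matrix[i][j] = dot / (norms[i] * norms[j])
    return matrix


def cooccurrence_matrix(label_sets, num_labels):
    """Estimate P(label j present | label i present) from training annotations."""
    counts = [0] * num_labels
    pair_counts = [[0] * num_labels for _ in range(num_labels)]
    for labels in label_sets:
        for i in labels:
            counts[i] += 1
            for j in labels:
                if i != j:
                    pair_counts[i][j] += 1
    return [
        [pair_counts[i][j] / counts[i] if counts[i] else 0.0
         for j in range(num_labels)]
        for i in range(num_labels)
    ]


# Made-up 3-d vectors standing in for real label word embeddings.
toy_embeddings = [
    [1.0, 0.0, 0.0],  # hypothetical label 0
    [1.0, 1.0, 0.0],  # hypothetical label 1
    [0.0, 0.0, 2.0],  # hypothetical label 2
]
initial_correlation = cosine_similarity_matrix(toy_embeddings)

# Made-up training annotations: each set lists the labels present in one image.
toy_annotations = [{0, 1}, {0, 1}, {0}, {2}]
cooccurrence = cooccurrence_matrix(toy_annotations, num_labels=3)
```

In this sketch the cosine-similarity matrix is symmetric and depends only on the embeddings, whereas the co-occurrence matrix is asymmetric and changes with the training annotations, which is the dataset dependence the abstract points to.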
Related papers
- Semantic-Aware Graph Matching Mechanism for Multi-Label Image Recognition [21.36538164675385]
Multi-label image recognition aims to predict a set of labels that are present in an image.
In this paper, we treat each image as a bag of instances, and formulate the task of multi-label image recognition as an instance-label matching selection problem.
We propose an innovative Semantic-aware Graph Matching framework for Multi-Label image recognition (ML-SGM).
arXiv Detail & Related papers (2023-04-21T23:48:01Z)
- Multi-label Classification with High-rank and High-order Label Correlations [62.39748565407201]
Previous methods capture the high-order label correlations mainly by transforming the label matrix to a latent label space with low-rank matrix factorization.
We propose a simple yet effective method to depict the high-order label correlations explicitly, and at the same time maintain the high-rank of the label matrix.
Comparative studies over twelve benchmark data sets validate the effectiveness of the proposed algorithm in multi-label classification.
arXiv Detail & Related papers (2022-07-09T05:15:31Z)
- Heterogeneous Semantic Transfer for Multi-label Recognition with Partial Labels [70.45813147115126]
Multi-label image recognition with partial labels (MLR-PL) may greatly reduce the cost of annotation and thus facilitate large-scale MLR.
We find that strong semantic correlations exist within each image and across different images.
These correlations can help transfer the knowledge possessed by the known labels to retrieve the unknown labels.
arXiv Detail & Related papers (2022-05-23T08:37:38Z)
- General Multi-label Image Classification with Transformers [30.58248625606648]
We propose the Classification Transformer (C-Tran) to exploit the complex dependencies among visual features and labels.
A key ingredient of our method is a label mask training objective that uses a ternary encoding scheme to represent the state of the labels.
Our model shows state-of-the-art performance on challenging datasets such as COCO and Visual Genome.
arXiv Detail & Related papers (2020-11-27T23:20:35Z)
- Knowledge-Guided Multi-Label Few-Shot Learning for General Image Recognition [75.44233392355711]
The KGGR framework exploits prior knowledge of statistical label correlations with deep neural networks.
It first builds a structured knowledge graph to correlate different labels based on statistical label co-occurrence.
Then, it introduces the label semantics to guide learning semantic-specific features.
It exploits a graph propagation network to explore graph node interactions.
arXiv Detail & Related papers (2020-09-20T15:05:29Z)
- Instance-Aware Graph Convolutional Network for Multi-Label Classification [55.131166957803345]
Graph convolutional neural networks (GCNs) have effectively boosted the multi-label image recognition task.
We propose an instance-aware graph convolutional neural network (IA-GCN) framework for multi-label classification.
arXiv Detail & Related papers (2020-08-19T12:49:28Z)
- Hierarchical Image Classification using Entailment Cone Embeddings [68.82490011036263]
We first inject label-hierarchy knowledge into an arbitrary CNN-based classifier.
We empirically show that the availability of such external semantic information in conjunction with the visual semantics from images boosts overall performance.
arXiv Detail & Related papers (2020-04-02T10:22:02Z)
- Multi-Label Text Classification using Attention-based Graph Neural Network [0.0]
A graph attention network-based model is proposed to capture the attentive dependency structure among the labels.
The proposed model achieves similar or better performance compared to the previous state-of-the-art models.
arXiv Detail & Related papers (2020-03-22T17:12:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.