Related papers: Combining Metric Learning and Attention Heads For Accurate and Efficient Multilabel Image Classification

URL: http://arxiv.org/abs/2209.06585v1
Date: Wed, 14 Sep 2022 12:06:47 GMT
Title: Combining Metric Learning and Attention Heads For Accurate and Efficient Multilabel Image Classification
Authors: Kirill Prokofiev and Vladislav Sovrasov
Abstract summary: We revisit two popular approaches to multilabel classification: transformer-based heads and graph-based branches that process label relation information. Although transformer-based heads are considered to achieve better results than graph-based branches, we argue that with the proper training strategy graph-based methods can demonstrate just a small accuracy drop.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Multi-label image classification allows predicting a set of labels from a given image. Unlike multiclass classification, where only one label per image is assigned, such a setup is applicable to a broader range of applications. In this work we revisit two popular approaches to multilabel classification: transformer-based heads and graph-based branches that process label relation information. Although transformer-based heads are considered to achieve better results than graph-based branches, we argue that with the proper training strategy graph-based methods can demonstrate just a small accuracy drop while spending fewer computational resources on inference. In our training strategy, instead of Asymmetric Loss (ASL), which is the de facto standard for multilabel classification, we introduce its modification acting in the angle space. It implicitly learns a proxy feature vector on the unit hypersphere for each class, providing better discrimination ability than binary cross-entropy loss does on unnormalized features. With the proposed loss and training strategy, we obtain SOTA results among single-modality methods on widespread multilabel classification benchmarks such as MS-COCO, PASCAL-VOC, NUS-Wide and Visual Genome 500. Source code of our method is available as a part of the OpenVINO Training Extensions: https://github.com/openvinotoolkit/deep-object-reid/tree/multilabel
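
Illustrative sketch: the abstract describes a loss that acts in the angle space and learns one proxy vector per class on the unit hypersphere. The NumPy snippet below is a minimal sketch of that general idea only, not the authors' implementation. It assumes the per-class logit is a scaled cosine between L2-normalized features and L2-normalized proxies, and that the asymmetry comes from separate focusing exponents for positives and negatives. The function names and the parameters `scale`, `gamma_pos` and `gamma_neg` are hypothetical; the exact formulation in the paper may differ.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Project vectors onto the unit hypersphere."""
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)


def angular_asymmetric_loss(features, proxies, targets,
                            scale=20.0, gamma_pos=0.0, gamma_neg=1.0):
    """Hypothetical angle-space multilabel loss (illustrative only).

    features: (N, D) image embeddings
    proxies:  (C, D) one learnable proxy vector per class
    targets:  (N, C) multi-hot labels in {0, 1}
    """
    # Cosine of the angle between each embedding and each class proxy.
    cos = l2_normalize(features) @ l2_normalize(proxies).T  # shape (N, C)
    # Assumed: a sigmoid over the scaled cosine gives per-class probabilities.
    p = 1.0 / (1.0 + np.exp(-scale * cos))
    eps = 1e-8
    # Asymmetric focusing: positives and negatives are down-weighted differently.
    pos_term = targets * (1.0 - p) ** gamma_pos * np.log(p + eps)
    neg_term = (1.0 - targets) * p ** gamma_neg * np.log(1.0 - p + eps)
    return float(-(pos_term + neg_term).sum(axis=1).mean())
```

Because both features and proxies are normalized, the loss in this sketch depends only on the angles between them, so rescaling an embedding leaves the value unchanged.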

Related papers

Dual-level Fuzzy Learning with Patch Guidance for Image Ordinal Regression [8.538034422744005]
Ordinal regression bridges regression and classification by assigning objects to ordered classes. Current approaches are limited by the availability of only image-level ordinal labels. We propose a Dual-level Fuzzy Learning with Patch Guidance framework, named DFPG.
arXiv Detail & Related papers (2025-05-09T07:01:14Z)
Multi-label Cluster Discrimination for Visual Representation Learning [27.552024985952166]
We propose a novel Multi-Label Cluster Discrimination method named MLCD to enhance representation learning. Our method achieves state-of-the-art performance on multiple downstream tasks including linear probe, zero-shot classification, and image-text retrieval.
arXiv Detail & Related papers (2024-07-24T14:54:16Z)
UniDEC : Unified Dual Encoder and Classifier Training for Extreme Multi-Label Classification [42.36546066941635]
Extreme Multi-label Classification (XMC) involves predicting a subset of relevant labels from an extremely large label space. This work proposes UniDEC, a novel end-to-end trainable framework which trains the dual encoder and classifier together.
arXiv Detail & Related papers (2024-05-04T17:27:51Z)
TagCLIP: A Local-to-Global Framework to Enhance Open-Vocabulary Multi-Label Classification of CLIP Without Training [29.431698321195814]
Contrastive Language-Image Pre-training (CLIP) has demonstrated impressive capabilities in open-vocabulary classification. CLIP shows poor performance on multi-label datasets because the global feature tends to be dominated by the most prominent class. We propose a local-to-global framework to obtain image tags.
arXiv Detail & Related papers (2023-12-20T08:15:40Z)
Graph Attention Transformer Network for Multi-Label Image Classification [50.0297353509294]
We propose a general framework for multi-label image classification that can effectively mine complex inter-label relationships. Our proposed methods can achieve state-of-the-art performance on three datasets.
arXiv Detail & Related papers (2022-03-08T12:39:05Z)
PLM: Partial Label Masking for Imbalanced Multi-label Classification [59.68444804243782]
Neural networks trained on real-world datasets with long-tailed label distributions are biased towards frequent classes and perform poorly on infrequent classes. We propose a method, Partial Label Masking (PLM), which utilizes this ratio during training. Our method achieves strong performance when compared to existing methods on both multi-label (MultiMNIST and MSCOCO) and single-label (imbalanced CIFAR-10 and CIFAR-100) image classification datasets.
arXiv Detail & Related papers (2021-05-22T18:07:56Z)
All Labels Are Not Created Equal: Enhancing Semi-supervision via Label Grouping and Co-training [32.45488147013166]
Pseudo-labeling is a key component in semi-supervised learning (SSL). We propose SemCo, a method which leverages label semantics and co-training to address this problem. We show that our method achieves state-of-the-art performance across various SSL tasks, including a 5.6% accuracy improvement on the Mini-ImageNet dataset with 1000 labeled examples.
arXiv Detail & Related papers (2021-04-12T07:33:16Z)
Generative Multi-Label Zero-Shot Learning [136.17594611722285]
Multi-label zero-shot learning strives to classify images into multiple unseen categories for which no data is available during training. Our work is the first to tackle the problem of multi-label feature synthesis in the (generalized) zero-shot setting. Our cross-level fusion-based generative approach outperforms the state-of-the-art on all three datasets.
arXiv Detail & Related papers (2021-01-27T18:56:46Z)
Grafit: Learning fine-grained image representations with coarse labels [114.17782143848315]
This paper tackles the problem of learning a finer representation than the one provided by training labels. By jointly leveraging the coarse labels and the underlying fine-grained latent space, it significantly improves the accuracy of category-level retrieval methods.
arXiv Detail & Related papers (2020-11-25T19:06:26Z)
Knowledge-Guided Multi-Label Few-Shot Learning for General Image Recognition [75.44233392355711]
The KGGR framework combines prior knowledge of statistical label correlations with deep neural networks. It first builds a structured knowledge graph to correlate different labels based on statistical label co-occurrence. Then, it introduces the label semantics to guide learning semantic-specific features. It exploits a graph propagation network to explore graph node interactions.
arXiv Detail & Related papers (2020-09-20T15:05:29Z)
Meta Learning for Few-Shot One-class Classification [0.0]
We formulate the learning of meaningful features for one-class classification as a meta-learning problem. To learn these representations, we require only multiclass data from similar tasks. We validate our approach by adapting few-shot classification datasets to the few-shot one-class classification scenario.
arXiv Detail & Related papers (2020-09-11T11:35:28Z)
Multi-label Zero-shot Classification by Learning to Transfer from External Knowledge [36.04579549557464]
Multi-label zero-shot classification aims to predict multiple unseen class labels for an input image. This paper introduces a novel multi-label zero-shot classification framework by learning to transfer from external knowledge.
arXiv Detail & Related papers (2020-07-30T17:26:46Z)
Unsupervised Person Re-identification via Multi-label Classification [55.65870468861157]
This paper formulates unsupervised person ReID as a multi-label classification task to progressively seek true labels. Our method starts by assigning each person image a single-class label, then evolves to multi-label classification by leveraging the updated ReID model for label prediction. To boost the ReID model training efficiency in multi-label classification, we propose the memory-based multi-label classification loss (MMCL).
arXiv Detail & Related papers (2020-04-20T12:13:43Z)

This list is automatically generated from the titles and abstracts of the papers on this site.