Learning Discriminative Representations for Multi-Label Image
Recognition
- URL: http://arxiv.org/abs/2107.11159v1
- Date: Fri, 23 Jul 2021 12:10:46 GMT
- Title: Learning Discriminative Representations for Multi-Label Image
Recognition
- Authors: Mohammed Hassanin, Ibrahim Radwan, Salman Khan, Murat Tahtali
- Abstract summary: We propose a unified deep network to learn discriminative features for the multi-label task.
By regularizing the whole network with the proposed loss, the performance of applying the well-known ResNet-101 is improved significantly.
- Score: 13.13795708478267
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-label recognition is a fundamental yet challenging task in
computer vision. Recently, deep learning models have achieved great progress
towards learning discriminative features from input images. However,
conventional approaches are unable to model the inter-class discrepancies among
features in multi-label images, since they are designed to work for image-level
feature discrimination. In this paper, we propose a unified deep network to
learn discriminative features for the multi-label task. Given a multi-label
image, the proposed method first disentangles features corresponding to
different classes. Then, it discriminates between these classes via increasing
the inter-class distance while decreasing the intra-class differences in the
output space. By regularizing the whole network with the proposed loss, the
performance of applying the well-known ResNet-101 is improved significantly.
Extensive experiments have been performed on COCO-2014, VOC2007 and VOC2012
datasets, which demonstrate that the proposed method outperforms
state-of-the-art approaches by a significant margin of 3.5% on the large-scale COCO
dataset. Moreover, analysis of the discriminative feature learning approach
shows that it can be plugged into various types of multi-label methods as a
general module.
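The idea of increasing inter-class distance while decreasing intra-class differences can be illustrated with a minimal sketch. The abstract does not give the paper's actual loss, so the code below is not the authors' formulation. It assumes a generic center-based objective: per-class features are pulled toward their class center, and class centers closer than a margin are pushed apart. The names `discriminative_loss`, `class_centers`, and the `margin` parameter are hypothetical.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def class_centers(features_by_class):
    """Mean feature vector for each class label."""
    centers = {}
    for label, feats in features_by_class.items():
        dim = len(feats[0])
        centers[label] = [sum(f[i] for f in feats) / len(feats) for i in range(dim)]
    return centers


def discriminative_loss(features_by_class, margin=1.0):
    """Illustrative loss (an assumption, not the paper's definition).

    intra: mean squared distance of each feature to its own class center.
    inter: mean squared hinge penalty for pairs of class centers that lie
           closer together than `margin`.
    Returns (total, intra, inter).
    """
    centers = class_centers(features_by_class)

    # Intra-class term: pull features toward their class center.
    total_sq, count = 0.0, 0
    for label, feats in features_by_class.items():
        for f in feats:
            total_sq += squared_distance(f, centers[label])
            count += 1
    intra = total_sq / count

    # Inter-class term: penalize center pairs closer than the margin.
    labels = sorted(centers)
    pairs = [(a, b) for i, a in enumerate(labels) for b in labels[i + 1:]]
    inter = 0.0
    for a, b in pairs:
        dist = math.sqrt(squared_distance(centers[a], centers[b]))
        inter += max(0.0, margin - dist) ** 2
    inter = inter / len(pairs) if pairs else 0.0

    return intra + inter, intra, inter


# Toy usage: two classes whose features were already separated per class.
toy = {
    "cat": [[0.0, 0.0], [0.0, 2.0]],
    "dog": [[4.0, 0.0], [4.0, 2.0]],
}
total, intra, inter = discriminative_loss(toy, margin=1.0)
```

In this toy case the class centers are 4 units apart, so with `margin=1.0` only the intra-class term contributes. In a real network such a term would be added to the classification loss as a regularizer, and the per-class features would come from the disentangling step described in the abstract.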
Related papers
- Multi-label Cluster Discrimination for Visual Representation Learning [27.552024985952166]
We propose a novel Multi-Label Cluster Discrimination method named MLCD to enhance representation learning.
Our method achieves state-of-the-art performance on multiple downstream tasks including linear probe, zero-shot classification, and image-text retrieval.
arXiv Detail & Related papers (2024-07-24T14:54:16Z)
- High-Discriminative Attribute Feature Learning for Generalized Zero-Shot Learning [54.86882315023791]
We propose an innovative approach called High-Discriminative Attribute Feature Learning for Generalized Zero-Shot Learning (HDAFL).
HDAFL utilizes multiple convolutional kernels to automatically learn discriminative regions highly correlated with attributes in images.
We also introduce a Transformer-based attribute discrimination encoder to enhance the discriminative capability among attributes.
arXiv Detail & Related papers (2024-04-07T13:17:47Z)
- DiverseNet: Decision Diversified Semi-supervised Semantic Segmentation Networks for Remote Sensing Imagery [17.690698736544626]
We propose DiverseNet which explores multi-head and multi-model semi-supervised learning algorithms by simultaneously enhancing precision and diversity during training.
The two proposed methods in the DiverseNet family, namely DiverseHead and DiverseModel, both achieve better semantic segmentation performance on four widely used remote sensing imagery datasets.
arXiv Detail & Related papers (2023-11-22T22:20:10Z)
- Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
arXiv Detail & Related papers (2023-09-01T11:15:50Z)
- Reliable Representations Learning for Incomplete Multi-View Partial Multi-Label Classification [78.15629210659516]
In this paper, we propose an incomplete multi-view partial multi-label classification network named RANK.
We break through the view-level weights inherent in existing methods and propose a quality-aware sub-network to dynamically assign quality scores to each view of each sample.
Our model is not only able to handle complete multi-view multi-label datasets, but also works on datasets with missing instances and labels.
arXiv Detail & Related papers (2023-03-30T03:09:25Z)
- Discriminative Feature Learning through Feature Distance Loss [0.0]
This work proposes a novel method that combines variant-rich base models to concentrate on different important image regions for classification.
Experiments on benchmark convolutional neural networks (VGG16, ResNet, AlexNet) and popular datasets (Cifar10, Cifar100, miniImageNet, NEU, BSD, TEX) show our method's effectiveness and generalization ability.
arXiv Detail & Related papers (2022-05-23T20:01:32Z)
- Learning Contrastive Representation for Semantic Correspondence [150.29135856909477]
We propose a multi-level contrastive learning approach for semantic matching.
We show that image-level contrastive learning is a key component to encourage the convolutional features to find correspondence between similar objects.
arXiv Detail & Related papers (2021-09-22T18:34:14Z)
- Multi-Label Image Classification with Contrastive Learning [57.47567461616912]
We show that a direct application of contrastive learning hardly improves performance in multi-label cases.
We propose a novel framework for multi-label classification with contrastive learning in a fully supervised setting.
arXiv Detail & Related papers (2021-07-24T15:00:47Z)
- Learning to Focus: Cascaded Feature Matching Network for Few-shot Image Recognition [38.49419948988415]
Deep networks can learn to accurately recognize objects of a category by training on a large number of images.
A meta-learning challenge known as a low-shot image recognition task comes when only a few images with annotations are available for learning a recognition model for one category.
Our method, called Cascaded Feature Matching Network (CFMN), is proposed to solve this problem.
Experiments for few-shot learning on two standard datasets, miniImageNet and Omniglot, have confirmed the effectiveness of our method.
arXiv Detail & Related papers (2021-01-13T11:37:28Z)
- ReMarNet: Conjoint Relation and Margin Learning for Small-Sample Image Classification [49.87503122462432]
We introduce a novel neural network termed Relation-and-Margin learning Network (ReMarNet).
Our method assembles two networks of different backbones so as to learn the features that can perform excellently in both of the aforementioned two classification mechanisms.
Experiments on four image datasets demonstrate that our approach is effective in learning discriminative features from a small set of labeled samples.
arXiv Detail & Related papers (2020-06-27T13:50:20Z)