Semantic Diversity Learning for Zero-Shot Multi-label Classification
- URL: http://arxiv.org/abs/2105.05926v1
- Date: Wed, 12 May 2021 19:39:07 GMT
- Title: Semantic Diversity Learning for Zero-Shot Multi-label Classification
- Authors: Avi Ben-Cohen, Nadav Zamir, Emanuel Ben Baruch, Itamar Friedman, Lihi Zelnik-Manor
- Abstract summary: This study introduces an end-to-end model training for multi-label zero-shot learning.
We propose to use an embedding matrix having principal embedding vectors trained using a tailored loss function.
In addition, during training, we suggest up-weighting in the loss function image samples presenting higher semantic diversity to encourage the diversity of the embedding matrix.
- Score: 14.480713752871523
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training a neural network model for recognizing multiple labels associated
with an image, including identifying unseen labels, is challenging, especially
for images that portray numerous semantically diverse labels. As challenging as
this task is, it is an essential task to tackle since it represents many
real-world cases, such as image retrieval of natural images. We argue that
using a single embedding vector to represent an image, as commonly practiced,
is not sufficient to rank both relevant seen and unseen labels accurately. This
study introduces an end-to-end model training for multi-label zero-shot
learning that supports semantic diversity of the images and labels. We propose
to use an embedding matrix having principal embedding vectors trained using a
tailored loss function. In addition, during training, we suggest up-weighting
in the loss function image samples presenting higher semantic diversity to
encourage the diversity of the embedding matrix. Extensive experiments show
that our proposed method improves the zero-shot model's quality in tag-based
image retrieval achieving SoTA results on several common datasets (NUS-Wide,
COCO, Open Images).
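The two ideas in the abstract, scoring labels against several embedding vectors per image and up-weighting semantically diverse samples in the loss, can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the function names, the use of total per-dimension variance of the positive labels' word vectors as the diversity measure, the pairwise hinge ranking loss, and the `1 + diversity` weighting are all hypothetical choices.

```python
import numpy as np


def label_scores(embedding_matrix, label_vectors):
    """Score each label by its best match over the image's embedding vectors.

    embedding_matrix: (k, d) array, k embedding vectors for one image.
    label_vectors:    (n_labels, d) array of label word vectors.
    """
    # (n_labels, k) dot products, then keep the best-matching vector per label.
    return (label_vectors @ embedding_matrix.T).max(axis=1)


def semantic_diversity(positive_label_vectors):
    """Assumed diversity measure: total per-dimension variance of the
    positive labels' word vectors (0 for a single label)."""
    return float(np.var(positive_label_vectors, axis=0).sum())


def weighted_ranking_loss(scores, is_positive, diversity, margin=1.0):
    """Pairwise hinge ranking loss, scaled by (1 + diversity).

    is_positive must be a boolean array with one entry per label.
    """
    pos = scores[is_positive]
    neg = scores[~is_positive]
    # Penalise each (positive, negative) pair whose gap is below `margin`.
    hinge = np.maximum(0.0, margin + neg[None, :] - pos[:, None])
    return (1.0 + diversity) * hinge.mean()


# Toy example: two embedding vectors, three labels, the first two positive.
A = np.eye(2)
labels = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
positive = np.array([True, True, False])

scores = label_scores(A, labels)
diversity = semantic_diversity(labels[positive])
loss = weighted_ranking_loss(scores, positive, diversity, margin=3.0)
```

In this toy case a single embedding vector could not score both positive labels at 1, since they are orthogonal, while the two-row matrix can. That is the intuition behind using an embedding matrix instead of one vector.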
Related papers
- Towards Effective Multi-Label Recognition Attacks via Knowledge Graph
Consistency [33.250544869840155]
We show that the naive extensions of multi-class attacks to the multi-label setting lead to violating label relationships.
We propose a graph-consistent multi-label attack framework, which searches for small image perturbations that lead to misclassifying a desired target set.
arXiv Detail & Related papers (2022-07-11T19:08:32Z)
- Dual-Perspective Semantic-Aware Representation Blending for Multi-Label
Image Recognition with Partial Labels [70.36722026729859]
We propose a dual-perspective semantic-aware representation blending (DSRB) that blends multi-granularity category-specific semantic representation across different images.
The proposed DSRB consistently outperforms current state-of-the-art algorithms on all proportion label settings.
arXiv Detail & Related papers (2022-05-26T00:33:44Z)
- Multi-Label Image Classification with Contrastive Learning [57.47567461616912]
We show that a direct application of contrastive learning hardly improves performance in multi-label cases.
We propose a novel framework for multi-label classification with contrastive learning in a fully supervised setting.
arXiv Detail & Related papers (2021-07-24T15:00:47Z)
- Mixed Supervision Learning for Whole Slide Image Classification [88.31842052998319]
We propose a mixed supervision learning framework for super high-resolution images.
During the patch training stage, this framework can make use of coarse image-level labels to refine self-supervised learning.
A comprehensive strategy is proposed to suppress pixel-level false positives and false negatives.
arXiv Detail & Related papers (2021-07-02T09:46:06Z)
- Multi-layered Semantic Representation Network for Multi-label Image
Classification [8.17894017454724]
Multi-label image classification (MLIC) is a fundamental and practical task, which aims to assign multiple possible labels to an image.
In recent years, many deep convolutional neural network (CNN) based approaches have been proposed which model label correlations.
This paper advances this research direction by improving the modeling of label correlations and the learning of semantic representations.
arXiv Detail & Related papers (2021-06-22T08:04:22Z)
- Multi-Label Learning from Single Positive Labels [37.17676289125165]
Predicting all applicable labels for a given image is known as multi-label classification.
We show that it is possible to approach the performance of fully labeled classifiers despite training with significantly fewer confirmed labels.
arXiv Detail & Related papers (2021-06-17T17:58:04Z)
- Semantic Segmentation with Generative Models: Semi-Supervised Learning
and Strong Out-of-Domain Generalization [112.68171734288237]
We propose a novel framework for discriminative pixel-level tasks using a generative model of both images and labels.
We learn a generative adversarial network that captures the joint image-label distribution and is trained efficiently using a large set of unlabeled images.
We demonstrate strong in-domain performance compared to several baselines, and are the first to showcase extreme out-of-domain generalization.
arXiv Detail & Related papers (2021-04-12T21:41:25Z)
- Grafit: Learning fine-grained image representations with coarse labels [114.17782143848315]
This paper tackles the problem of learning a finer representation than the one provided by training labels.
By jointly leveraging the coarse labels and the underlying fine-grained latent space, it significantly improves the accuracy of category-level retrieval methods.
arXiv Detail & Related papers (2020-11-25T19:06:26Z)
- Knowledge-Guided Multi-Label Few-Shot Learning for General Image
Recognition [75.44233392355711]
The KGGR framework exploits prior knowledge of statistical label correlations with deep neural networks.
It first builds a structured knowledge graph to correlate different labels based on statistical label co-occurrence.
Then, it introduces the label semantics to guide learning semantic-specific features.
It exploits a graph propagation network to explore graph node interactions.
arXiv Detail & Related papers (2020-09-20T15:05:29Z)
- SSKD: Self-Supervised Knowledge Distillation for Cross Domain Adaptive
Person Re-Identification [25.96221714337815]
Domain adaptive person re-identification (re-ID) is a challenging task due to the large discrepancy between the source domain and the target domain.
Existing methods mainly attempt to generate pseudo labels for unlabeled target images by clustering algorithms.
We propose a Self-Supervised Knowledge Distillation (SSKD) technique containing two modules, the identity learning and the soft label learning.
arXiv Detail & Related papers (2020-09-13T10:12:02Z)