Zero-Shot Recognition through Image-Guided Semantic Classification
- URL: http://arxiv.org/abs/2007.11814v1
- Date: Thu, 23 Jul 2020 06:22:40 GMT
- Title: Zero-Shot Recognition through Image-Guided Semantic Classification
- Authors: Mei-Chen Yeh and Fang Li
- Abstract summary: We present a new embedding-based framework for zero-shot learning (ZSL).
Motivated by the binary relevance method for multi-label classification, we propose to inversely learn the mapping between an image and a semantic classifier.
IGSC is conceptually simple and can be realized by a slight enhancement of an existing deep architecture for classification.
- Score: 9.291055558504588
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a new embedding-based framework for zero-shot learning (ZSL). Most
embedding-based methods aim to learn the correspondence between an image
classifier (visual representation) and its class prototype (semantic
representation) for each class. Motivated by the binary relevance method for
multi-label classification, we propose to inversely learn the mapping between
an image and a semantic classifier. Given an input image, the proposed
Image-Guided Semantic Classification (IGSC) method creates a label classifier
that is applied to all label embeddings to determine whether each label belongs
to the input image. Therefore, semantic classifiers are image-adaptive and are
generated during inference. IGSC is conceptually simple and can be realized by
a slight enhancement of an existing deep architecture for classification; yet
it is effective and outperforms state-of-the-art embedding-based generalized
ZSL approaches on standard benchmarks.
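The inference step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the linear classifier generator (`W`, `b`), the sigmoid scoring, the dimensions, and all function names are assumptions made for illustration, and random weights stand in for trained parameters. The sketch shows only the direction of the mapping, from an image to a classifier that is then applied to every label embedding.

```python
import numpy as np

def make_image_guided_classifier(image_feature, W, b):
    # Hypothetical generator: maps an image feature to the weight vector
    # of a linear classifier that lives in the label-embedding space.
    return W @ image_feature + b  # shape: (embed_dim,)

def score_labels(image_feature, label_embeddings, W, b):
    # Apply the image-specific classifier to every label embedding.
    # Each label gets an independent relevance score, in the spirit of
    # the binary relevance method mentioned in the abstract.
    w = make_image_guided_classifier(image_feature, W, b)
    logits = label_embeddings @ w          # shape: (num_labels,)
    return 1.0 / (1.0 + np.exp(-logits))   # sigmoid, one score per label

# Toy data (assumed sizes; random values stand in for learned ones).
rng = np.random.default_rng(0)
feat_dim, embed_dim, num_labels = 8, 5, 4
W = rng.normal(size=(embed_dim, feat_dim))
b = np.zeros(embed_dim)
image_feature = rng.normal(size=feat_dim)
label_embeddings = rng.normal(size=(num_labels, embed_dim))

scores = score_labels(image_feature, label_embeddings, W, b)
predicted = int(np.argmax(scores))  # index of the most relevant label
```

Because the classifier is produced from the image at inference time, unseen classes can be scored simply by adding their embeddings as extra rows of `label_embeddings`, without changing the generator.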
Related papers
- CDUL: CLIP-Driven Unsupervised Learning for Multi-Label Image Classification [23.392746466420128]
This paper presents a CLIP-based unsupervised learning method for annotation-free multi-label image classification.
We take full advantage of the powerful CLIP model and propose a novel approach to extend CLIP for multi-label predictions based on global-local image-text similarity aggregation.
Our method outperforms state-of-the-art unsupervised methods on MS-COCO, PASCAL VOC 2007, PASCAL VOC 2012, and NUS datasets.
arXiv Detail & Related papers (2023-07-31T13:12:02Z)
- Semantic-Aware Dual Contrastive Learning for Multi-label Image Classification [8.387933969327852]
We propose a novel semantic-aware dual contrastive learning framework that incorporates sample-to-sample contrastive learning.
Specifically, we leverage semantic-aware representation learning to extract category-related local discriminative features.
Our proposed method is effective and outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2023-07-19T01:57:31Z) - Language-driven Semantic Segmentation [88.21498323896475]
We present LSeg, a novel model for language-driven semantic image segmentation.
We use a text encoder to compute embeddings of descriptive input labels.
The encoder is trained with a contrastive objective to align pixel embeddings to the text embedding of the corresponding semantic class.
arXiv Detail & Related papers (2022-01-10T18:59:10Z) - A Simple Approach for Zero-Shot Learning based on Triplet Distribution
Embeddings [6.193231258199234]
ZSL aims to recognize unseen classes without labeled training data by exploiting semantic information.
Existing ZSL methods mainly use vectors to represent the embeddings to the semantic space.
We address this issue by leveraging the use of distribution embeddings.
arXiv Detail & Related papers (2021-03-29T20:26:20Z) - Seed the Views: Hierarchical Semantic Alignment for Contrastive
Representation Learning [116.91819311885166]
We propose a hierarchical semantic alignment strategy via expanding the views generated by a single image to Cross-samples and Multi-level representation.
Our method, termed as CsMl, has the ability to integrate multi-level visual representations across samples in a robust way.
arXiv Detail & Related papers (2020-12-04T17:26:24Z) - Grafit: Learning fine-grained image representations with coarse labels [114.17782143848315]
This paper tackles the problem of learning a finer representation than the one provided by training labels.
By jointly leveraging the coarse labels and the underlying fine-grained latent space, it significantly improves the accuracy of category-level retrieval methods.
arXiv Detail & Related papers (2020-11-25T19:06:26Z) - Causal Intervention for Weakly-Supervised Semantic Segmentation [122.1846968696862]
We aim to generate better pixel-level pseudo-masks by using only image-level labels.
We propose a structural causal model to analyze the causalities among images, contexts, and class labels.
Based on it, we develop a new method: Context Adjustment (CONTA), to remove the confounding bias in image-level classification.
arXiv Detail & Related papers (2020-09-26T09:26:29Z) - Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation [128.03739769844736]
Two neural co-attentions are incorporated into the classifier to capture cross-image semantic similarities and differences.
In addition to boosting object pattern learning, the co-attention can leverage context from other related images to improve localization map inference.
Our algorithm sets a new state of the art on all these settings, demonstrating its efficacy and generalizability.
arXiv Detail & Related papers (2020-07-03T21:53:46Z) - Hierarchical Image Classification using Entailment Cone Embeddings [68.82490011036263]
We first inject label-hierarchy knowledge into an arbitrary CNN-based classifier.
We empirically show that availability of such external semantic information in conjunction with the visual semantics from images boosts overall performance.
arXiv Detail & Related papers (2020-04-02T10:22:02Z) - Learning Representations For Images With Hierarchical Labels [1.3579420996461438]
We present a set of methods to leverage information about the semantic hierarchy induced by class labels.
We show that availability of such external semantic information in conjunction with the visual semantics from images boosts overall performance.
Both the CNN classifiers injected with hierarchical information and the embedding-based models outperform a hierarchy-agnostic model on the newly presented, real-world ETH Entomological Collection image dataset.
arXiv Detail & Related papers (2020-04-02T09:56:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.