Attribute Prototype Network for Any-Shot Learning
- URL: http://arxiv.org/abs/2204.01208v1
- Date: Mon, 4 Apr 2022 02:25:40 GMT
- Title: Attribute Prototype Network for Any-Shot Learning
- Authors: Wenjia Xu, Yongqin Xian, Jiuniu Wang, Bernt Schiele, Zeynep Akata
- Abstract summary: We argue that an image representation with integrated attribute localization ability would be beneficial for any-shot, i.e. zero-shot and few-shot, image classification tasks.
We propose a novel representation learning framework that jointly learns global and local features using only class-level attributes.
- Score: 113.50220968583353
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Any-shot image classification allows recognizing novel classes with only a
few or even zero samples. For the task of zero-shot learning, visual attributes
have been shown to play an important role, while in the few-shot regime, the
effect of attributes is under-explored. To better transfer attribute-based
knowledge from seen to unseen classes, we argue that an image representation
with integrated attribute localization ability would be beneficial for
any-shot, i.e. zero-shot and few-shot, image classification tasks. To this end,
we propose a novel representation learning framework that jointly learns
discriminative global and local features using only class-level attributes.
While a visual-semantic embedding layer learns global features, local features
are learned through an attribute prototype network that simultaneously
regresses and decorrelates attributes from intermediate features. Furthermore,
we introduce a zoom-in module that localizes and crops the informative regions
to encourage the network to learn informative features explicitly. We show that
our locality augmented image representations achieve a new state-of-the-art on
challenging benchmarks, i.e. CUB, AWA2, and SUN. As an additional benefit, our
model points to the visual evidence of the attributes in an image, confirming
the improved attribute localization ability of our image representation. The
attribute localization is evaluated quantitatively with ground truth part
annotations, qualitatively with visualizations, and through well-designed user
studies.
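To make the described architecture concrete, the sketch below illustrates one way an attribute prototype head could work: each attribute gets a learnable prototype that is matched against intermediate CNN features, the best-matching location regresses the class-level attribute value, and a decorrelation term keeps prototypes from collapsing onto each other. This is a rough sketch, not the authors' released code; the class names, dimensions, and the orthogonality-style decorrelation surrogate are assumptions for illustration (the paper's actual decorrelation uses attribute-group relations).

```python
# Minimal sketch of an attribute-prototype head (illustrative, not the official APN code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttributePrototypeHead(nn.Module):
    def __init__(self, feat_dim: int, num_attributes: int):
        super().__init__()
        # one learnable prototype vector per attribute, compared against local features
        self.prototypes = nn.Parameter(torch.randn(num_attributes, feat_dim) * 0.01)

    def forward(self, feat_map: torch.Tensor):
        # feat_map: (B, C, H, W) intermediate CNN features
        B, C, H, W = feat_map.shape
        local = feat_map.flatten(2)                                 # (B, C, H*W)
        # similarity between every prototype and every spatial location
        sim = torch.einsum('kc,bcn->bkn', self.prototypes, local)   # (B, K, H*W)
        # max-pool over locations: the best-matching region predicts the attribute value
        attr_pred, _ = sim.max(dim=2)                                # (B, K)
        # per-attribute similarity maps, usable as localization evidence
        attn = sim.reshape(B, -1, H, W)                              # (B, K, H, W)
        return attr_pred, attn

def apn_losses(attr_pred, class_attributes, prototypes):
    # regression: predicted attribute scores should match the class-level attribute vector
    reg_loss = F.mse_loss(attr_pred, class_attributes)
    # decorrelation surrogate: push prototypes of different attributes toward orthogonality
    P = F.normalize(prototypes, dim=1)
    gram = P @ P.t()
    decorr_loss = (gram - torch.eye(gram.size(0), device=gram.device)).pow(2).mean()
    return reg_loss, decorr_loss
```

In this reading, the per-attribute similarity maps double as the visual evidence for attribute localization mentioned in the abstract, while a separate global visual-semantic embedding branch and the zoom-in cropping module would be trained alongside these losses.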
Related papers
- High-Discriminative Attribute Feature Learning for Generalized Zero-Shot Learning [54.86882315023791]
We propose an innovative approach called High-Discriminative Attribute Feature Learning for Generalized Zero-Shot Learning (HDAFL).
HDAFL utilizes multiple convolutional kernels to automatically learn discriminative regions highly correlated with attributes in images.
We also introduce a Transformer-based attribute discrimination encoder to enhance the discriminative capability among attributes.
arXiv Detail & Related papers (2024-04-07T13:17:47Z)
- Attribute Localization and Revision Network for Zero-Shot Learning [13.530912616208722]
Zero-shot learning enables the model to recognize unseen categories with the aid of auxiliary semantic information such as attributes.
In this paper, we find that the choice between local and global features is not a zero-sum game; global features can also contribute to the understanding of attributes.
arXiv Detail & Related papers (2023-10-11T14:50:52Z)
- Dual Feature Augmentation Network for Generalized Zero-shot Learning [14.410978100610489]
Zero-shot learning (ZSL) aims to infer novel classes without training samples by transferring knowledge from seen classes.
Existing embedding-based approaches for ZSL typically employ attention mechanisms to locate attributes on an image.
We propose a novel Dual Feature Augmentation Network (DFAN), which comprises two feature augmentation modules.
arXiv Detail & Related papers (2023-09-25T02:37:52Z)
- Shaping Visual Representations with Attributes for Few-Shot Learning [5.861206243996454]
Few-shot recognition aims to recognize novel categories under low-data regimes.
Recent metric-learning based few-shot learning methods have achieved promising performance.
We propose attribute-shaped learning (ASL), which can normalize visual representations to predict attributes for query images.
arXiv Detail & Related papers (2021-12-13T03:16:19Z)
- Region Semantically Aligned Network for Zero-Shot Learning [18.18665627472823]
We propose a Region Semantically Aligned Network (RSAN) which maps local features of unseen classes to their semantic attributes.
We obtain each attribute from a specific region of the output and exploit these attributes for recognition.
Experiments on several standard ZSL datasets reveal the benefit of the proposed RSAN method, outperforming state-of-the-art methods.
arXiv Detail & Related papers (2021-10-14T03:23:40Z)
- Goal-Oriented Gaze Estimation for Zero-Shot Learning [62.52340838817908]
We introduce a novel goal-oriented gaze estimation module (GEM) to improve the discriminative attribute localization.
We aim to predict the actual human gaze location to get the visual attention regions for recognizing a novel object guided by attribute description.
This work implies the promising benefits of collecting human gaze datasets and of automatic gaze estimation algorithms for high-level computer vision tasks.
arXiv Detail & Related papers (2021-03-05T02:14:57Z)
- Attribute Prototype Network for Zero-Shot Learning [113.50220968583353]
We propose a novel zero-shot representation learning framework that jointly learns discriminative global and local features.
Our model points to the visual evidence of the attributes in an image, confirming the improved attribute localization ability of our image representation.
arXiv Detail & Related papers (2020-08-19T06:46:35Z)
- Simple and effective localized attribute representations for zero-shot learning [48.053204004771665]
Zero-shot learning (ZSL) aims to discriminate images from unseen classes by exploiting relations to seen classes via their semantic descriptions.
We propose localizing representations in the semantic/attribute space, with a simple but effective pipeline where localization is implicit.
Our method can be implemented easily and can serve as a new baseline for zero-shot learning.
arXiv Detail & Related papers (2020-06-10T16:46:12Z)
- CompGuessWhat?!: A Multi-task Evaluation Framework for Grounded Language Learning [78.3857991931479]
We present GROLLA, an evaluation framework for Grounded Language Learning with Attributes.
We also propose a new dataset CompGuessWhat?! as an instance of this framework for evaluating the quality of learned neural representations.
arXiv Detail & Related papers (2020-06-03T11:21:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.