Attribute Localization and Revision Network for Zero-Shot Learning
- URL: http://arxiv.org/abs/2310.07548v1
- Date: Wed, 11 Oct 2023 14:50:52 GMT
- Title: Attribute Localization and Revision Network for Zero-Shot Learning
- Authors: Junzhe Xu, Suling Duan, Chenwei Tang, Zhenan He, Jiancheng Lv
- Abstract summary: Zero-shot learning enables the model to recognize unseen categories with the aid of auxiliary semantic information such as attributes.
In this paper, we find that the choice between local and global features is not a zero-sum game; global features can also contribute to the understanding of attributes.
- Score: 13.530912616208722
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Zero-shot learning enables the model to recognize unseen categories with the
aid of auxiliary semantic information such as attributes. Current works
propose to detect attributes from local image regions and align extracted
features with class-level semantics. In this paper, we find that the choice
between local and global features is not a zero-sum game; global features can
also contribute to the understanding of attributes. In addition, aligning
attribute features with class-level semantics ignores potential intra-class
attribute variation. To mitigate these disadvantages, we present the Attribute
Localization and Revision Network. First, we design an Attribute Localization
Module (ALM) to capture both local and global features from image regions; a
novel module, the Scale Control Unit, is incorporated to fuse global and local
representations. Second, we propose an Attribute Revision Module
(ARM), which generates image-level semantics by revising the ground-truth value
of each attribute, compensating for performance degradation caused by ignoring
intra-class variation. Finally, the output of ALM is aligned with the revised
semantics produced by ARM to complete the training process. Comprehensive
experimental results on three widely used benchmarks demonstrate the
effectiveness of our model in the zero-shot prediction task.
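
The abstract names the two modules but gives no implementation details. Below is a minimal PyTorch-style sketch of how they could fit together. The module names follow the paper; everything else (the gating form of the Scale Control Unit, the residual form of the attribute revision, and the MSE alignment objective) is an assumption for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ScaleControlUnit(nn.Module):
    """Fuses local (region-level) and global attribute evidence with a
    learned per-attribute gate. The gating design is an assumption; the
    abstract only states that the unit fuses the two representations."""

    def __init__(self, feat_dim: int, num_attrs: int):
        super().__init__()
        self.gate = nn.Linear(2 * feat_dim, num_attrs)
        self.local_head = nn.Linear(feat_dim, num_attrs)
        self.global_head = nn.Linear(feat_dim, num_attrs)

    def forward(self, local_feats, global_feat):
        # local_feats: (B, R, D) features of R image regions
        # global_feat: (B, D) image-level feature
        local_pool = local_feats.mean(dim=1)  # (B, D)
        g = torch.sigmoid(self.gate(torch.cat([local_pool, global_feat], dim=-1)))
        # Per attribute, g controls how much the global view contributes.
        return g * self.global_head(global_feat) + (1 - g) * self.local_head(local_pool)


class AttributeRevisionModule(nn.Module):
    """Turns class-level ground-truth attributes into image-level semantics
    by predicting a bounded, image-conditioned offset (assumed residual form)."""

    def __init__(self, feat_dim: int, num_attrs: int):
        super().__init__()
        self.offset = nn.Linear(feat_dim, num_attrs)

    def forward(self, global_feat, class_attrs):
        # class_attrs: (B, A) class-level attribute annotations
        return class_attrs + torch.tanh(self.offset(global_feat))


def alignment_loss(alm_out, arm, global_feat, class_attrs):
    # Align ALM predictions with ARM's revised semantics (assumed MSE loss).
    revised = arm(global_feat, class_attrs)
    return F.mse_loss(alm_out, revised)
```

Under this reading, the sigmoid gate lets the model decide, attribute by attribute, how much weight the global view receives, which matches the abstract's claim that local and global features are not a zero-sum choice, while the revision module compensates for intra-class variation by shifting each ground-truth attribute value per image.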
Related papers
- Adaptive Global-Local Representation Learning and Selection for Cross-Domain Facial Expression Recognition [54.334773598942775]
Domain shift poses a significant challenge in Cross-Domain Facial Expression Recognition (CD-FER).
We propose an Adaptive Global-Local Representation Learning and Selection framework.
arXiv Detail & Related papers (2024-01-20T02:21:41Z)
- Dual Feature Augmentation Network for Generalized Zero-shot Learning [14.410978100610489]
Zero-shot learning (ZSL) aims to infer novel classes without training samples by transferring knowledge from seen classes.
Existing embedding-based approaches for ZSL typically employ attention mechanisms to locate attributes on an image.
We propose a novel Dual Feature Augmentation Network (DFAN), which comprises two feature augmentation modules.
arXiv Detail & Related papers (2023-09-25T02:37:52Z)
- Attribute Prototype Network for Any-Shot Learning [113.50220968583353]
We argue that an image representation with integrated attribute localization ability would be beneficial for any-shot, i.e. zero-shot and few-shot, image classification tasks.
We propose a novel representation learning framework that jointly learns global and local features using only class-level attributes.
arXiv Detail & Related papers (2022-04-04T02:25:40Z)
- Region Semantically Aligned Network for Zero-Shot Learning [18.18665627472823]
We propose a Region Semantically Aligned Network (RSAN) which maps local features of unseen classes to their semantic attributes.
We obtain each attribute from a specific region of the output and exploit these attributes for recognition.
Experiments on several standard ZSL datasets reveal the benefit of the proposed RSAN method, which outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-10-14T03:23:40Z)
- Goal-Oriented Gaze Estimation for Zero-Shot Learning [62.52340838817908]
We introduce a novel goal-oriented gaze estimation module (GEM) to improve the discriminative attribute localization.
We aim to predict actual human gaze locations to obtain the visual attention regions for recognizing a novel object guided by attribute descriptions.
This work implies the promising benefits of collecting human gaze datasets and developing automatic gaze estimation algorithms for high-level computer vision tasks.
arXiv Detail & Related papers (2021-03-05T02:14:57Z)
- Attribute Prototype Network for Zero-Shot Learning [113.50220968583353]
We propose a novel zero-shot representation learning framework that jointly learns discriminative global and local features.
Our model points to the visual evidence of the attributes in an image, confirming the improved attribute localization ability of our image representation.
arXiv Detail & Related papers (2020-08-19T06:46:35Z)
- Simple and effective localized attribute representations for zero-shot learning [48.053204004771665]
Zero-shot learning (ZSL) aims to discriminate images from unseen classes by exploiting relations to seen classes via their semantic descriptions.
We propose localizing representations in the semantic/attribute space, with a simple but effective pipeline where localization is implicit.
Our method can be implemented easily and can serve as a new baseline for zero-shot learning.
arXiv Detail & Related papers (2020-06-10T16:46:12Z)
- Global Context-Aware Progressive Aggregation Network for Salient Object Detection [117.943116761278]
We propose a novel network named GCPANet to integrate low-level appearance features, high-level semantic features, and global context features.
We show that the proposed approach outperforms the state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2020-03-02T04:26:10Z)