Attribute-Aware Representation Rectification for Generalized Zero-Shot Learning
- URL: http://arxiv.org/abs/2311.14750v2
- Date: Fri, 1 Dec 2023 15:25:35 GMT
- Title: Attribute-Aware Representation Rectification for Generalized Zero-Shot Learning
- Authors: Zhijie Rao, Jingcai Guo, Xiaocheng Lu, Qihua Zhou, Jie Zhang, Kang Wei, Chenxin Li, Song Guo
- Abstract summary: Generalized Zero-shot Learning (GZSL) has yielded remarkable performance by designing a series of unbiased visual-semantics mappings.
We propose a simple yet effective Attribute-Aware Representation Rectification framework for GZSL, dubbed $\mathbf{(AR)^{2}}$.
- Score: 19.65026043141699
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generalized Zero-shot Learning (GZSL) has yielded remarkable performance by
designing a series of unbiased visual-semantics mappings, wherein, the
precision relies heavily on the completeness of extracted visual features from
both seen and unseen classes. However, the pre-trained feature extractor
commonly adopted in GZSL often struggles to capture the domain-specific traits
of the downstream tasks/datasets needed to provide fine-grained
discriminative features, i.e., domain bias, which hinders the overall
recognition performance, especially for unseen classes. Recent studies
partially address this issue by fine-tuning the feature extractor, but
fine-tuning risks catastrophic forgetting and overfitting. In this paper,
we propose a simple yet effective Attribute-Aware Representation Rectification
framework for GZSL, dubbed $\mathbf{(AR)^{2}}$, to adaptively rectify the
feature extractor to learn novel features while keeping original valuable
features. Specifically, our method consists of two key components, i.e.,
Unseen-Aware Distillation (UAD) and Attribute-Guided Learning (AGL). During
training, UAD uses attention mechanisms to exploit the prior knowledge of
attribute texts shared by both seen and unseen classes, detecting and
maintaining unseen-class-sensitive visual features in a targeted manner;
meanwhile, AGL steers the model to focus on valuable features and suppresses
its fitting to noisy elements in the seen classes through attribute-guided
representation learning.
Extensive experiments on various benchmark datasets demonstrate the
effectiveness of our method.
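To make the two components concrete, here is a minimal PyTorch-style sketch of the mechanism the abstract describes. It is a sketch under stated assumptions, not the authors' implementation: the function names (`attribute_attention`, `uad_loss`, `agl_loss`), the `(B, N, D)` patch-feature layout, and the `(C, A)` class-attribute matrix are all hypothetical stand-ins, and the actual $\mathbf{(AR)^{2}}$ architecture may differ.

```python
# Hypothetical sketch of the (AR)^2 idea; all names are illustrative
# stand-ins, not the authors' code.
import torch
import torch.nn.functional as F


def attribute_attention(visual, attr_emb):
    """Cross-attention: attribute-text embeddings (queries) attend over
    spatial visual features (keys/values).

    visual:   (B, N, D) patch/region features from the extractor
    attr_emb: (A, D)    embeddings of attribute texts shared by seen
                        and unseen classes
    returns:  (B, A, D) attribute-localized visual features
    """
    scores = torch.einsum("ad,bnd->ban", attr_emb, visual) / visual.size(-1) ** 0.5
    weights = scores.softmax(dim=-1)                      # (B, A, N)
    return torch.einsum("ban,bnd->bad", weights, visual)  # (B, A, D)


def uad_loss(student_visual, teacher_visual, attr_emb):
    """Unseen-Aware Distillation: preserve the attribute-sensitive part of
    the frozen pre-trained extractor's features while the student adapts."""
    with torch.no_grad():
        target = attribute_attention(teacher_visual, attr_emb)
    return F.mse_loss(attribute_attention(student_visual, attr_emb), target)


def agl_loss(student_visual, attr_emb, class_attr, labels, tau=0.1):
    """Attribute-Guided Learning: score attribute-localized features against
    per-class attribute signatures, steering the model toward attribute-
    relevant evidence rather than seen-class noise.

    class_attr: (C, A) attribute signature of each class
    """
    attr_feats = attribute_attention(student_visual, attr_emb)   # (B, A, D)
    evidence = torch.einsum("bad,ad->ba", attr_feats, attr_emb)  # (B, A)
    logits = evidence @ class_attr.t() / tau                     # (B, C)
    return F.cross_entropy(logits, labels)
```

During training, `uad_loss` (with the teacher being a frozen copy of the pre-trained extractor) and `agl_loss` would be added to the base objective, so the rectified extractor learns task-specific features without discarding the attribute-sensitive ones that unseen-class recognition depends on.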
Related papers
- CREST: Cross-modal Resonance through Evidential Deep Learning for Enhanced Zero-Shot Learning [48.46511584490582] (2024-04-15)
Zero-shot learning (ZSL) enables the recognition of novel classes by leveraging semantic knowledge transfer from known to unknown categories.
Real-world challenges such as distribution imbalances and attribute co-occurrence hinder the discernment of local variances in images.
We propose CREST, a bidirectional cross-modal ZSL approach, to overcome these challenges.
- High-Discriminative Attribute Feature Learning for Generalized Zero-Shot Learning [54.86882315023791] (2024-04-07)
We propose an innovative approach called High-Discriminative Attribute Feature Learning for Generalized Zero-Shot Learning (HDAFL).
HDAFL utilizes multiple convolutional kernels to automatically learn discriminative regions highly correlated with attributes in images.
We also introduce a Transformer-based attribute discrimination encoder to enhance the discriminative capability among attributes.
- Dual Feature Augmentation Network for Generalized Zero-shot Learning [14.410978100610489] (2023-09-25)
Zero-shot learning (ZSL) aims to infer novel classes without training samples by transferring knowledge from seen classes.
Existing embedding-based approaches for ZSL typically employ attention mechanisms to locate attributes on an image.
We propose a novel Dual Feature Augmentation Network (DFAN), which comprises two feature augmentation modules.
- Hierarchical Visual Primitive Experts for Compositional Zero-Shot Learning [52.506434446439776] (2023-08-08)
Compositional zero-shot learning (CZSL) aims to recognize compositions with prior knowledge of known primitives (attribute and object).
We propose a simple and scalable framework called Composition Transformer (CoT).
Our method achieves SoTA performance on several benchmarks, including MIT-States, C-GQA, and VAW-CZSL.
- Exploiting Semantic Attributes for Transductive Zero-Shot Learning [97.61371730534258] (2023-03-17)
Zero-shot learning aims to recognize unseen classes by generalizing the relation between visual features and semantic attributes learned from the seen classes.
We present a novel transductive ZSL method that produces semantic attributes of the unseen data and imposes them on the generative process.
Experiments on five standard benchmarks show that our method yields state-of-the-art results for zero-shot learning.
- Learning Common Rationale to Improve Self-Supervised Representation for Fine-Grained Visual Recognition Problems [61.11799513362704] (2023-03-03)
We propose learning an additional screening mechanism to identify discriminative clues commonly seen across instances and classes.
We show that a common rationale detector can be learned by simply exploiting the GradCAM induced from the SSL objective (see the sketch after this list).
- Learning Invariant Visual Representations for Compositional Zero-Shot Learning [30.472541551048508] (2022-06-01)
Compositional Zero-Shot Learning (CZSL) aims to recognize novel compositions using knowledge learned from seen-object compositions in the training set.
We propose an invariant feature learning framework to align different domains at the representation and gradient levels.
Experiments on two CZSL benchmarks demonstrate that the proposed method significantly outperforms the previous state-of-the-art.
- FREE: Feature Refinement for Generalized Zero-Shot Learning [86.41074134041394] (2021-07-29)
Generalized zero-shot learning (GZSL) has achieved significant progress, with many efforts dedicated to overcoming the problems of the visual-semantic domain gap and seen-unseen bias.
Most existing methods directly use feature extraction models trained on ImageNet alone, ignoring the cross-dataset bias between ImageNet and GZSL benchmarks.
We propose a simple yet effective GZSL method, termed feature refinement for generalized zero-shot learning (FREE), to tackle the above problem.