Bias-Eliminated Semantic Refinement for Any-Shot Learning
- URL: http://arxiv.org/abs/2202.04827v1
- Date: Thu, 10 Feb 2022 04:15:50 GMT
- Title: Bias-Eliminated Semantic Refinement for Any-Shot Learning
- Authors: Liangjun Feng, Chunhui Zhao, and Xi Li
- Abstract summary: We refine the coarse-grained semantic description for any-shot learning tasks.
A new model, namely, the semantic refinement Wasserstein generative adversarial network (SRWGAN) model, is designed.
We extensively evaluate model performance on six benchmark datasets.
- Score: 27.374052527155623
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When training samples are scarce, the semantic embedding technique, i.e.,
describing class labels with attributes, provides a condition to generate
visual features for unseen objects by transferring the knowledge from seen
objects. However, semantic descriptions are usually obtained in an external
paradigm, such as manual annotation, resulting in weak consistency between
descriptions and visual features. In this paper, we refine the coarse-grained
semantic description for any-shot learning tasks, i.e., zero-shot learning (ZSL),
generalized zero-shot learning (GZSL), and few-shot learning (FSL). A new
model, namely, the semantic refinement Wasserstein generative adversarial
network (SRWGAN) model, is designed with the proposed multihead representation
and hierarchical alignment techniques. Unlike conventional methods, semantic
refinement is performed with the aim of identifying a bias-eliminated condition
for disjoint-class feature generation and is applicable in both inductive and
transductive settings. We extensively evaluate model performance on six
benchmark datasets and observe state-of-the-art results for any-shot learning;
e.g., we obtain 70.2% harmonic accuracy for the Caltech-UCSD Birds (CUB) dataset
and 82.2% harmonic accuracy for the Oxford Flowers (FLO) dataset in the
standard GZSL setting. Various visualizations are also provided to show the
bias-eliminated generation of SRWGAN. Our code is available.
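The abstract describes generating visual features for unseen classes from their attribute descriptions with a Wasserstein GAN. Below is a minimal, hypothetical sketch of an attribute-conditioned generator in that family; it is a generic illustration only, and SRWGAN's semantic refinement, multihead representation, and hierarchical alignment are not modeled. The dimensions (312-dim CUB attributes, 2048-dim ResNet features) are common GZSL conventions, assumed here.

```python
import torch
import torch.nn as nn

# Generic attribute-conditioned generator for feature synthesis (a sketch,
# not SRWGAN itself).
class AttributeConditionedGenerator(nn.Module):
    def __init__(self, attr_dim: int, noise_dim: int, feat_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(attr_dim + noise_dim, 4096),
            nn.LeakyReLU(0.2),
            nn.Linear(4096, feat_dim),
            nn.ReLU(),  # CNN features are non-negative after ReLU
        )

    def forward(self, attrs: torch.Tensor, noise: torch.Tensor) -> torch.Tensor:
        # Concatenate class semantics with noise, map into visual feature space.
        return self.net(torch.cat([attrs, noise], dim=1))

g = AttributeConditionedGenerator(attr_dim=312, noise_dim=312, feat_dim=2048)
fake = g(torch.randn(8, 312), torch.randn(8, 312))
print(fake.shape)  # torch.Size([8, 2048]) -- synthetic unseen-class features
```

For the metric quoted above: harmonic accuracy is the standard GZSL score, the harmonic mean of per-class accuracy on seen and unseen classes. A minimal sketch (the 75%/66% operating point below is illustrative only, not the paper's reported split):

```python
def harmonic_accuracy(acc_seen: float, acc_unseen: float) -> float:
    """GZSL harmonic mean H = 2 * As * Au / (As + Au)."""
    if acc_seen + acc_unseen == 0:
        return 0.0
    return 2.0 * acc_seen * acc_unseen / (acc_seen + acc_unseen)

# Example only: 75% seen and 66% unseen accuracy give H of about 70.2%.
print(f"{harmonic_accuracy(0.75, 0.66):.3f}")  # 0.702
```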
Related papers
- SEER-ZSL: Semantic Encoder-Enhanced Representations for Generalized Zero-Shot Learning [0.7420433640907689]
Generalized Zero-Shot Learning (GZSL) recognizes unseen classes by transferring knowledge from the seen classes.
This paper introduces a dual strategy to address the generalization gap.
arXiv Detail & Related papers (2023-12-20T15:18:51Z)
- Hierarchical Visual Primitive Experts for Compositional Zero-Shot Learning [52.506434446439776]
Compositional zero-shot learning (CZSL) aims to recognize unseen compositions using prior knowledge of known primitives (attributes and objects).
We propose a simple and scalable framework called Composition Transformer (CoT) to address these issues.
Our method achieves SoTA performance on several benchmarks, including MIT-States, C-GQA, and VAW-CZSL.
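For orientation, a generic CZSL scorer rates an image embedding against embeddings composed from attribute and object primitives, so unseen compositions of seen primitives can still be scored. The sketch below is illustrative only, not the CoT architecture; the sizes are the commonly cited MIT-States counts (115 attributes, 245 objects) and are assumptions.

```python
import torch
import torch.nn as nn

# Generic compositional zero-shot scorer (a sketch, not CoT): every candidate
# label is an (attribute, object) pair.
class CompositionScorer(nn.Module):
    def __init__(self, n_attrs: int, n_objs: int, dim: int):
        super().__init__()
        self.attr_emb = nn.Embedding(n_attrs, dim)
        self.obj_emb = nn.Embedding(n_objs, dim)
        self.compose = nn.Linear(2 * dim, dim)  # fuse primitives into one vector

    def forward(self, img_emb, attr_ids, obj_ids):
        pair = torch.cat([self.attr_emb(attr_ids), self.obj_emb(obj_ids)], dim=-1)
        comp = self.compose(pair)      # (n_candidate_pairs, dim)
        return img_emb @ comp.t()      # similarity logits, (n_images, n_pairs)

scorer = CompositionScorer(n_attrs=115, n_objs=245, dim=512)
logits = scorer(torch.randn(4, 512), torch.tensor([0, 1, 2]), torch.tensor([5, 6, 7]))
print(logits.shape)  # torch.Size([4, 3])
```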
arXiv Detail & Related papers (2023-08-08T03:24:21Z)
- Exploiting Semantic Attributes for Transductive Zero-Shot Learning [97.61371730534258]
Zero-shot learning aims to recognize unseen classes by generalizing the relation between visual features and semantic attributes learned from the seen classes.
We present a novel transductive ZSL method that produces semantic attributes of the unseen data and imposes them on the generative process.
Experiments on five standard benchmarks show that our method yields state-of-the-art results for zero-shot learning.
arXiv Detail & Related papers (2023-03-17T09:09:48Z)
- Resolving Semantic Confusions for Improved Zero-Shot Detection [6.72910827751713]
We propose a generative model incorporating a triplet loss that accounts for the degree of dissimilarity between classes.
A cyclic-consistency loss is also enforced to ensure that generated visual samples of a class correspond closely to their own semantics.
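In generic form, the two losses named above read as follows (an assumption-laden sketch, not the authors' exact objectives):

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss; in the spirit of the paper, `margin` could be
    scaled by inter-class dissimilarity so that easily confused classes are
    pushed further apart."""
    d_pos = (anchor - positive).pow(2).sum(dim=1)
    d_neg = (anchor - negative).pow(2).sum(dim=1)
    return F.relu(d_pos - d_neg + margin).mean()

def cycle_consistency_loss(class_semantics, reconstructed_semantics):
    """Penalize generated visual samples whose decoded semantics drift away
    from the semantics they were generated from."""
    return F.mse_loss(reconstructed_semantics, class_semantics)

a, p, n = torch.randn(16, 128), torch.randn(16, 128), torch.randn(16, 128)
print(triplet_loss(a, p, n))
```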
arXiv Detail & Related papers (2022-12-12T18:11:48Z)
- GSMFlow: Generation Shifts Mitigating Flow for Generalized Zero-Shot Learning [55.79997930181418]
Generalized Zero-Shot Learning aims to recognize images from both the seen and unseen classes by transferring semantic knowledge from seen to unseen classes.
A promising solution is to take advantage of generative models to hallucinate realistic unseen samples based on knowledge learned from the seen classes.
We propose a novel flow-based generative framework that consists of multiple conditional affine coupling layers for learning unseen data generation.
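A conditional affine coupling layer of the kind named above can be sketched as follows. This is a generic normalizing-flow building block with assumed dimensions, not GSMFlow's exact layer:

```python
import torch
import torch.nn as nn

class ConditionalAffineCoupling(nn.Module):
    """One conditional affine coupling layer (generic sketch): half of the
    feature vector is scaled/shifted using parameters predicted from the other
    half plus the class condition, keeping the mapping invertible with a cheap
    log-determinant."""
    def __init__(self, feat_dim: int, cond_dim: int, hidden: int = 512):
        super().__init__()
        self.half = feat_dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (feat_dim - self.half)),
        )

    def forward(self, x, cond):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        log_s, t = self.net(torch.cat([x1, cond], dim=1)).chunk(2, dim=1)
        log_s = torch.tanh(log_s)  # bound scales for numerical stability
        y2 = x2 * torch.exp(log_s) + t
        return torch.cat([x1, y2], dim=1), log_s.sum(dim=1)  # output, log|det J|

    def inverse(self, y, cond):
        y1, y2 = y[:, :self.half], y[:, self.half:]
        log_s, t = self.net(torch.cat([y1, cond], dim=1)).chunk(2, dim=1)
        log_s = torch.tanh(log_s)
        return torch.cat([y1, (y2 - t) * torch.exp(-log_s)], dim=1)

layer = ConditionalAffineCoupling(feat_dim=2048, cond_dim=312)
x, cond = torch.randn(4, 2048), torch.randn(4, 312)
y, _ = layer(x, cond)
print(torch.allclose(layer.inverse(y, cond), x, atol=1e-4))  # True
```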
arXiv Detail & Related papers (2022-07-05T04:04:37Z)
- DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning [37.48292304239107]
We present a transformer-based end-to-end ZSL method named DUET.
We develop a cross-modal semantic grounding network to investigate the model's capability of disentangling semantic attributes from the images.
We find that DUET often achieves state-of-the-art performance, that its components are effective, and that its predictions are interpretable.
arXiv Detail & Related papers (2022-07-04T11:12:12Z)
- Cross-modal Representation Learning for Zero-shot Action Recognition [67.57406812235767]
We present a cross-modal Transformer-based framework that jointly encodes video data and text labels for zero-shot action recognition (ZSAR).
Our model employs a conceptually new pipeline by which visual representations are learned in conjunction with visual-semantic associations in an end-to-end manner.
Experimental results show our model considerably improves upon the state of the art in ZSAR, reaching encouraging top-1 accuracy on the UCF101, HMDB51, and ActivityNet benchmark datasets.
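The inference step shared by such cross-modal models can be illustrated as nearest-label retrieval in the joint embedding space. The encoder outputs below are stand-ins; the paper's video and text encoders are not reproduced here:

```python
import torch
import torch.nn.functional as F

def zero_shot_classify(video_emb: torch.Tensor, label_emb: torch.Tensor):
    """Assign each video to the action label with the closest text embedding
    in the shared space; unseen actions only need a text embedding."""
    v = F.normalize(video_emb, dim=1)   # (n_videos, d)
    t = F.normalize(label_emb, dim=1)   # (n_labels, d)
    return (v @ t.t()).argmax(dim=1)    # cosine similarity -> top-1 label

videos = torch.randn(10, 512)   # stand-in for jointly trained video encodings
labels = torch.randn(101, 512)  # e.g., text embeddings of 101 UCF101 classes
print(zero_shot_classify(videos, labels).shape)  # torch.Size([10])
```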
arXiv Detail & Related papers (2022-05-03T17:39:27Z)
- FREE: Feature Refinement for Generalized Zero-Shot Learning [86.41074134041394]
Generalized zero-shot learning (GZSL) has achieved significant progress, with many efforts dedicated to overcoming the problems of visual-semantic domain gap and seen-unseen bias.
Most existing methods directly use feature extraction models trained on ImageNet alone, ignoring the cross-dataset bias between ImageNet and GZSL benchmarks.
We propose a simple yet effective GZSL method, termed feature refinement for generalized zero-shot learning (FREE) to tackle the above problem.
arXiv Detail & Related papers (2021-07-29T08:11:01Z)
- Mitigating Generation Shifts for Generalized Zero-Shot Learning [52.98182124310114]
Generalized Zero-Shot Learning (GZSL) is the task of leveraging semantic information (e.g., attributes) to recognize both seen and unseen samples, where unseen classes are not observable during training.
We propose a novel Generation Shifts Mitigating Flow framework for learning unseen data synthesis efficiently and effectively.
Experimental results demonstrate that GSMFlow achieves state-of-the-art recognition performance in both conventional and generalized zero-shot settings.
arXiv Detail & Related papers (2021-07-07T11:43:59Z)
- Generative Model-driven Structure Aligning Discriminative Embeddings for Transductive Zero-shot Learning [21.181715602603436]
We propose a neural network-based model for learning a projection function which aligns the visual and semantic data in the latent space.
We show superior performance on the standard benchmark datasets AWA1, AWA2, CUB, SUN, and FLO.
We also show the efficacy of our model when labelled data are extremely scarce.
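A bare-bones version of such a projection-and-alignment objective is sketched below. It is illustrative only; the paper's generative, structure-aligning losses are richer, and the dimensions (2048-dim CNN features, 85-dim attribute vectors as in AWA) are assumptions:

```python
import torch
import torch.nn as nn

# Map visual features into the semantic (attribute) space and align them with
# class attribute vectors -- the basic shape of projection-based ZSL.
proj = nn.Sequential(nn.Linear(2048, 1024), nn.ReLU(), nn.Linear(1024, 85))

feats = torch.randn(32, 2048)   # visual features
attrs = torch.randn(32, 85)     # class attribute vector for each sample
loss = nn.functional.mse_loss(proj(feats), attrs)
loss.backward()                 # train the projection to align the two spaces
print(float(loss))
```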
arXiv Detail & Related papers (2020-05-09T18:48:20Z)