Mitigating Generation Shifts for Generalized Zero-Shot Learning
- URL: http://arxiv.org/abs/2107.03163v1
- Date: Wed, 7 Jul 2021 11:43:59 GMT
- Title: Mitigating Generation Shifts for Generalized Zero-Shot Learning
- Authors: Zhi Chen, Yadan Luo, Sen Wang, Ruihong Qiu, Jingjing Li, Zi Huang
- Abstract summary: Generalized Zero-Shot Learning (GZSL) is the task of leveraging semantic information (e.g., attributes) to recognize the seen and unseen samples, where unseen classes are not observable during training.
We propose a novel Generation Shifts Mitigating Flow framework for learning unseen data synthesis efficiently and effectively.
Experimental results demonstrate that GSMFlow achieves state-of-the-art recognition performance in both conventional and generalized zero-shot settings.
- Score: 52.98182124310114
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generalized Zero-Shot Learning (GZSL) is the task of leveraging semantic
information (e.g., attributes) to recognize the seen and unseen samples, where
unseen classes are not observable during training. It is natural to derive
generative models and hallucinate training samples for unseen classes based on
the knowledge learned from the seen samples. However, most of these models
suffer from "generation shifts", where the synthesized samples may drift
from the real distribution of unseen data. In this paper, we conduct an
in-depth analysis on this issue and propose a novel Generation Shifts
Mitigating Flow (GSMFlow) framework, which is comprised of multiple conditional
affine coupling layers for learning unseen data synthesis efficiently and
effectively. In particular, we identify three potential problems that trigger
the generation shifts, i.e., semantic inconsistency, variance decay, and
structural permutation and address them respectively. First, to reinforce the
correlations between the generated samples and the respective attributes, we
explicitly embed the semantic information into the transformations in each of
the coupling layers. Second, to recover the intrinsic variance of the
synthesized unseen features, we introduce a visual perturbation strategy to
diversify the intra-class variance of generated data and hereby help adjust the
decision boundary of the classifier. Third, to avoid structural permutation in
the semantic space, we propose a relative positioning strategy to manipulate
the attribute embeddings, guiding them to fully preserve the inter-class
geometric structure. Experimental results demonstrate that GSMFlow achieves
state-of-the-art recognition performance in both conventional and generalized
zero-shot settings. Our code is available at:
https://github.com/uqzhichen/GSMFlow
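The abstract's core component is a conditional affine coupling layer in which the semantic (attribute) information is embedded into each transformation. The following is a minimal NumPy sketch of that idea; the tiny one-hidden-layer network, the dimensions, and all variable names are illustrative assumptions, not the paper's actual architecture (see the linked repository for the real implementation).

```python
# Sketch of a semantically-conditioned affine coupling layer: the second half
# of the feature vector is scaled and shifted by quantities computed from the
# first half AND the class attributes, so the mapping is exactly invertible.
import numpy as np

rng = np.random.default_rng(0)

D, A, H = 8, 4, 16  # feature dim (even), attribute dim, hidden dim (assumed)
W1 = rng.normal(0, 0.1, (D // 2 + A, H))
W2 = rng.normal(0, 0.1, (H, D))  # outputs log-scale and shift, D//2 each

def coupling_forward(x, attr):
    """x1 passes through unchanged; x2 is affinely transformed,
    conditioned on x1 concatenated with the attribute vector."""
    x1, x2 = x[: D // 2], x[D // 2 :]
    h = np.tanh(np.concatenate([x1, attr]) @ W1) @ W2
    log_s, t = h[: D // 2], h[D // 2 :]
    return np.concatenate([x1, x2 * np.exp(log_s) + t])

def coupling_inverse(y, attr):
    """Exact inverse: recompute log_s and t from the untouched half y1."""
    y1, y2 = y[: D // 2], y[D // 2 :]
    h = np.tanh(np.concatenate([y1, attr]) @ W1) @ W2
    log_s, t = h[: D // 2], h[D // 2 :]
    return np.concatenate([y1, (y2 - t) * np.exp(-log_s)])

x = rng.normal(size=D)       # a visual feature vector
a = rng.normal(size=A)       # a class attribute vector
y = coupling_forward(x, a)
x_rec = coupling_inverse(y, a)
assert np.allclose(x, x_rec)  # invertibility holds up to float precision
```

In the paper's framework several such layers are stacked, with the attribute embedding injected into every one of them; the visual perturbation and relative positioning strategies described above operate on the features and attribute embeddings around this invertible backbone.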
Related papers
- Accurate Explanation Model for Image Classifiers using Class Association Embedding [5.378105759529487]
We propose a generative explanation model that combines the advantages of global and local knowledge.
Class association embedding (CAE) encodes each sample into a pair of separated class-associated and individual codes.
A building-block coherency feature extraction algorithm is proposed that efficiently separates class-associated features from individual ones.
arXiv Detail & Related papers (2024-06-12T07:41:00Z)
- SEER-ZSL: Semantic Encoder-Enhanced Representations for Generalized Zero-Shot Learning [0.7420433640907689]
Generalized Zero-Shot Learning (GZSL) recognizes unseen classes by transferring knowledge from the seen classes.
This paper introduces a dual strategy to address the generalization gap.
arXiv Detail & Related papers (2023-12-20T15:18:51Z)
- Synthetic-to-Real Domain Generalized Semantic Segmentation for 3D Indoor Point Clouds [69.64240235315864]
This paper introduces the synthetic-to-real domain generalization setting to this task.
The domain gap between synthetic and real-world point cloud data mainly lies in the different layouts and point patterns.
Experiments on the synthetic-to-real benchmark demonstrate that both CINMix and multi-prototypes can narrow the distribution gap.
arXiv Detail & Related papers (2022-12-09T05:07:43Z)
- Intra-class Adaptive Augmentation with Neighbor Correction for Deep Metric Learning [99.14132861655223]
We propose a novel intra-class adaptive augmentation (IAA) framework for deep metric learning.
We reasonably estimate intra-class variations for every class and generate adaptive synthetic samples to support hard samples mining.
Our method significantly outperforms the state-of-the-art methods, improving retrieval performance by 3%-6%.
arXiv Detail & Related papers (2022-11-29T14:52:38Z)
- GSMFlow: Generation Shifts Mitigating Flow for Generalized Zero-Shot Learning [55.79997930181418]
Generalized Zero-Shot Learning aims to recognize images from both the seen and unseen classes by transferring semantic knowledge from seen to unseen classes.
It is a promising solution to take advantage of generative models to hallucinate realistic unseen samples based on the knowledge learned from the seen classes.
We propose a novel flow-based generative framework that consists of multiple conditional affine coupling layers for learning unseen data generation.
arXiv Detail & Related papers (2022-07-05T04:04:37Z)
- Latent-Insensitive Autoencoders for Anomaly Detection and Class-Incremental Learning [0.0]
We introduce Latent-Insensitive Autoencoder (LIS-AE) where unlabeled data from a similar domain is utilized as negative examples to shape the latent layer (bottleneck) of a regular autoencoder.
We treat class-incremental learning as multiple anomaly detection tasks by adding a different latent layer for each class and using the other available classes in each task as negative examples to shape each latent layer.
arXiv Detail & Related papers (2021-10-25T16:53:49Z)
- Understanding Self-supervised Learning with Dual Deep Networks [74.92916579635336]
We propose a novel framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks.
We prove that in each SGD update of SimCLR with various loss functions, the weights at each layer are updated by a covariance operator.
To further study what role the covariance operator plays and which features are learned in such a process, we model data generation and augmentation processes through a hierarchical latent tree model (HLTM).
arXiv Detail & Related papers (2020-10-01T17:51:49Z)
- Latent Embedding Feedback and Discriminative Features for Zero-Shot Classification [139.44681304276]
Zero-shot learning aims to classify unseen categories for which no data is available during training.
Generative Adversarial Networks synthesize unseen class features by leveraging class-specific semantic embeddings.
We propose to enforce semantic consistency at all stages of zero-shot learning: training, feature synthesis and classification.
arXiv Detail & Related papers (2020-03-17T17:34:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.