Balancing Generalization and Specialization in Zero-shot Learning
- URL: http://arxiv.org/abs/2201.01961v1
- Date: Thu, 6 Jan 2022 08:04:27 GMT
- Title: Balancing Generalization and Specialization in Zero-shot Learning
- Authors: Yun Li, Zhe Liu, Lina Yao, Xiaojun Chang
- Abstract summary: We propose an end-to-end network with balanced generalization and specialization abilities, termed BGSNet, to take advantage of both abilities.
A novel self-adjusting diversity loss is designed to optimize BSNet with less redundancy and more diversity.
Experiments on four benchmark datasets demonstrate our model's effectiveness.
- Score: 80.7530875747194
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Zero-Shot Learning (ZSL) aims to transfer classification capability from seen
to unseen classes. Recent methods have proved that generalization and
specialization are two essential abilities to achieve good performance in ZSL.
However, each of them focuses on only one of these abilities, resulting in
models that are either too general, with degraded classifying ability, or too
specialized to generalize to unseen classes. In this paper, we propose an
end-to-end network with balanced generalization and specialization abilities,
termed BGSNet, which takes advantage of both abilities and balances them at
the instance and dataset levels.
dataset-level. Specifically, BGSNet consists of two branches: the
Generalization Network (GNet), which applies episodic meta-learning to learn
generalized knowledge, and the Balanced Specialization Network (BSNet), which
adopts multiple attentive extractors to extract discriminative features and
fulfill the instance-level balance. A novel self-adjusting diversity loss is
designed to optimize BSNet with less redundancy and more diversity. We further
propose a differentiable dataset-level balance, updating its weights with a
linear annealing schedule to simulate network pruning and thus obtain an
optimal structure for BSNet at low cost while achieving dataset-level balance.
Experiments on four benchmark datasets demonstrate our model's effectiveness.
Extensive component ablations confirm the necessity of integrating the
generalization and specialization abilities.
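No code accompanies this abstract; as a minimal sketch of the self-adjusting diversity loss idea (reducing redundancy across multiple attentive extractors), the snippet below penalizes pairwise similarity between the features each extractor produces. The tensor shapes, the cosine-similarity penalty, and all names are our assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def diversity_loss(features):
    """Penalize pairwise cosine similarity between the features produced
    by multiple attentive extractors, so each extractor attends to a
    different discriminative cue. features: (n_extractors, batch, dim)"""
    f = F.normalize(features, dim=-1)
    sim = torch.einsum('ibd,jbd->ijb', f, f).mean(-1)   # extractor-pair similarity
    n = sim.size(0)
    sim = sim - torch.eye(n)                            # remove self-similarity
    return sim.abs().sum() / (n * (n - 1))              # mean off-diagonal redundancy

loss = diversity_loss(torch.randn(4, 8, 64))  # 4 extractors, batch 8, 64-d features
```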
Related papers
- Class-Balanced and Reinforced Active Learning on Graphs [13.239043161351482]
Graph neural networks (GNNs) have demonstrated significant success in various applications, such as node classification, link prediction, and graph classification.
Active learning for GNNs aims to query the valuable samples from the unlabeled data for annotation to maximize the GNNs' performance at a lower cost.
Most existing algorithms for reinforced active learning in GNNs may lead to a highly imbalanced class distribution, especially in highly skewed class scenarios.
We propose GCBR, a novel class-balanced and reinforced active learning framework for GNNs, which learns an optimal policy to acquire class-balanced and informative nodes.
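As a hedged illustration of how a class-balanced reward might be shaped in such a framework (GCBR's exact reward is not given here), the sketch below combines an informativeness term with a bonus for keeping the labeled-class distribution near uniform; the function name and the KL-based balance term are our assumptions.

```python
import numpy as np

def query_reward(class_counts, pred_entropy, lam=0.5):
    """Hypothetical reward: informativeness (predictive entropy of the
    queried node) plus a balance bonus, -KL(labeled-class distribution
    || uniform), which is zero when classes are perfectly balanced."""
    p = class_counts / class_counts.sum()
    uniform = np.full_like(p, 1.0 / len(p))
    balance = -np.sum(p * np.log((p + 1e-12) / uniform))
    return pred_entropy + lam * balance

print(query_reward(np.array([5., 5., 5.]), pred_entropy=1.0))   # balanced
print(query_reward(np.array([12., 2., 1.]), pred_entropy=1.0))  # skewed, lower
```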
arXiv Detail & Related papers (2024-02-15T16:37:14Z)
- Simplifying Neural Network Training Under Class Imbalance [77.39968702907817]
Real-world datasets are often highly class-imbalanced, which can adversely impact the performance of deep learning models.
The majority of research on training neural networks under class imbalance has focused on specialized loss functions, sampling techniques, or two-stage training procedures.
We demonstrate that simply tuning existing components of standard deep learning pipelines, such as the batch size, data augmentation, and label smoothing, can achieve state-of-the-art performance without any such specialized class imbalance methods.
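A minimal sketch of what "tuning existing components" can look like in a standard PyTorch pipeline; the specific values (label smoothing of 0.1, batch size of 32) are illustrative guesses, not the paper's tuned settings.

```python
import torch
import torch.nn as nn

# Illustrative, not the paper's tuned settings: standard knobs only.
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)   # label smoothing
dataset = torch.utils.data.TensorDataset(
    torch.randn(256, 8), torch.randint(0, 3, (256,)))  # toy stand-in data
loader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)
model = nn.Linear(8, 3)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
for x, y in loader:                                    # one illustrative epoch
    opt.zero_grad()
    criterion(model(x), y).backward()
    opt.step()
```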
arXiv Detail & Related papers (2023-12-05T05:52:44Z)
- Overcoming Recency Bias of Normalization Statistics in Continual Learning: Balance and Adaptation [67.77048565738728]
Continual learning involves learning a sequence of tasks and balancing their knowledge appropriately.
We propose Adaptive Balance of BN (AdaB$^2$N), which incorporates a Bayesian-based strategy to adapt task-wise contributions appropriately.
Our approach achieves significant performance gains across a wide range of benchmarks.
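As a rough, assumption-laden sketch of the underlying idea (not the paper's AdaB$^2$N algorithm), the snippet below blends per-task BatchNorm statistics with task-wise weights; here the weights are a plain softmax over free parameters rather than the Bayesian-based strategy the paper adapts.

```python
import torch

def balanced_bn_stats(task_means, task_vars, logits):
    """Combine per-task BatchNorm statistics with task-wise weights.
    task_means, task_vars: (n_tasks, channels); logits: (n_tasks,)"""
    w = torch.softmax(logits, dim=0).unsqueeze(1)   # (n_tasks, 1) task weights
    mean = (w * task_means).sum(0)
    # variance of the weighted mixture: E[x^2] - (E[x])^2
    var = (w * (task_vars + task_means.pow(2))).sum(0) - mean.pow(2)
    return mean, var

m, v = balanced_bn_stats(torch.randn(3, 16), torch.rand(3, 16), torch.zeros(3))
```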
arXiv Detail & Related papers (2023-10-13T04:50:40Z)
- RAHNet: Retrieval Augmented Hybrid Network for Long-tailed Graph Classification [10.806893809269074]
We propose a novel framework called Retrieval Augmented Hybrid Network (RAHNet) to jointly learn a robust feature extractor and an unbiased classifier.
In the feature extractor training stage, we develop a graph retrieval module to search for relevant graphs that directly enrich the intra-class diversity for the tail classes.
We also innovatively optimize a category-centered supervised contrastive loss to obtain discriminative representations.
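A hypothetical sketch of the retrieval idea: given a tail-class query graph embedding, fetch the most similar graphs from a memory bank restricted to tail classes. The function name, the memory-bank representation, and the use of cosine similarity are our assumptions.

```python
import torch
import torch.nn.functional as F

def retrieve_similar(query, bank, bank_labels, tail_classes, k=5):
    """Fetch the k memory-bank graphs most similar to a tail-class
    query embedding, restricted to tail classes; all names are ours.
    query: (dim,); bank: (n, dim); bank_labels: (n,)"""
    mask = torch.isin(bank_labels, tail_classes)
    cand_idx = mask.nonzero(as_tuple=True)[0]
    sims = F.normalize(bank[cand_idx], dim=1) @ F.normalize(query, dim=0)
    top = sims.topk(min(k, cand_idx.numel())).indices
    return cand_idx[top]                  # indices into the memory bank

idx = retrieve_similar(torch.randn(32), torch.randn(100, 32),
                       torch.randint(0, 10, (100,)), torch.tensor([8, 9]))
```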
arXiv Detail & Related papers (2023-08-04T14:06:44Z)
- Class Balancing GAN with a Classifier in the Loop [58.29090045399214]
We introduce a novel theoretically motivated Class Balancing regularizer for training GANs.
Our regularizer makes use of the knowledge from a pre-trained classifier to ensure balanced learning of all the classes in the dataset.
We demonstrate the utility of our regularizer in learning representations for long-tailed distributions via achieving better performance than existing approaches over multiple datasets.
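As a loose illustration of the class-balancing idea (not the paper's theoretically motivated regularizer), the sketch below takes a classifier's predictions on generated samples and pushes their batch-average class distribution toward uniform by maximizing its entropy.

```python
import torch

def class_balance_reg(logits):
    """Push the classifier's batch-average predicted class distribution
    over generated samples toward uniform by maximizing its entropy.
    A loose stand-in, not the paper's exact regularizer."""
    p = torch.softmax(logits, dim=1).mean(0)   # average class distribution
    entropy = -(p * torch.log(p + 1e-12)).sum()
    return -entropy                            # minimized when p is uniform

reg = class_balance_reg(torch.randn(64, 10))   # logits from a frozen classifier
```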
arXiv Detail & Related papers (2021-06-17T11:41:30Z)
- Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning [23.00300794016583]
State-of-the-art natural language understanding classification models follow a two-stage approach: generic pre-training followed by task-specific fine-tuning.
We propose a supervised contrastive learning (SCL) objective for the fine-tuning stage.
Our proposed fine-tuning objective leads to models that are more robust to different levels of noise in the fine-tuning training data.
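A minimal sketch of a standard supervised contrastive objective of the kind added to the fine-tuning loss, in the style of Khosla et al. (2020); the temperature and batch handling are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def supcon_loss(features, labels, tau=0.1):
    """Supervised contrastive loss: pull same-class embeddings together,
    push different classes apart. features: (batch, dim) L2-normalized;
    labels: (batch,). Temperature tau is an illustrative choice."""
    sim = features @ features.T / tau
    sim.fill_diagonal_(-1e9)                    # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)).float()
    pos.fill_diagonal_(0)
    n_pos = pos.sum(1).clamp(min=1)             # positives per anchor
    return -(pos * log_prob).sum(1).div(n_pos).mean()

feats = F.normalize(torch.randn(16, 128), dim=1)
loss = supcon_loss(feats, torch.randint(0, 4, (16,)))
```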
arXiv Detail & Related papers (2020-11-03T01:10:39Z)
- Uniform Priors for Data-Efficient Transfer [65.086680950871]
We show that features that are most transferable have high uniformity in the embedding space.
We evaluate a uniformity regularization on its ability to facilitate adaptation to unseen tasks and data.
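As a plausible stand-in for the uniformity notion (not necessarily the paper's exact regularizer), the sketch below computes the hypersphere uniformity measure of Wang & Isola (2020): lower values indicate more uniformly spread embeddings.

```python
import torch
import torch.nn.functional as F

def uniformity(x, t=2.0):
    """Hypersphere uniformity measure (Wang & Isola, 2020): the log of
    the mean Gaussian potential over all embedding pairs; lower means
    more uniformly spread embeddings."""
    x = F.normalize(x, dim=1)
    return torch.pdist(x).pow(2).mul(-t).exp().mean().log()

u = uniformity(torch.randn(32, 64))  # regularize by minimizing this value
```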
arXiv Detail & Related papers (2020-06-30T04:39:36Z)
- Generalized Zero-Shot Learning Via Over-Complete Distribution [79.5140590952889]
We propose to generate an Over-Complete Distribution (OCD) using Conditional Variational Autoencoder (CVAE) of both seen and unseen classes.
The effectiveness of the framework is evaluated using both Zero-Shot Learning and Generalized Zero-Shot Learning protocols.
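A minimal sketch of the CVAE backbone such a framework builds on; the layer sizes, attribute dimensionality, and names are illustrative, and the over-complete-distribution sampling itself is omitted.

```python
import torch
import torch.nn as nn

class CVAE(nn.Module):
    """Minimal conditional VAE: generate visual features conditioned on
    class attributes, as in feature-generating ZSL. Sizes illustrative."""
    def __init__(self, feat_dim=2048, attr_dim=312, z_dim=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(feat_dim + attr_dim, 512), nn.ReLU())
        self.mu, self.logvar = nn.Linear(512, z_dim), nn.Linear(512, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim + attr_dim, 512), nn.ReLU(),
                                 nn.Linear(512, feat_dim))

    def forward(self, x, a):
        h = self.enc(torch.cat([x, a], dim=1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.dec(torch.cat([z, a], dim=1)), mu, logvar

x_hat, mu, logvar = CVAE()(torch.randn(4, 2048), torch.randn(4, 312))
```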
arXiv Detail & Related papers (2020-04-01T19:05:28Z)