Liberating Seen Classes: Boosting Few-Shot and Zero-Shot Text Classification via Anchor Generation and Classification Reframing
- URL: http://arxiv.org/abs/2405.03565v1
- Date: Mon, 6 May 2024 15:38:32 GMT
- Title: Liberating Seen Classes: Boosting Few-Shot and Zero-Shot Text Classification via Anchor Generation and Classification Reframing
- Authors: Han Liu, Siyang Zhao, Xiaotong Zhang, Feng Zhang, Wei Wang, Fenglong Ma, Hongyang Chen, Hong Yu, Xianchao Zhang,
- Abstract summary: Few-shot and zero-shot text classification aim to recognize samples from novel classes with limited labeled samples or no labeled samples at all.
We propose a simple and effective strategy for few-shot and zero-shot text classification.
- Score: 38.84431954053434
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot and zero-shot text classification aim to recognize samples from novel classes with limited labeled samples or no labeled samples at all. While prevailing methods have shown promising performance via transferring knowledge from seen classes to unseen classes, they are still limited by (1) Inherent dissimilarities among classes make the transformation of features learned from seen classes to unseen classes both difficult and inefficient. (2) Rare labeled novel samples usually cannot provide enough supervision signals to enable the model to adjust from the source distribution to the target distribution, especially for complicated scenarios. To alleviate the above issues, we propose a simple and effective strategy for few-shot and zero-shot text classification. We aim to liberate the model from the confines of seen classes, thereby enabling it to predict unseen categories without the necessity of training on seen classes. Specifically, for mining more related unseen category knowledge, we utilize a large pre-trained language model to generate pseudo novel samples, and select the most representative ones as category anchors. After that, we convert the multi-class classification task into a binary classification task and use the similarities of query-anchor pairs for prediction to fully leverage the limited supervision signals. Extensive experiments on six widely used public datasets show that our proposed method can outperform other strong baselines significantly in few-shot and zero-shot tasks, even without using any seen class samples.
Related papers
- Boosting Few-Shot Text Classification via Distribution Estimation [38.99459686893034]
We propose two simple yet effective strategies to estimate the distributions of the novel classes by utilizing unlabeled query samples.
Specifically, we first assume a class or sample follows the Gaussian distribution, and use the original support set and the nearest few query samples.
Then, we augment the labeled samples by sampling from the estimated distribution, which can provide sufficient supervision for training the classification model.
arXiv Detail & Related papers (2023-03-26T05:58:39Z) - Generalization Bounds for Few-Shot Transfer Learning with Pretrained
Classifiers [26.844410679685424]
We study the ability of foundation models to learn representations for classification that are transferable to new, unseen classes.
We show that the few-shot error of the learned feature map on new classes is small in case of class-feature-variability collapse.
arXiv Detail & Related papers (2022-12-23T18:46:05Z) - Are Fewer Labels Possible for Few-shot Learning? [81.89996465197392]
Few-shot learning is challenging due to its very limited data and labels.
Recent studies in big transfer (BiT) show that few-shot learning can greatly benefit from pretraining on large scale labeled dataset in a different domain.
We propose eigen-finetuning to enable fewer shot learning by leveraging the co-evolution of clustering and eigen-samples in the finetuning.
arXiv Detail & Related papers (2020-12-10T18:59:29Z) - Few-Shot Learning with Intra-Class Knowledge Transfer [100.87659529592223]
We consider the few-shot classification task with an unbalanced dataset.
Recent works have proposed to solve this task by augmenting the training data of the few-shot classes using generative models.
We propose to leverage the intra-class knowledge from the neighbor many-shot classes with the intuition that neighbor classes share similar statistical information.
arXiv Detail & Related papers (2020-08-22T18:15:38Z) - Cooperative Bi-path Metric for Few-shot Learning [50.98891758059389]
We make two contributions to investigate the few-shot classification problem.
We report a simple and effective baseline trained on base classes in the way of traditional supervised learning.
We propose a cooperative bi-path metric for classification, which leverages the correlations between base classes and novel classes to further improve the accuracy.
arXiv Detail & Related papers (2020-08-10T11:28:52Z) - Few-shot Classification via Adaptive Attention [93.06105498633492]
We propose a novel few-shot learning method via optimizing and fast adapting the query sample representation based on very few reference samples.
As demonstrated experimentally, the proposed model achieves state-of-the-art classification results on various benchmark few-shot classification and fine-grained recognition datasets.
arXiv Detail & Related papers (2020-08-06T05:52:59Z) - Towards Cross-Granularity Few-Shot Learning: Coarse-to-Fine
Pseudo-Labeling with Visual-Semantic Meta-Embedding [13.063136901934865]
Few-shot learning aims at rapidly adapting to novel categories with only a handful of samples at test time.
In this paper, we advance the few-shot classification paradigm towards a more challenging scenario, i.e., cross-granularity few-shot classification.
We approximate the fine-grained data distribution by greedy clustering of each coarse-class into pseudo-fine-classes according to the similarity of image embeddings.
arXiv Detail & Related papers (2020-07-11T03:44:21Z) - UniT: Unified Knowledge Transfer for Any-shot Object Detection and
Segmentation [52.487469544343305]
Methods for object detection and segmentation rely on large scale instance-level annotations for training.
We propose an intuitive and unified semi-supervised model that is applicable to a range of supervision.
arXiv Detail & Related papers (2020-06-12T22:45:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.