Are Fewer Labels Possible for Few-shot Learning?
- URL: http://arxiv.org/abs/2012.05899v1
- Date: Thu, 10 Dec 2020 18:59:29 GMT
- Title: Are Fewer Labels Possible for Few-shot Learning?
- Authors: Suichan Li and Dongdong Chen and Yinpeng Chen and Lu Yuan and Lei
Zhang and Qi Chu and Nenghai Yu
- Abstract summary: Few-shot learning is challenging due to its very limited data and labels.
Recent studies in big transfer (BiT) show that few-shot learning can greatly benefit from pretraining on a large-scale labeled dataset in a different domain.
We propose eigen-finetuning to enable fewer-shot learning by leveraging the co-evolution of clustering and eigen-samples in the finetuning.
- Score: 81.89996465197392
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot learning is challenging due to its very limited data and labels.
Recent studies in big transfer (BiT) show that few-shot learning can greatly
benefit from pretraining on a large-scale labeled dataset in a different domain.
This paper asks a more challenging question: "can we use as few as possible
labels for few-shot learning in both pretraining (with no labels) and
fine-tuning (with fewer labels)?".
Our key insight is that the clustering of target samples in the feature space
is all we need for few-shot finetuning. This explains why vanilla
unsupervised pretraining (which yields poor clustering) underperforms supervised pretraining. In
this paper, we propose transductive unsupervised pretraining that achieves a
better clustering by involving target data even though its amount is very
limited. The improved clustering result is of great value for identifying the
most representative samples ("eigen-samples") for users to label, and in
return, continued finetuning with the labeled eigen-samples further improves
the clustering. Thus, we propose eigen-finetuning to enable fewer-shot learning
by leveraging the co-evolution of clustering and eigen-samples in the
finetuning. We conduct experiments on 10 different few-shot target datasets,
and our average few-shot performance outperforms both vanilla inductive
unsupervised transfer and supervised transfer by a large margin. For instance,
when each target category only has 10 labeled samples, the mean accuracy gain
over the above two baselines is 9.2% and 3.42%, respectively.
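The abstract describes, but does not detail, how "eigen-samples" are identified from the clustering result. As a hedged sketch (assuming "most representative" means "closest to a cluster centroid" — the paper's actual criterion may differ), the selection step might look like this, with `eigen_samples` a hypothetical helper name:

```python
import numpy as np

def eigen_samples(features, k, iters=20):
    """Cluster features with a tiny k-means, then return, for each
    cluster, the index of the sample closest to its centroid — a
    plausible stand-in for the paper's "eigen-sample" selection."""
    # farthest-point init keeps the starting centroids apart
    centroids = [features[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(features - c, axis=1) for c in centroids], axis=0)
        centroids.append(features[d.argmax()])
    centroids = np.array(centroids)
    for _ in range(iters):
        # assignment step: nearest centroid per sample
        dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        # update step: recompute each centroid from its members
        for j in range(k):
            members = features[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=-1)
    return [int(dists[:, j].argmin()) for j in range(k)]

# toy run: two well-separated blobs; one labeling candidate per blob
rng = np.random.default_rng(1)
feats = np.vstack([rng.normal(0.0, 0.1, (20, 2)),
                   rng.normal(5.0, 0.1, (20, 2))])
picks = eigen_samples(feats, k=2)
```

In the paper's loop, the picked samples would be sent to users for labeling, the model finetuned on them, and the clustering (and hence the picks) recomputed.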
Related papers
- Liberating Seen Classes: Boosting Few-Shot and Zero-Shot Text Classification via Anchor Generation and Classification Reframing [38.84431954053434]
Few-shot and zero-shot text classification aim to recognize samples from novel classes with limited labeled samples or no labeled samples at all.
We propose a simple and effective strategy for few-shot and zero-shot text classification.
arXiv Detail & Related papers (2024-05-06T15:38:32Z)
- Improving Contrastive Learning on Imbalanced Seed Data via Open-World Sampling [96.8742582581744]
We present an open-world unlabeled data sampling framework called Model-Aware K-center (MAK)
MAK follows three simple principles: tailness, proximity, and diversity.
We demonstrate that MAK can consistently improve both the overall representation quality and the class balancedness of the learned features.
arXiv Detail & Related papers (2021-11-01T15:09:41Z)
- Improve Unsupervised Pretraining for Few-label Transfer [80.58625921631506]
In this paper, we find this conclusion may not hold when the target dataset has very few labeled samples for finetuning.
We propose a new progressive few-label transfer algorithm for real applications.
arXiv Detail & Related papers (2021-07-26T17:59:56Z)
- Neighborhood Contrastive Learning for Novel Class Discovery [79.14767688903028]
We build a new framework, named Neighborhood Contrastive Learning, to learn discriminative representations that are important to clustering performance.
We experimentally demonstrate that these two ingredients significantly contribute to clustering performance and lead our model to outperform state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2021-06-20T17:34:55Z)
- Boosting the Performance of Semi-Supervised Learning with Unsupervised Clustering [10.033658645311188]
We show that ignoring labels altogether for whole epochs intermittently during training can significantly improve performance in the small sample regime.
We demonstrate our method's efficacy in boosting several state-of-the-art SSL algorithms.
arXiv Detail & Related papers (2020-12-01T14:19:14Z)
- Shot in the Dark: Few-Shot Learning with No Base-Class Labels [32.96824710484196]
We show that off-the-shelf self-supervised learning outperforms transductive few-shot methods by 3.9% for 5-shot accuracy on miniImageNet.
This motivates us to examine more carefully the role of features learned through self-supervision in few-shot learning.
arXiv Detail & Related papers (2020-10-06T02:05:27Z)
- Few-Shot Learning with Intra-Class Knowledge Transfer [100.87659529592223]
We consider the few-shot classification task with an unbalanced dataset.
Recent works have proposed to solve this task by augmenting the training data of the few-shot classes using generative models.
We propose to leverage the intra-class knowledge from the neighbor many-shot classes with the intuition that neighbor classes share similar statistical information.
arXiv Detail & Related papers (2020-08-22T18:15:38Z)
- Uncertainty-aware Self-training for Text Classification with Few Labels [54.13279574908808]
We study self-training as one of the earliest semi-supervised learning approaches to reduce the annotation bottleneck.
We propose an approach to improve self-training by incorporating uncertainty estimates of the underlying neural network.
We show our methods leveraging only 20-30 labeled samples per class for each task for training and for validation can perform within 3% of fully supervised pre-trained language models.
arXiv Detail & Related papers (2020-06-27T08:13:58Z)
- Instance Credibility Inference for Few-Shot Learning [45.577880041135785]
Few-shot learning aims to recognize new objects with extremely limited training data for each category.
This paper presents a simple statistical approach, dubbed Instance Credibility Inference (ICI) to exploit the distribution support of unlabeled instances for few-shot learning.
Our simple approach can establish new state-of-the-arts on four widely used few-shot learning benchmark datasets.
arXiv Detail & Related papers (2020-03-26T12:01:15Z)
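Several entries above (Instance Credibility Inference, uncertainty-aware self-training) share one pattern: rank unlabeled instances by how trustworthy their pseudo-labels look and absorb only the most credible ones into training. As a generic stand-in — the real methods use sparsity-regularized regression or network uncertainty estimates, not the simple entropy score assumed here — the selection step can be sketched as:

```python
import math

def entropy(probs):
    """Predictive entropy: low means a confident (credible) prediction."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def most_credible(unlabeled, budget):
    """Rank unlabeled instances by ascending entropy of their predicted
    class distribution; return the `budget` most credible ones, each
    paired with its argmax pseudo-label."""
    ranked = sorted(unlabeled, key=lambda item: entropy(item[1]))
    return [(x, max(range(len(p)), key=p.__getitem__)) for x, p in ranked[:budget]]

# hypothetical model outputs over two classes
preds = [("img_a", [0.97, 0.03]),   # confident -> credible
         ("img_b", [0.55, 0.45]),   # uncertain -> skipped
         ("img_c", [0.10, 0.90])]
selected = most_credible(preds, budget=2)
```

Retraining on `selected` and re-scoring the remaining pool would give the iterative self-training loop these papers build on.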
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.