How to trust unlabeled data? Instance Credibility Inference for Few-Shot
Learning
- URL: http://arxiv.org/abs/2007.08461v4
- Date: Tue, 11 May 2021 03:21:15 GMT
- Title: How to trust unlabeled data? Instance Credibility Inference for Few-Shot
Learning
- Authors: Yikai Wang, Li Zhang, Yuan Yao, Yanwei Fu
- Abstract summary: This paper presents a statistical approach, dubbed Instance Credibility Inference (ICI) to exploit the support of unlabeled instances for few-shot visual recognition.
We rank the credibility of pseudo-labeled instances along the regularization path of their corresponding incidental parameters, and the most trustworthy pseudo-labeled examples are preserved as the augmented labeled instances.
- Score: 47.21354101796544
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning based models have excelled in many computer vision
tasks and appear to surpass human performance. However, these models require
an avalanche of expensive, human-labeled training data and many iterations to
train their large number of parameters. This severely limits their scalability
to real-world long-tail distributed categories, some of which have abundant
instances but only a few manual annotations. Learning from such extremely
limited labeled examples is known as few-shot learning (FSL).
Unlike prior art that leverages meta-learning or data augmentation to
alleviate this extreme data scarcity, this paper presents a statistical
approach, dubbed Instance Credibility Inference (ICI), to exploit the support
of unlabeled instances for few-shot visual recognition. Specifically, we
repurpose the self-taught learning paradigm: an initial classifier trained on
the few labeled shots predicts pseudo-labels for unlabeled instances, and the
most credible ones are selected to augment the training set and re-train the
classifier. This is achieved by constructing a (Generalized) Linear Model
(LM/GLM) with incidental parameters to model the mapping from (un-)labeled
features to their (pseudo-)labels, in which the sparsity of the incidental
parameters indicates the credibility of the corresponding pseudo-labeled
instance. We rank the credibility of pseudo-labeled instances along the
regularization path of their corresponding incidental parameters, and the most
trustworthy pseudo-labeled examples are preserved as the augmented labeled
instances. Theoretically, under mild conditions of restricted eigenvalue,
irrepresentability, and large error, our approach is guaranteed to collect all
the correctly-predicted instances from the noisy pseudo-labeled set.
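The pipeline above is concrete enough to sketch in code. Below is a minimal, illustrative Python sketch, not the authors' released implementation: it uses a plain per-class L1 penalty on the incidental parameters (the paper's formulation uses a group-sparse penalty across classes), and the toy data, the `ici_rank` name, and the keep-half heuristic are all invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, lasso_path

def ici_rank(features, labels_onehot):
    """Rank (pseudo-)labeled instances by credibility, ICI-style sketch.

    Model: Y = X beta + gamma + noise, with an L1 penalty on the
    per-instance incidental parameters gamma. Partialling beta out
    (multiplying by the residual-maker of X) leaves a lasso in gamma;
    instances whose gamma stays zero until small penalties are better
    explained by the linear model alone, i.e. more credible.
    """
    X = np.asarray(features, dtype=float)
    Y = np.asarray(labels_onehot, dtype=float)
    n = X.shape[0]
    # Residual-maker M = I - X (X^T X)^+ X^T projects out the span of X.
    M = np.eye(n) - X @ np.linalg.pinv(X.T @ X) @ X.T
    activation = np.zeros(n)
    for j in range(Y.shape[1]):  # one lasso path per class column
        alphas, coefs, _ = lasso_path(M, M @ Y[:, j])  # coefs: (n, n_alphas)
        nonzero = np.abs(coefs) > 1e-12
        first = nonzero.argmax(axis=1)  # path index where gamma_i activates
        act = np.where(nonzero.any(axis=1), alphas[first], 0.0)
        activation = np.maximum(activation, act)  # alphas are decreasing
    return np.argsort(activation)  # ascending: most credible instances first

# Usage sketch: one self-taught round on toy data (all names illustrative).
rng = np.random.default_rng(0)
n_classes, dim = 3, 16
centers = rng.normal(size=(n_classes, dim))
X_support = np.vstack([c + 0.3 * rng.normal(size=(5, dim)) for c in centers])
y_support = np.repeat(np.arange(n_classes), 5)
X_query = np.vstack([c + 0.3 * rng.normal(size=(20, dim)) for c in centers])

clf = LogisticRegression(max_iter=1000).fit(X_support, y_support)
pseudo = clf.predict(X_query)                      # pseudo-labels
order = ici_rank(X_query, np.eye(n_classes)[pseudo])
keep = order[: len(order) // 2]                    # most trustworthy half
clf = LogisticRegression(max_iter=1000).fit(       # re-train, augmented
    np.vstack([X_support, X_query[keep]]),
    np.concatenate([y_support, pseudo[keep]]),
)
```

In the paper, this select-and-re-train round is iterated rather than run once, growing the trusted set along the regularization path instead of keeping a fixed fraction.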
Related papers
- InstanT: Semi-supervised Learning with Instance-dependent Thresholds [75.91684890150283]
We propose the study of instance-dependent thresholds, which have the highest degree of freedom compared with existing methods.
We devise a novel instance-dependent threshold function for all unlabeled instances by utilizing their instance-level ambiguity and the instance-dependent error rates of pseudo-labels.
arXiv Detail & Related papers (2023-10-29T05:31:43Z)
- Boosting Semi-Supervised Learning by bridging high and low-confidence predictions [4.18804572788063]
Pseudo-labeling is a crucial technique in semi-supervised learning (SSL).
We propose a new method called ReFixMatch, which aims to utilize all of the unlabeled data during training.
arXiv Detail & Related papers (2023-08-15T00:27:18Z)
- Trustable Co-label Learning from Multiple Noisy Annotators [68.59187658490804]
Supervised deep learning depends on massive accurately annotated examples.
A typical alternative is learning from multiple noisy annotators.
This paper proposes a data-efficient approach, called Trustable Co-label Learning (TCL).
arXiv Detail & Related papers (2022-03-08T16:57:00Z)
- Multi-class Probabilistic Bounds for Self-learning [13.875239300089861]
Pseudo-labeling is prone to error and runs the risk of adding noisy labels into unlabeled training data.
We present a probabilistic framework for analyzing self-learning in the multi-class classification scenario with partially labeled data.
arXiv Detail & Related papers (2021-09-29T13:57:37Z)
- Few-shot Learning via Dependency Maximization and Instance Discriminant Analysis [21.8311401851523]
We study the few-shot learning problem, where a model learns to recognize new objects with extremely few labeled data per category.
We propose a simple approach to exploit unlabeled data accompanying the few-shot task for improving few-shot performance.
arXiv Detail & Related papers (2021-09-07T02:19:01Z)
- Dash: Semi-Supervised Learning with Dynamic Thresholding [72.74339790209531]
We propose a semi-supervised learning (SSL) approach that uses unlabeled examples to train models.
Our proposed approach, Dash, adaptively selects which unlabeled examples to train on.
arXiv Detail & Related papers (2021-09-01T23:52:29Z)
- Self-Tuning for Data-Efficient Deep Learning [75.34320911480008]
Self-Tuning is a novel approach to enable data-efficient deep learning.
It unifies the exploration of labeled and unlabeled data and the transfer of a pre-trained model.
It outperforms its SSL and TL counterparts on five tasks by sharp margins.
arXiv Detail & Related papers (2021-02-25T14:56:19Z)
- Instance Credibility Inference for Few-Shot Learning [45.577880041135785]
Few-shot learning aims to recognize new objects with extremely limited training data for each category.
This paper presents a simple statistical approach, dubbed Instance Credibility Inference (ICI) to exploit the distribution support of unlabeled instances for few-shot learning.
Our simple approach establishes a new state of the art on four widely used few-shot learning benchmark datasets.
arXiv Detail & Related papers (2020-03-26T12:01:15Z)