Not All Unlabeled Data are Equal: Learning to Weight Data in
Semi-supervised Learning
- URL: http://arxiv.org/abs/2007.01293v2
- Date: Thu, 29 Oct 2020 04:29:54 GMT
- Title: Not All Unlabeled Data are Equal: Learning to Weight Data in
Semi-supervised Learning
- Authors: Zhongzheng Ren, Raymond A. Yeh, Alexander G. Schwing
- Abstract summary: We show how to use a different weight for every unlabeled example.
We adjust those weights via an algorithm based on the influence function.
We demonstrate that this technique outperforms state-of-the-art methods on semi-supervised image and language classification tasks.
- Score: 135.89676456312247
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing semi-supervised learning (SSL) algorithms use a single weight to
balance the loss of labeled and unlabeled examples, i.e., all unlabeled
examples are equally weighted. But not all unlabeled data are equal. In this
paper we study how to use a different weight for every unlabeled example.
Manual tuning of all those weights -- as done in prior work -- is no longer
possible. Instead, we adjust those weights via an algorithm based on the
influence function, a measure of a model's dependency on one training example.
To make the approach efficient, we propose a fast and effective approximation
of the influence function. We demonstrate that this technique outperforms
state-of-the-art methods on semi-supervised image and language classification
tasks.
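The abstract outlines per-example weights for unlabeled data that are adjusted with an influence-function-based estimate of how each example affects performance on labeled data. Below is a minimal, self-contained sketch of that idea, not the authors' implementation: it assumes PyTorch, uses a first-order gradient-alignment proxy that drops the Hessian term of the true influence function, and stands in a pseudo-label loss for whatever unlabeled objective an SSL method would use; the model, data, and hyper-parameters are toy placeholders.

```python
# Illustrative sketch only: per-example weights on unlabeled data, updated via a
# first-order influence proxy (gradient alignment with a labeled/validation loss).
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

model = nn.Linear(10, 3)                       # toy classifier
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x_val, y_val = torch.randn(32, 10), torch.randint(0, 3, (32,))  # labeled/validation set
x_unl = torch.randn(64, 10)                                     # "unlabeled" examples
pseudo_y = torch.randint(0, 3, (64,))          # stand-in for pseudo-labels / SSL targets

# One learnable weight per unlabeled example (the quantity the paper adjusts).
log_w = torch.zeros(64, requires_grad=True)
w_opt = torch.optim.SGD([log_w], lr=1.0)

for step in range(100):
    # Weighted unlabeled loss.
    w = torch.sigmoid(log_w)
    losses_unl = F.cross_entropy(model(x_unl), pseudo_y, reduction="none")
    loss_unl = (w * losses_unl).mean()

    # Gradient of the weighted unlabeled loss w.r.t. model parameters,
    # keeping the graph so we can later differentiate w.r.t. log_w.
    grads_unl = torch.autograd.grad(loss_unl, model.parameters(), create_graph=True)

    # Gradient of the labeled/validation loss w.r.t. model parameters.
    loss_val = F.cross_entropy(model(x_val), y_val)
    grads_val = torch.autograd.grad(loss_val, model.parameters())

    # First-order influence proxy: alignment between the two gradients
    # (the Hessian term of the true influence function is dropped here).
    influence = sum((gu * gv).sum() for gu, gv in zip(grads_unl, grads_val))

    # Increase the weights of unlabeled examples whose gradients help the labeled loss.
    w_opt.zero_grad()
    (-influence).backward()
    w_opt.step()

    # Standard model update on the re-weighted unlabeled loss.
    opt.zero_grad()
    w = torch.sigmoid(log_w).detach()
    losses_unl = F.cross_entropy(model(x_unl), pseudo_y, reduction="none")
    (w * losses_unl).mean().backward()
    opt.step()
```

In practice the labeled loss would also enter the model update and the influence estimate would follow the paper's fast approximation; the sketch only illustrates how per-example weights can be optimized alongside the model.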
Related papers
- Boosting Semi-Supervised Learning by bridging high and low-confidence
predictions [4.18804572788063]
Pseudo-labeling is a crucial technique in semi-supervised learning (SSL)
We propose a new method called ReFixMatch, which aims to utilize all of the unlabeled data during training.
arXiv Detail & Related papers (2023-08-15T00:27:18Z)
- Soft Curriculum for Learning Conditional GANs with Noisy-Labeled and
Uncurated Unlabeled Data [70.25049762295193]
We introduce a novel conditional image generation framework that accepts noisy-labeled and uncurated data during training.
We propose soft curriculum learning, which assigns instance-wise weights for adversarial training while assigning new labels for unlabeled data.
Our experiments show that our approach outperforms existing semi-supervised and label-noise robust methods in terms of both quantitative and qualitative performance.
arXiv Detail & Related papers (2023-07-17T08:31:59Z)
- CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep
Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance.
Sample re-weighting methods are popularly used to alleviate this data bias issue.
We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z)
- Dash: Semi-Supervised Learning with Dynamic Thresholding [72.74339790209531]
We propose a semi-supervised learning (SSL) approach that decides which unlabeled examples to train on using a dynamically adjusted threshold.
Our proposed approach, Dash, adapts its selection of unlabeled data as training progresses.
arXiv Detail & Related papers (2021-09-01T23:52:29Z)
- Attentional-Biased Stochastic Gradient Descent [74.49926199036481]
We present a provable method (named ABSGD) for addressing the data imbalance or label noise problem in deep learning.
Our method is a simple modification of momentum SGD in which we assign an individual importance weight to each sample in the mini-batch (see the sketch after this list).
ABSGD is flexible enough to combine with other robust losses without any additional cost.
arXiv Detail & Related papers (2020-12-13T03:41:52Z)
- Unsupervised Deep Metric Learning via Orthogonality based Probabilistic
Loss [27.955068939695042]
Existing state-of-the-art metric learning approaches require class labels to learn a metric.
We propose an unsupervised approach that learns a metric without making use of class labels.
Pseudo-labels are used to form triplets of examples, which guide the metric learning.
arXiv Detail & Related papers (2020-08-22T17:13:33Z)
- Don't Wait, Just Weight: Improving Unsupervised Representations by
Learning Goal-Driven Instance Weights [92.16372657233394]
Self-supervised learning techniques can boost performance by learning useful representations from unlabelled data.
We show that by learning Bayesian instance weights for the unlabelled data, we can improve the downstream classification accuracy.
Our method, BetaDataWeighter is evaluated using the popular self-supervised rotation prediction task on STL-10 and Visual Decathlon.
arXiv Detail & Related papers (2020-06-22T15:59:32Z)
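The ABSGD entry above mentions assigning an individual importance weight to each sample in a momentum-SGD mini-batch. The sketch below illustrates one common form of that idea, assuming the weights are a temperature-scaled softmax over the detached per-sample losses, so harder examples contribute more to the update; the exact weighting rule and hyper-parameters in that paper may differ.

```python
# Hypothetical sketch of per-sample importance weighting inside a mini-batch.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(10, 3)
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
lam = 1.0  # temperature: larger values move the weights toward uniform

def weighted_step(x, y):
    losses = F.cross_entropy(model(x), y, reduction="none")
    # Per-sample weights that emphasize high-loss examples; detached so the
    # weights are treated as constants during backpropagation.
    w = torch.softmax(losses.detach() / lam, dim=0)
    loss = (w * losses).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Toy usage on random data.
x, y = torch.randn(16, 10), torch.randint(0, 3, (16,))
print(weighted_step(x, y))
```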
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.