Not All Unlabeled Data are Equal: Learning to Weight Data in
Semi-supervised Learning
- URL: http://arxiv.org/abs/2007.01293v2
- Date: Thu, 29 Oct 2020 04:29:54 GMT
- Title: Not All Unlabeled Data are Equal: Learning to Weight Data in
Semi-supervised Learning
- Authors: Zhongzheng Ren, Raymond A. Yeh, Alexander G. Schwing
- Abstract summary: We show how to use a different weight for every unlabeled example.
We adjust those weights via an algorithm based on the influence function.
We demonstrate that this technique outperforms state-of-the-art methods on semi-supervised image and language classification tasks.
- Score: 135.89676456312247
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing semi-supervised learning (SSL) algorithms use a single weight to
balance the loss of labeled and unlabeled examples, i.e., all unlabeled
examples are equally weighted. But not all unlabeled data are equal. In this
paper we study how to use a different weight for every unlabeled example.
Manual tuning of all those weights -- as done in prior work -- is no longer
possible. Instead, we adjust those weights via an algorithm based on the
influence function, a measure of a model's dependency on one training example.
To make the approach efficient, we propose a fast and effective approximation
of the influence function. We demonstrate that this technique outperforms
state-of-the-art methods on semi-supervised image and language classification
tasks.
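The abstract outlines per-example weights for unlabeled data that are adjusted with an influence-function-based estimate of how each example affects performance on labeled data. Below is a minimal, self-contained sketch of that idea, not the authors' implementation: it assumes PyTorch, uses a first-order gradient-alignment proxy that drops the Hessian term of the true influence function, and stands in a pseudo-label loss for whatever unlabeled objective an SSL method would use; the model, data, and hyper-parameters are toy placeholders.

```python
# Illustrative sketch only: per-example weights on unlabeled data, updated via a
# first-order influence proxy (gradient alignment with a labeled/validation loss).
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

model = nn.Linear(10, 3)                       # toy classifier
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x_val, y_val = torch.randn(32, 10), torch.randint(0, 3, (32,))  # labeled/validation set
x_unl = torch.randn(64, 10)                                     # "unlabeled" examples
pseudo_y = torch.randint(0, 3, (64,))          # stand-in for pseudo-labels / SSL targets

# One learnable weight per unlabeled example (the quantity the paper adjusts).
log_w = torch.zeros(64, requires_grad=True)
w_opt = torch.optim.SGD([log_w], lr=1.0)

for step in range(100):
    # Weighted unlabeled loss.
    w = torch.sigmoid(log_w)
    losses_unl = F.cross_entropy(model(x_unl), pseudo_y, reduction="none")
    loss_unl = (w * losses_unl).mean()

    # Gradient of the weighted unlabeled loss w.r.t. model parameters,
    # keeping the graph so we can later differentiate w.r.t. log_w.
    grads_unl = torch.autograd.grad(loss_unl, model.parameters(), create_graph=True)

    # Gradient of the labeled/validation loss w.r.t. model parameters.
    loss_val = F.cross_entropy(model(x_val), y_val)
    grads_val = torch.autograd.grad(loss_val, model.parameters())

    # First-order influence proxy: alignment between the two gradients
    # (the Hessian term of the true influence function is dropped here).
    influence = sum((gu * gv).sum() for gu, gv in zip(grads_unl, grads_val))

    # Increase the weights of unlabeled examples whose gradients help the labeled loss.
    w_opt.zero_grad()
    (-influence).backward()
    w_opt.step()

    # Standard model update on the re-weighted unlabeled loss.
    opt.zero_grad()
    w = torch.sigmoid(log_w).detach()
    losses_unl = F.cross_entropy(model(x_unl), pseudo_y, reduction="none")
    (w * losses_unl).mean().backward()
    opt.step()
```

In practice the labeled loss would also enter the model update and the influence estimate would follow the paper's fast approximation; the sketch only illustrates how per-example weights can be optimized alongside the model.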
Related papers
- Boosting Semi-Supervised Learning by bridging high and low-confidence
predictions [4.18804572788063]
Pseudo-labeling is a crucial technique in semi-supervised learning (SSL)
We propose a new method called ReFixMatch, which aims to utilize all of the unlabeled data during training.
arXiv Detail & Related papers (2023-08-15T00:27:18Z)
- Soft Curriculum for Learning Conditional GANs with Noisy-Labeled and
Uncurated Unlabeled Data [70.25049762295193]
We introduce a novel conditional image generation framework that accepts noisy-labeled and uncurated data during training.
We propose soft curriculum learning, which assigns instance-wise weights for adversarial training while assigning new labels for unlabeled data.
Our experiments show that our approach outperforms existing semi-supervised and label-noise robust methods in terms of both quantitative and qualitative performance.
arXiv Detail & Related papers (2023-07-17T08:31:59Z)
- CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep
Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance.
Sample re-weighting methods are popularly used to alleviate this data bias issue.
We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z)
- Dash: Semi-Supervised Learning with Dynamic Thresholding [72.74339790209531]
We propose a semi-supervised learning (SSL) approach that decides which unlabeled examples to train on using a dynamically adjusted threshold.
Our proposed approach, Dash, adapts its selection of unlabeled data as training progresses.
arXiv Detail & Related papers (2021-09-01T23:52:29Z)
- Attentional-Biased Stochastic Gradient Descent [74.49926199036481]
We present a provable method (named ABSGD) for addressing the data imbalance or label noise problem in deep learning.
Our method is a simple modification of momentum SGD in which we assign an individual importance weight to each sample in the mini-batch (see the sketch after this list).
ABSGD is flexible enough to combine with other robust losses without any additional cost.
arXiv Detail & Related papers (2020-12-13T03:41:52Z)
- Unsupervised Deep Metric Learning via Orthogonality based Probabilistic
Loss [27.955068939695042]
Existing state-of-the-art metric learning approaches require class labels to learn a metric.
We propose an unsupervised approach that learns a metric without making use of class labels.
Pseudo-labels are used to form triplets of examples, which guide the metric learning.
arXiv Detail & Related papers (2020-08-22T17:13:33Z)
- Don't Wait, Just Weight: Improving Unsupervised Representations by
Learning Goal-Driven Instance Weights [92.16372657233394]
Self-supervised learning techniques can boost performance by learning useful representations from unlabelled data.
We show that by learning Bayesian instance weights for the unlabelled data, we can improve the downstream classification accuracy.
Our method, BetaDataWeighter is evaluated using the popular self-supervised rotation prediction task on STL-10 and Visual Decathlon.
arXiv Detail & Related papers (2020-06-22T15:59:32Z)
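The ABSGD entry above mentions assigning an individual importance weight to each sample in a momentum-SGD mini-batch. The sketch below illustrates one common form of that idea, assuming the weights are a temperature-scaled softmax over the detached per-sample losses, so harder examples contribute more to the update; the exact weighting rule and hyper-parameters in that paper may differ.

```python
# Hypothetical sketch of per-sample importance weighting inside a mini-batch.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(10, 3)
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
lam = 1.0  # temperature: larger values move the weights toward uniform

def weighted_step(x, y):
    losses = F.cross_entropy(model(x), y, reduction="none")
    # Per-sample weights that emphasize high-loss examples; detached so the
    # weights are treated as constants during backpropagation.
    w = torch.softmax(losses.detach() / lam, dim=0)
    loss = (w * losses).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Toy usage on random data.
x, y = torch.randn(16, 10), torch.randint(0, 3, (16,))
print(weighted_step(x, y))
```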
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.