Strength from Weakness: Fast Learning Using Weak Supervision
- URL: http://arxiv.org/abs/2002.08483v1
- Date: Wed, 19 Feb 2020 22:39:37 GMT
- Title: Strength from Weakness: Fast Learning Using Weak Supervision
- Authors: Joshua Robinson, Stefanie Jegelka, Suvrit Sra
- Abstract summary: Having access to weak labels can significantly accelerate the learning rate for the strong task to the fast rate of $\mathcal{O}(\nicefrac{1}{n})$.
Actual acceleration depends continuously on the number of weak labels available, and on the relation between the two tasks.
- Score: 81.41106207042948
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study generalization properties of weakly supervised learning. That is,
learning where only a few "strong" labels (the actual target of our prediction)
are present but many more "weak" labels are available. In particular, we show
that having access to weak labels can significantly accelerate the learning
rate for the strong task to the fast rate of $\mathcal{O}(\nicefrac{1}{n})$, where
$n$ denotes the number of strongly labeled data points. This acceleration can
happen even if by itself the strongly labeled data admits only the slower
$\mathcal{O}(\nicefrac{1}{\sqrt{n}})$ rate. The actual acceleration depends
continuously on the number of weak labels available, and on the relation
between the two tasks. Our theoretical results are reflected empirically across
a range of tasks and illustrate how weak labels speed up learning on the strong
task.
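To make the setting concrete, here is a minimal sketch of one common way to exploit weak labels: fit a representation on the many weakly labeled points, then train the strong-task predictor on top of it using only the few strong labels. This is an illustrative pipeline, not the paper's exact estimator; the synthetic tasks, sample sizes, and model choices below are hypothetical.
```python
# Hypothetical weak-then-strong pipeline: m weak labels, n << m strong labels.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
m, n, d = 5000, 50, 20                 # many weak labels (m), few strong labels (n)
X_weak = rng.normal(size=(m, d))
y_weak = (X_weak[:, 0] + 0.1 * rng.normal(size=m) > 0).astype(int)  # weak task
X_strong = rng.normal(size=(n, d))
y_strong = (X_strong[:, 0] > 0).astype(int)                         # related strong task

# Stage 1: learn a representation from the abundant weak labels.
weak_model = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
weak_model.fit(X_weak, y_weak)

def embed(X):
    # Hidden-layer activations of the weak model serve as features (ReLU is
    # MLPClassifier's default activation).
    return np.maximum(X @ weak_model.coefs_[0] + weak_model.intercepts_[0], 0.0)

# Stage 2: a simple predictor on the learned features needs only n strong labels.
strong_model = LogisticRegression().fit(embed(X_strong), y_strong)
print("strong-task accuracy (on the training points, for illustration):",
      strong_model.score(embed(X_strong), y_strong))
```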
Related papers
- Losses over Labels: Weakly Supervised Learning via Direct Loss Construction [71.11337906077483]
Programmable weak supervision is a growing paradigm within machine learning.
We propose Losses over Labels (LoL), which creates losses directly from heuristics without going through the intermediate step of a label.
We show that LoL improves upon existing weak supervision methods on several benchmark text and image classification tasks.
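As a rough illustration of the losses-from-heuristics idea, one can penalize model disagreement with each firing heuristic directly, never committing to a discrete pseudo-label. The heuristics, weighting, and loss form below are hypothetical, not LoL's exact recipe.
```python
import torch

def contains_word(word):
    # A labeling heuristic: fires (1.0) if the word appears, abstains (0.0) otherwise.
    return lambda text: 1.0 if word in text else 0.0

heuristics = [contains_word("great"), contains_word("awful")]
heuristic_class = torch.tensor([1.0, 0.0])   # class each heuristic votes for (1 = positive)

def loss_over_heuristics(logit, text):
    # Penalize disagreement with every heuristic that fires on this example,
    # building the loss directly from the heuristics instead of a pseudo-label.
    fires = torch.tensor([h(text) for h in heuristics])
    logits = logit * torch.ones_like(heuristic_class)
    per_h = torch.nn.functional.binary_cross_entropy_with_logits(
        logits, heuristic_class, reduction="none")
    return (fires * per_h).sum() / fires.sum().clamp(min=1.0)

logit = torch.tensor(0.3, requires_grad=True)
loss = loss_over_heuristics(logit, "a great movie")
loss.backward()                               # gradients flow straight from the heuristics
print(float(loss))
```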
arXiv Detail & Related papers (2022-12-13T22:29:14Z)
- On the Informativeness of Supervision Signals [31.418827619510036]
We use information theory to compare how a number of commonly-used supervision signals contribute to representation-learning performance.
Our framework provides theoretical justification for using hard labels in the big-data regime, but richer supervision signals for few-shot learning and out-of-distribution generalization.
arXiv Detail & Related papers (2022-11-02T18:02:31Z)
- Label Noise-Resistant Mean Teaching for Weakly Supervised Fake News Detection [93.6222609806278]
We propose a novel label noise-resistant mean teaching approach (LNMT) for weakly supervised fake news detection.
LNMT leverages unlabeled news and feedback comments of users to enlarge the amount of training data.
LNMT establishes a mean teacher framework equipped with label propagation and label reliability estimation.
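For context, here is a minimal sketch of the mean-teacher core that such methods build on: the teacher is an exponential moving average (EMA) of the student, and a consistency loss pulls student predictions toward teacher predictions on unlabeled data. LNMT's label propagation and reliability estimation are omitted; the model and data below are placeholders.
```python
import copy
import torch
import torch.nn.functional as F

student = torch.nn.Linear(16, 2)
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)          # the teacher is never trained by gradient
opt = torch.optim.SGD(student.parameters(), lr=0.1)

def ema_update(teacher, student, decay=0.99):
    # Teacher weights track an exponential moving average of the student's.
    with torch.no_grad():
        for t, s in zip(teacher.parameters(), student.parameters()):
            t.mul_(decay).add_(s, alpha=1 - decay)

x_unlabeled = torch.randn(8, 16)
for _ in range(10):
    with torch.no_grad():
        t_probs = F.softmax(teacher(x_unlabeled), dim=-1)
    s_logprobs = F.log_softmax(student(x_unlabeled), dim=-1)
    loss = F.kl_div(s_logprobs, t_probs, reduction="batchmean")  # consistency loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    ema_update(teacher, student)
```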
arXiv Detail & Related papers (2022-06-10T16:01:58Z)
- Debiased Pseudo Labeling in Self-Training [77.83549261035277]
Deep neural networks achieve remarkable performances on a wide range of tasks with the aid of large-scale labeled datasets.
To mitigate the requirement for labeled data, self-training is widely used in both academia and industry by pseudo labeling on readily-available unlabeled data.
We propose Debiased, in which the generation and utilization of pseudo labels are decoupled by two independent heads.
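A hedged sketch of the two-head decoupling idea follows; the training protocol is simplified and hypothetical. One head is supervised on clean labels and generates pseudo labels, while a second head on the same backbone is the only one trained against those pseudo labels.
```python
import torch
import torch.nn.functional as F

backbone = torch.nn.Linear(32, 16)
head_gen = torch.nn.Linear(16, 10)   # generates pseudo labels; sees only clean labels
head_use = torch.nn.Linear(16, 10)   # is trained on the pseudo labels
opt = torch.optim.SGD(list(backbone.parameters())
                      + list(head_gen.parameters())
                      + list(head_use.parameters()), lr=0.1)

x_lab, y_lab = torch.randn(4, 32), torch.randint(0, 10, (4,))
x_unl = torch.randn(16, 32)

opt.zero_grad()
loss_sup = F.cross_entropy(head_gen(backbone(x_lab)), y_lab)  # supervises the generator head

with torch.no_grad():                # generation is decoupled: no gradient flows back
    pseudo = head_gen(backbone(x_unl)).argmax(dim=-1)
loss_pseudo = F.cross_entropy(head_use(backbone(x_unl)), pseudo)

(loss_sup + loss_pseudo).backward()
opt.step()
```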
arXiv Detail & Related papers (2022-02-15T02:14:33Z)
- Active WeaSuL: Improving Weak Supervision with Active Learning [2.624902795082451]
We propose Active WeaSuL: an approach that incorporates active learning into weak supervision.
We make two contributions: 1) a modification of the weak supervision loss function, such that the expert-labelled data inform and improve the combination of weak labels; and 2) the maxKL divergence sampling strategy, which determines for which data points expert labelling is most beneficial.
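A hedged sketch of a maxKL-style acquisition rule: query the expert on the points where two probabilistic label estimates disagree most in KL divergence. Which two distributions the paper compares is part of its design; the comparison below (a weak-label model versus a discriminative model) is purely illustrative.
```python
import numpy as np

def kl(p, q, eps=1e-12):
    # Pointwise KL divergence between rows of two probability matrices.
    p, q = np.clip(p, eps, 1.0), np.clip(q, eps, 1.0)
    return np.sum(p * np.log(p / q), axis=-1)

# Hypothetical per-point label distributions from two different estimates.
probs_weak_model = np.array([[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]])
probs_discriminative = np.array([[0.85, 0.15], [0.1, 0.9], [0.3, 0.7]])

scores = kl(probs_weak_model, probs_discriminative)
query_idx = int(np.argmax(scores))   # ask the expert to label the most-contested point
print(query_idx, scores)
```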
arXiv Detail & Related papers (2021-04-30T08:58:26Z)
- Self-Supervised Learning from Semantically Imprecise Data [7.24935792316121]
Learning from imprecise labels such as "animal" or "bird" is an important capability when expertly labeled training data is scarce.
CHILLAX is a recently proposed method to tackle this task.
We extend CHILLAX with a self-supervised scheme using constrained extrapolation to generate pseudo-labels.
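A sketch of hierarchy-constrained pseudo-labeling for imprecise labels: an example labeled only "bird" receives a pseudo-label chosen among bird subclasses. The tiny hierarchy and scores below are made up; this illustrates the constrained-extrapolation idea rather than CHILLAX's exact algorithm.
```python
import numpy as np

classes = ["sparrow", "eagle", "cat", "dog"]
descendants = {"bird": {"sparrow", "eagle"}, "animal": set(classes)}

def constrained_pseudo_label(class_scores, imprecise_label):
    # Mask out classes outside the imprecise label's subtree, then take argmax.
    allowed = descendants[imprecise_label]
    masked = [s if c in allowed else -np.inf
              for c, s in zip(classes, class_scores)]
    return classes[int(np.argmax(masked))]

print(constrained_pseudo_label([0.2, 0.5, 0.9, 0.1], "bird"))  # -> "eagle"
```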
arXiv Detail & Related papers (2021-04-22T07:26:14Z)
- A Theoretical Analysis of Learning with Noisily Labeled Data [62.946840431501855]
We first show that during the first epoch of training, the examples with clean labels are learned first.
We then show that, after this stage of learning from clean data, continuing to train the model can further improve the test error.
arXiv Detail & Related papers (2021-04-08T23:40:02Z)
- Are Fewer Labels Possible for Few-shot Learning? [81.89996465197392]
Few-shot learning is challenging due to its very limited data and labels.
Recent studies in big transfer (BiT) show that few-shot learning can greatly benefit from pretraining on a large-scale labeled dataset in a different domain.
We propose eigen-finetuning, which enables learning with even fewer shots by leveraging the co-evolution of clustering and eigen-samples during finetuning.
arXiv Detail & Related papers (2020-12-10T18:59:29Z)
- LEAN-LIFE: A Label-Efficient Annotation Framework Towards Learning from Explanation [40.72453599376169]
LEAN-LIFE is a web-based, Label-Efficient AnnotatioN framework for sequence labeling and classification tasks.
Our framework is the first to utilize this enhanced supervision technique of learning from explanations, and does so for three important tasks.
arXiv Detail & Related papers (2020-04-16T07:38:07Z)
- Limitations of weak labels for embedding and tagging [0.0]
Many datasets and approaches in ambient sound analysis use weakly labeled data. Weak labels are employed because annotating every data sample with a strong label is too expensive. Yet, their impact on performance in comparison to strong labels remains unclear.
In this paper, we formulate a supervised learning problem which involves weak labels. We create a dataset that focuses on the difference between strong and weak labels as opposed to other challenges.
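To make the weak/strong distinction concrete for sound tagging: a strong label marks event activity per frame, while a weak label only says the event occurs somewhere in the clip. Training from weak labels then typically pools frame-level scores (multiple-instance style); the pooling choice and tensor shapes below are illustrative, not this paper's setup.
```python
import torch

frame_scores = torch.rand(1, 100, 5)           # (clip, frames, event classes)
strong_labels = torch.zeros(1, 100, 5)         # per-frame activity (strong labels)
strong_labels[0, 40:60, 2] = 1.0               # event 2 active in frames 40..59
weak_labels = strong_labels.max(dim=1).values  # clip-level presence (weak labels)

clip_scores = frame_scores.max(dim=1).values   # max-pool frames -> clip prediction
loss_weak = torch.nn.functional.binary_cross_entropy(clip_scores, weak_labels)
print(loss_weak)
```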
arXiv Detail & Related papers (2020-02-05T08:54:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.