Weaker Than You Think: A Critical Look at Weakly Supervised Learning
- URL: http://arxiv.org/abs/2305.17442v3
- Date: Sun, 17 Sep 2023 19:04:44 GMT
- Title: Weaker Than You Think: A Critical Look at Weakly Supervised Learning
- Authors: Dawei Zhu, Xiaoyu Shen, Marius Mosbach, Andreas Stephan, Dietrich
Klakow
- Abstract summary: Weakly supervised learning is a popular approach for training machine learning models in low-resource settings.
We analyze diverse NLP datasets and tasks to ascertain when and why weakly supervised approaches work.
- Score: 30.160501243686863
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Weakly supervised learning is a popular approach for training machine
learning models in low-resource settings. Instead of requesting high-quality
yet costly human annotations, it allows training models with noisy annotations
obtained from various weak sources. Recently, many sophisticated approaches
have been proposed for robust training under label noise, reporting impressive
results. In this paper, we revisit the setup of these approaches and find that
the benefits brought by these approaches are significantly overestimated.
Specifically, we find that the success of existing weakly supervised learning
approaches heavily relies on the availability of clean validation samples
which, as we show, can be leveraged much more efficiently by simply training on
them. After using these clean labels in training, the advantages of using these
sophisticated approaches are mostly wiped out. This remains true even when
reducing the size of the available clean data to just five samples per class,
making these approaches impractical. To understand the true value of weakly
supervised learning, we thoroughly analyze diverse NLP datasets and tasks to
ascertain when and why weakly supervised approaches work. Based on our
findings, we provide recommendations for future research.
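The paper's central claim can be illustrated with a minimal sketch (not the paper's actual setup, which uses pretrained language models on NLP datasets): a classifier trained on many noisy "weak" labels is compared against one trained on only five clean samples per class. The synthetic data, 30% noise rate, and logistic-regression model here are all illustrative assumptions.

```python
# Illustrative sketch of the paper's comparison (assumed synthetic setup):
# many noisy weak labels vs. just 5 clean samples per class.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=4000, n_features=20, n_informative=10,
                           n_classes=2, random_state=0)
X_train, y_train = X[:2000], y[:2000]
X_test, y_test = X[2000:], y[2000:]

# Weak supervision: flip 30% of the training labels to simulate label noise.
noisy = y_train.copy()
flip = rng.random(len(noisy)) < 0.3
noisy[flip] = 1 - noisy[flip]
weak_model = LogisticRegression(max_iter=1000).fit(X_train, noisy)

# Clean baseline: train on only 5 clean samples per class.
idx = np.concatenate([np.where(y_train == c)[0][:5] for c in (0, 1)])
clean_model = LogisticRegression(max_iter=1000).fit(X_train[idx], y_train[idx])

print("weak-label accuracy :", accuracy_score(y_test, weak_model.predict(X_test)))
print("5-per-class accuracy:", accuracy_score(y_test, clean_model.predict(X_test)))
```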
Related papers
- Granularity Matters in Long-Tail Learning [62.30734737735273]
We offer a novel perspective on long-tail learning, inspired by an observation: datasets with finer granularity tend to be less affected by data imbalance.
We introduce open-set auxiliary classes that are visually similar to existing ones, aiming to enhance representation learning for both head and tail classes.
To prevent the overwhelming presence of auxiliary classes from disrupting training, we introduce a neighbor-silencing loss.
arXiv Detail & Related papers (2024-10-21T13:06:21Z)
- Fair Few-shot Learning with Auxiliary Sets [53.30014767684218]

In many machine learning (ML) tasks, only very few labeled data samples can be collected, which can lead to inferior fairness performance.
In this paper, we define the fairness-aware learning task with limited training samples as the fair few-shot learning problem.
We devise a novel framework that accumulates fairness-aware knowledge across different meta-training tasks and then generalizes the learned knowledge to meta-test tasks.
arXiv Detail & Related papers (2023-08-28T06:31:37Z)
- Active Learning with Contrastive Pre-training for Facial Expression Recognition [19.442685015494316]
We study 8 recent active learning methods on three public FER datasets.
Our findings show that existing active learning methods do not perform well in the context of FER.
We propose contrastive self-supervised pre-training, which first learns the underlying representations based on the entire unlabelled dataset.
arXiv Detail & Related papers (2023-07-06T03:08:03Z)
- Unsupervised Embedding Quality Evaluation [6.72542623686684]
It is often unclear whether SSL models will perform well when transferred to another domain.
Can we quantify how easy it is to linearly separate the data in a stable way?
We introduce one novel method based on recent advances in understanding the high-dimensional geometric structure of self-supervised learning.
arXiv Detail & Related papers (2023-05-26T01:06:44Z)
- An Embarrassingly Simple Approach to Semi-Supervised Few-Shot Learning [58.59343434538218]
We propose a simple but quite effective approach to predict accurate negative pseudo-labels of unlabeled data from an indirect learning perspective.
Our approach can be implemented in just a few lines of code using only off-the-shelf operations.
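The negative pseudo-labelling idea can be sketched as follows (a simplified illustration under assumed names and a hypothetical 0.05 threshold, not the paper's exact rule): rather than guessing which class an unlabelled sample is, the model marks classes the sample is very unlikely to be, which is an easier prediction to get right.

```python
# Hedged sketch of negative pseudo-labelling (illustrative, not the paper's
# exact method): mark confidently *excluded* classes for unlabelled samples.
import numpy as np

def negative_pseudo_labels(probs: np.ndarray, threshold: float = 0.05):
    """probs: (n_samples, n_classes) softmax outputs.
    Returns a boolean mask; True marks a confident negative label,
    i.e. a class the sample almost certainly does not belong to."""
    return probs < threshold

probs = np.array([[0.90, 0.07, 0.03],
                  [0.40, 0.35, 0.25]])
print(negative_pseudo_labels(probs))
# Sample 0 confidently excludes class 2; sample 1 excludes nothing.
```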
arXiv Detail & Related papers (2022-09-28T02:11:34Z)
- SURF: Semi-supervised Reward Learning with Data Augmentation for Feedback-efficient Preference-based Reinforcement Learning [168.89470249446023]
We present SURF, a semi-supervised reward learning framework that utilizes a large amount of unlabeled samples with data augmentation.
In order to leverage unlabeled samples for reward learning, we infer pseudo-labels of the unlabeled samples based on the confidence of the preference predictor.
Our experiments demonstrate that our approach significantly improves the feedback-efficiency of the preference-based method on a variety of locomotion and robotic manipulation tasks.
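SURF's confidence-based pseudo-labelling step can be sketched roughly as follows (function names and the 0.9 threshold are illustrative assumptions, not taken from the paper): the preference predictor scores unlabelled segment pairs, and only pairs it is confident about receive pseudo-labels for reward learning.

```python
# Simplified sketch of confidence-based pseudo-labelling for preference
# learning (threshold and names are assumed, not from the SURF paper).
import numpy as np

def pseudo_label(pref_probs: np.ndarray, threshold: float = 0.9):
    """pref_probs[i] = predicted probability that segment A beats segment B.
    Returns (indices, labels): the confident pairs and their pseudo-labels
    (1 = prefer A, 0 = prefer B)."""
    confidence = np.maximum(pref_probs, 1.0 - pref_probs)
    keep = np.where(confidence >= threshold)[0]
    labels = (pref_probs[keep] >= 0.5).astype(int)
    return keep, labels

probs = np.array([0.97, 0.55, 0.08, 0.91, 0.50])
idx, labels = pseudo_label(probs)
print(idx, labels)  # → [0 2 3] [1 0 1]
```

Pairs with near-0.5 predictions are simply dropped, so the reward model only ever trains on pseudo-labels the predictor strongly believes in.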
arXiv Detail & Related papers (2022-03-18T16:50:38Z)
- Exploiting All Samples in Low-Resource Sentence Classification: Early Stopping and Initialization Parameters [6.368871731116769]
In this study, we discuss how to exploit labeled samples without additional data or model redesigns.
We propose an integrated method: initialize the model with weight averaging and use a stopping criterion that does not rely on a validation set, so that all labeled samples can be used for training.
Our results highlight the importance of the training strategy and suggest that the integrated method can be the first step in the low-resource setting.
arXiv Detail & Related papers (2021-11-12T22:31:47Z)
- Active Learning for Argument Mining: A Practical Approach [2.535271349350579]
We show that Active Learning considerably decreases the effort necessary to reach good deep learning performance on the task of Argument Unit Recognition and Classification (AURC).
Active Learning reduces the amount of data necessary for the training of machine learning models by querying the most informative samples for annotation and therefore is a promising method for resource creation.
arXiv Detail & Related papers (2021-09-28T10:58:47Z)
- Combining Feature and Instance Attribution to Detect Artifacts [62.63504976810927]
We propose methods to facilitate identification of training data artifacts.
We show that this proposed training-feature attribution approach can be used to uncover artifacts in training data.
We execute a small user study to evaluate whether these methods are useful to NLP researchers in practice.
arXiv Detail & Related papers (2021-07-01T09:26:13Z)
- An Effective Baseline for Robustness to Distributional Shift [5.627346969563955]
Refraining from confident predictions on inputs from categories not seen during training is an important requirement for the safe deployment of deep learning systems.
We present a simple, but highly effective approach to deal with out-of-distribution detection that uses the principle of abstention.
arXiv Detail & Related papers (2021-05-15T00:46:11Z)
- Few-Cost Salient Object Detection with Adversarial-Paced Learning [95.0220555274653]
This paper proposes to learn the effective salient object detection model based on the manual annotation on a few training images only.
We name this task as the few-cost salient object detection and propose an adversarial-paced learning (APL)-based framework to facilitate the few-cost learning scenario.
arXiv Detail & Related papers (2021-04-05T14:15:49Z)
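The self-paced ingredient underlying adversarial-paced learning can be sketched in a few lines (a simplified illustration: the adversarial pacing of APL itself is omitted, and all names are assumptions): samples whose current loss falls below a pace threshold are treated as "easy" and included in training, and the threshold grows so that harder samples enter over time.

```python
# Minimal self-paced sample-selection sketch (illustrative only; APL's
# adversarial component is not modelled here).
import numpy as np

def self_paced_mask(losses: np.ndarray, pace: float) -> np.ndarray:
    """Binary inclusion weights: 1.0 for samples whose loss is below the
    current pace threshold, 0.0 otherwise."""
    return (losses < pace).astype(float)

losses = np.array([0.2, 1.5, 0.7, 3.0])
for pace in (0.5, 1.0, 2.0):
    # As the pace grows, more (harder) samples are admitted to training.
    print(pace, self_paced_mask(losses, pace))
```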
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.