Uncertainty-aware Self-training for Text Classification with Few Labels
- URL: http://arxiv.org/abs/2006.15315v1
- Date: Sat, 27 Jun 2020 08:13:58 GMT
- Title: Uncertainty-aware Self-training for Text Classification with Few Labels
- Authors: Subhabrata Mukherjee, Ahmed Hassan Awadallah
- Abstract summary: We study self-training as one of the earliest semi-supervised learning approaches to reduce the annotation bottleneck.
We propose an approach to improve self-training by incorporating uncertainty estimates of the underlying neural network.
We show that our methods, leveraging only 20-30 labeled samples per class per task for training and validation, can perform within 3% of fully supervised pre-trained language models.
- Score: 54.13279574908808
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The recent success of large-scale pre-trained language models crucially hinges on fine-tuning them on large amounts of labeled data for the downstream task, which is typically expensive to acquire. In this work, we study self-training as one of the earliest semi-supervised learning approaches to reduce the annotation bottleneck by making use of large-scale unlabeled data for the target task. The standard self-training mechanism randomly samples instances from the unlabeled pool to pseudo-label and augment the labeled data. We propose an approach to improve self-training by incorporating uncertainty estimates of the underlying neural network, leveraging recent advances in Bayesian deep learning. Specifically, we propose (i) acquisition functions to select instances from the unlabeled pool leveraging Monte Carlo (MC) Dropout, and (ii) a learning mechanism leveraging model confidence for self-training. As an application, we focus on text classification on five benchmark datasets. We show that our methods, leveraging only 20-30 labeled samples per class per task for training and validation, perform within 3% of fully supervised pre-trained language models fine-tuned on thousands of labeled instances, with an aggregate accuracy of 91%, improving over baselines by up to 12%.
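To make the approach concrete, the following is a minimal sketch (not the authors' released code) of uncertainty-aware self-training: dropout is kept active at inference time over several stochastic forward passes to approximate MC Dropout, low-entropy unlabeled examples are selected for pseudo-labeling, and the pseudo-label loss is weighted by model confidence. The toy classifier, the entropy-based acquisition function, and all names below are illustrative assumptions; the paper's exact acquisition functions and weighting scheme may differ.

```python
# Minimal sketch (not the authors' released code) of MC Dropout-based selection
# of unlabeled examples for pseudo-labeling plus confidence-weighted training.
# The classifier, the entropy acquisition function, and the weighting are
# illustrative assumptions; the paper's exact choices may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextClassifier(nn.Module):
    """Toy stand-in for a pre-trained encoder with dropout and a classification head."""
    def __init__(self, in_dim=128, num_classes=5, p_drop=0.3):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.dropout = nn.Dropout(p_drop)
        self.head = nn.Linear(256, num_classes)

    def forward(self, x):
        return self.head(self.dropout(self.encoder(x)))

def mc_dropout_predict(model, x, n_passes=20):
    """Run several stochastic forward passes with dropout kept active (MC Dropout)."""
    model.train()  # keep dropout on at inference time
    with torch.no_grad():
        probs = torch.stack([F.softmax(model(x), dim=-1) for _ in range(n_passes)])
    return probs  # shape: (n_passes, batch, num_classes)

def select_for_pseudo_labeling(model, unlabeled_x, k=64):
    """Pick the k unlabeled examples with the lowest predictive entropy
    of the MC-averaged class distribution."""
    probs = mc_dropout_predict(model, unlabeled_x)
    mean_probs = probs.mean(dim=0)                              # predictive mean
    entropy = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=-1)
    confidence, pseudo_labels = mean_probs.max(dim=-1)
    idx = entropy.argsort()[:k]                                 # least uncertain first
    return idx, pseudo_labels[idx], confidence[idx]

def confidence_weighted_loss(logits, pseudo_labels, confidence):
    """Down-weight noisier pseudo-labels by the model's own confidence."""
    per_example = F.cross_entropy(logits, pseudo_labels, reduction="none")
    return (confidence * per_example).mean()

# One self-training step on randomly generated stand-in features.
model = TextClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
unlabeled_x = torch.randn(1000, 128)
idx, y_pseudo, conf = select_for_pseudo_labeling(model, unlabeled_x, k=64)
model.train()
loss = confidence_weighted_loss(model(unlabeled_x[idx]), y_pseudo, conf)
loss.backward()
optimizer.step()
```

In practice the encoder would be a pre-trained language model rather than a toy linear layer, and the selected pseudo-labeled batch would be mixed with the small labeled set at each self-training round.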
Related papers
- Co-training for Low Resource Scientific Natural Language Inference [65.37685198688538]
We propose a novel co-training method that assigns weights to the distantly supervised labels based on the training dynamics of the classifiers.
By assigning importance weights instead of filtering out examples based on an arbitrary threshold on the predicted confidence, we maximize the usage of automatically labeled data.
The proposed method obtains an improvement of 1.5% in Macro F1 over the distant supervision baseline, and substantial improvements over several other strong SSL baselines.
arXiv Detail & Related papers (2024-06-20T18:35:47Z) - Self-Training for Sample-Efficient Active Learning for Text Classification with Pre-Trained Language Models [3.546617486894182]
We introduce HAST, a new and effective self-training strategy, which is evaluated on four text classification benchmarks.
Results show that it outperforms the reproduced self-training approaches and reaches classification results comparable to previous experiments for three out of four datasets.
arXiv Detail & Related papers (2024-06-13T15:06:11Z) - Incremental Self-training for Semi-supervised Learning [56.57057576885672]
IST is simple yet effective and fits existing self-training-based semi-supervised learning methods.
We verify the proposed IST on five datasets and two types of backbone, effectively improving the recognition accuracy and learning speed.
arXiv Detail & Related papers (2024-04-14T05:02:00Z) - Boosting Semi-Supervised Learning by bridging high and low-confidence predictions [4.18804572788063]
Pseudo-labeling is a crucial technique in semi-supervised learning (SSL).
We propose a new method called ReFixMatch, which aims to utilize all of the unlabeled data during training.
arXiv Detail & Related papers (2023-08-15T00:27:18Z) - Self-Distillation for Further Pre-training of Transformers [83.84227016847096]
We propose self-distillation as a regularization for a further pre-training stage.
We empirically validate the efficacy of self-distillation on a variety of benchmark datasets for image and text classification tasks.
arXiv Detail & Related papers (2022-09-30T02:25:12Z) - Active Self-Semi-Supervised Learning for Few Labeled Samples [4.713652957384158]
Training deep models with limited annotations poses a significant challenge when applied to diverse practical domains.
We propose a simple yet effective framework, active self-semi-supervised learning (AS3L).
AS3L bootstraps semi-supervised models with prior pseudo-labels (PPL).
We develop active learning and label propagation strategies to obtain accurate PPL.
arXiv Detail & Related papers (2022-03-09T07:45:05Z) - Self-Training: A Survey [5.772546394254112]
Semi-supervised algorithms aim to learn prediction functions from a small set of labeled observations and a large set of unlabeled observations.
Among the existing techniques, self-training methods have undoubtedly attracted greater attention in recent years.
We present self-training methods for binary and multi-class classification, as well as their variants and two related approaches.
arXiv Detail & Related papers (2022-02-24T11:40:44Z) - Pseudo-Labeled Auto-Curriculum Learning for Semi-Supervised Keypoint Localization [88.74813798138466]
Localizing keypoints of an object is a basic visual problem.
Supervised learning of a keypoint localization network often requires a large amount of data.
We propose to automatically select reliable pseudo-labeled samples with a series of dynamic thresholds (a generic thresholding sketch appears after this list).
arXiv Detail & Related papers (2022-01-21T09:51:58Z) - Self-training Improves Pre-training for Natural Language Understanding [63.78927366363178]
We study self-training as another way to leverage unlabeled data through semi-supervised learning.
We introduce SentAugment, a data augmentation method which computes task-specific query embeddings from labeled data.
Our approach leads to scalable and effective self-training with improvements of up to 2.6% on standard text classification benchmarks.
arXiv Detail & Related papers (2020-10-05T17:52:25Z)
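As referenced in the dynamic-threshold entry above, the sketch below gives a generic illustration of selecting pseudo-labeled samples with per-class thresholds that adapt during training; it is an assumption for illustration only and not the cited paper's actual procedure. Classes the model rarely predicts confidently get a lower bar so they are not starved of pseudo-labels.

```python
# Generic illustration (an assumption, not the cited paper's exact procedure) of
# pseudo-label selection with per-class thresholds that adapt during training:
# the base threshold is scaled down for classes that are rarely predicted with
# high confidence, so those classes still receive pseudo-labels.
import numpy as np

def adaptive_threshold_selection(probs, base_threshold=0.95):
    """probs: (N, C) array of softmax outputs on the unlabeled pool."""
    confidence = probs.max(axis=1)
    pseudo_labels = probs.argmax(axis=1)
    selected = []
    for c in range(probs.shape[1]):
        mask = pseudo_labels == c
        if not mask.any():
            continue
        # Fraction of class-c predictions already above the base threshold,
        # used as a rough proxy for how well class c is currently learned.
        learned = (confidence[mask] >= base_threshold).mean()
        class_threshold = base_threshold * (learned / (2.0 - learned))
        selected.extend(np.where(mask & (confidence >= class_threshold))[0])
    return np.array(sorted(selected)), pseudo_labels

# Usage with random stand-in predictions: 1000 unlabeled examples, 5 classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(1000, 5))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
selected_idx, pseudo_labels = adaptive_threshold_selection(probs)
```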