Reducing Label Effort: Self-Supervised meets Active Learning
- URL: http://arxiv.org/abs/2108.11458v1
- Date: Wed, 25 Aug 2021 20:04:44 GMT
- Title: Reducing Label Effort: Self-Supervised meets Active Learning
- Authors: Javad Zolfaghari Bengar, Joost van de Weijer, Bartlomiej Twardowski,
Bogdan Raducanu
- Abstract summary: Recent developments in self-training have achieved very impressive results rivaling supervised learning on some datasets.
Our experiments reveal that self-training is remarkably more efficient than active learning at reducing the labeling effort.
The performance gap between active learning trained either with self-training or from scratch diminishes as we approach the point where almost half of the dataset is labeled.
- Score: 32.4747118398236
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Active learning is a paradigm aimed at reducing the annotation effort by
training the model on actively selected informative and/or representative
samples. Another paradigm to reduce the annotation effort is self-training that
learns from a large amount of unlabeled data in an unsupervised way and
fine-tunes on few labeled samples. Recent developments in self-training have
achieved very impressive results rivaling supervised learning on some datasets.
The current work focuses on whether the two paradigms can benefit from each
other. We studied object recognition datasets including CIFAR10, CIFAR100 and
Tiny ImageNet with several labeling budgets for the evaluations. Our
experiments reveal that self-training is remarkably more efficient than active
learning at reducing the labeling effort, that for a low labeling budget,
active learning offers no benefit to self-training, and finally that the
combination of active learning and self-training is fruitful when the labeling
budget is high. The performance gap between active learning trained either with
self-training or from scratch diminishes as we approach the point where
almost half of the dataset is labeled.
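The "actively selected informative samples" idea in the abstract can be illustrated with a small sketch. The abstract does not say which acquisition functions the authors evaluate, so the example below assumes entropy-based uncertainty sampling, one common choice; the function name `entropy_acquisition` and the probability values are hypothetical and only serve the illustration.

```python
import numpy as np

def entropy_acquisition(probs: np.ndarray, budget: int) -> np.ndarray:
    """Pick the `budget` unlabeled samples with the highest predictive entropy.

    Assumption: `probs` holds per-sample class probabilities (rows sum to 1),
    e.g. softmax outputs of a model trained on the current labeled set.
    """
    eps = 1e-12  # avoids log(0) for confident predictions
    entropy = -np.sum(probs * np.log(probs + eps), axis=1)
    # argsort is ascending, so take the tail and reverse: most uncertain first
    return np.argsort(entropy)[-budget:][::-1]

# Hypothetical predictions for 4 unlabeled samples over 3 classes
probs = np.array([
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # near-uniform prediction, high entropy
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
])

# Indices of the samples that would be sent to the annotator next
selected = entropy_acquisition(probs, budget=2)
```

In a full active learning cycle, the selected samples would be labeled, added to the training set, and the model retrained (from scratch or from a self-supervised initialization) before the next selection round.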
Related papers
- Active Learning to Guide Labeling Efforts for Question Difficulty Estimation [1.0514231683620516]
Transformer-based neural networks achieve state-of-the-art performance, primarily through supervised methods but with an isolated study in unsupervised learning.
This work bridges the research gap by exploring active learning for QDE, a supervised human-in-the-loop approach.
Experiments demonstrate that active learning with PowerVariance acquisition achieves a performance close to fully supervised models after labeling only 10% of the training data.
arXiv Detail & Related papers (2024-09-14T02:02:42Z)
- Incremental Self-training for Semi-supervised Learning [56.57057576885672]
IST is simple yet effective and fits existing self-training-based semi-supervised learning methods.
We verify the proposed IST on five datasets and two types of backbone, effectively improving the recognition accuracy and learning speed.
arXiv Detail & Related papers (2024-04-14T05:02:00Z)
- One-bit Supervision for Image Classification: Problem, Solution, and Beyond [114.95815360508395]
This paper presents one-bit supervision, a novel setting of learning with fewer labels, for image classification.
We propose a multi-stage training paradigm and incorporate negative label suppression into an off-the-shelf semi-supervised learning algorithm.
In multiple benchmarks, the learning efficiency of the proposed approach surpasses that using full-bit, semi-supervised supervision.
arXiv Detail & Related papers (2023-11-26T07:39:00Z)
- A Matter of Annotation: An Empirical Study on In Situ and Self-Recall Activity Annotations from Wearable Sensors [56.554277096170246]
We present an empirical study that evaluates and contrasts four commonly employed annotation methods in user studies focused on in-the-wild data collection.
For both the user-driven, in situ annotations, where participants annotate their activities during the actual recording process, and the recall methods, where participants retrospectively annotate their data at the end of each day, the participants had the flexibility to select their own set of activity classes and corresponding labels.
arXiv Detail & Related papers (2023-05-15T16:02:56Z)
- Responsible Active Learning via Human-in-the-loop Peer Study [88.01358655203441]
We propose a responsible active learning method, namely Peer Study Learning (PSL), to simultaneously preserve data privacy and improve model stability.
We first introduce a human-in-the-loop teacher-student architecture to isolate unlabelled data from the task learner (teacher) on the cloud-side.
During training, the task learner instructs the light-weight active learner which then provides feedback on the active sampling criterion.
arXiv Detail & Related papers (2022-11-24T13:18:27Z)
- Active Self-Training for Weakly Supervised 3D Scene Semantic Segmentation [17.27850877649498]
We introduce a method for weakly supervised segmentation of 3D scenes that combines self-training and active learning.
We demonstrate that our approach leads to an effective method that provides improvements in scene segmentation over previous works and baselines.
arXiv Detail & Related papers (2022-09-15T06:00:25Z)
- Investigating a Baseline Of Self Supervised Learning Towards Reducing Labeling Costs For Image Classification [0.0]
The study uses the kaggle.com cats-vs-dogs dataset, MNIST, and Fashion-MNIST to investigate the self-supervised learning task.
Results show that the pretext process in self-supervised learning improves accuracy by around 15% in the downstream classification task.
arXiv Detail & Related papers (2021-08-17T06:43:05Z)
- Mind Your Outliers! Investigating the Negative Impact of Outliers on Active Learning for Visual Question Answering [71.15403434929915]
We show that across 5 models and 4 datasets on the task of visual question answering, a wide variety of active learning approaches fail to outperform random selection.
We identify the problem as collective outliers -- groups of examples that active learning methods prefer to acquire but models fail to learn.
We show that active learning sample efficiency increases significantly as the number of collective outliers in the active learning pool decreases.
arXiv Detail & Related papers (2021-07-06T00:52:11Z)
- On the Marginal Benefit of Active Learning: Does Self-Supervision Eat Its Cake? [31.563514432259897]
We present a novel framework integrating self-supervised pretraining, active learning, and consistency-regularized self-training.
Our experiments reveal two key insights: (i) Self-supervised pre-training significantly improves semi-supervised learning, especially in the few-label regime.
We fail to observe any additional benefit of state-of-the-art active learning algorithms when combined with state-of-the-art S4L techniques.
arXiv Detail & Related papers (2020-11-16T17:34:55Z)
- Semi-supervised Batch Active Learning via Bilevel Optimization [89.37476066973336]
We formulate our approach as a data summarization problem via bilevel optimization.
We show that our method is highly effective in keyword detection tasks in the regime when only few labeled samples are available.
arXiv Detail & Related papers (2020-10-19T16:53:24Z)
- Learning to Rank for Active Learning: A Listwise Approach [36.72443179449176]
Active learning emerged as an alternative to alleviate the effort of labeling huge amounts of data for data-hungry applications.
In this work, we rethink the structure of the loss prediction module, using a simple but effective listwise approach.
Experimental results on four datasets demonstrate that our method outperforms recent state-of-the-art active learning approaches for both image classification and regression tasks.
arXiv Detail & Related papers (2020-07-31T21:05:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.