Using Self-Supervised Pretext Tasks for Active Learning
- URL: http://arxiv.org/abs/2201.07459v1
- Date: Wed, 19 Jan 2022 07:58:06 GMT
- Title: Using Self-Supervised Pretext Tasks for Active Learning
- Authors: John Seon Keun Yi, Minseok Seo, Jongchan Park, Dong-Geol Choi
- Abstract summary: We propose a novel active learning approach that utilizes self-supervised pretext tasks and a unique data sampler to select data that are both difficult and representative.
The pretext task learner is trained on the unlabeled set, and the unlabeled data are sorted and grouped into batches by their pretext task losses.
In each iteration, the main task model is used to sample the most uncertain data in a batch to be annotated.
- Score: 7.214674613451605
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Labeling a large set of data is expensive. Active learning aims to tackle
this problem by asking to annotate only the most informative data from the
unlabeled set. We propose a novel active learning approach that utilizes
self-supervised pretext tasks and a unique data sampler to select data that are
both difficult and representative. We discover that the loss of a simple
self-supervised pretext task, such as rotation prediction, is closely
correlated to the downstream task loss. The pretext task learner is trained on
the unlabeled set, and the unlabeled data are sorted and grouped into batches
by their pretext task losses. In each iteration, the main task model is used to
sample the most uncertain data in a batch to be annotated. We evaluate our
method on various image classification and segmentation benchmarks and achieve
compelling performances on CIFAR10, Caltech-101, ImageNet, and CityScapes.
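
As a concrete illustration of the pipeline described in the abstract, the sketch below uses rotation prediction as the pretext task and predictive entropy as the uncertainty measure. It is a minimal PyTorch sketch under those assumptions; all function and variable names are hypothetical, and the authors' released implementation may differ.

```python
import torch
import torch.nn.functional as F

def pretext_losses(pretext_model, unlabeled_set, device="cpu"):
    """Per-sample rotation-prediction loss over the unlabeled pool.
    Assumes `pretext_model` outputs 4 logits for 0/90/180/270-degree
    rotations; names here are hypothetical, not the authors' code."""
    pretext_model.eval()
    losses = []
    with torch.no_grad():
        for i in range(len(unlabeled_set)):
            x, _ = unlabeled_set[i]                       # label is ignored
            k = int(torch.randint(0, 4, (1,)))
            x_rot = torch.rot90(x, k, dims=(1, 2))        # rotate (C, H, W) image
            logits = pretext_model(x_rot.unsqueeze(0).to(device))
            target = torch.tensor([k], device=device)
            losses.append(F.cross_entropy(logits, target).item())
    return torch.tensor(losses)

def sample_for_annotation(main_model, unlabeled_set, losses,
                          n_batches, iteration, budget, device="cpu"):
    """Sort the pool by pretext loss, split it into `n_batches` groups,
    and in the given active-learning `iteration` pick the `budget` samples
    of that group on which the main model is most uncertain (here:
    highest predictive entropy, one possible uncertainty measure)."""
    order = torch.argsort(losses, descending=True)
    group = torch.chunk(order, n_batches)[iteration]
    main_model.eval()
    entropies = []
    with torch.no_grad():
        for i in group.tolist():
            x, _ = unlabeled_set[i]
            probs = F.softmax(main_model(x.unsqueeze(0).to(device)), dim=1)
            entropies.append(-(probs * probs.clamp_min(1e-12).log()).sum().item())
    top = torch.topk(torch.tensor(entropies), min(budget, len(group))).indices
    return [group[j].item() for j in top]                 # indices to annotate
```

In use, `pretext_losses` would be computed once after training the pretext learner on the unlabeled pool, and `sample_for_annotation` would be called once per labeling round with an increasing `iteration` index.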
Related papers
- Semi-Supervised Variational Adversarial Active Learning via Learning to Rank and Agreement-Based Pseudo Labeling [6.771578432805963]
Active learning aims to reduce the amount of labor involved in data labeling by automating the selection of unlabeled samples.
We introduce novel techniques that significantly improve the use of abundant unlabeled data during training.
We demonstrate the superior performance of our approach over the state of the art on various image classification and segmentation benchmark datasets.
arXiv Detail & Related papers (2024-08-23T00:35:07Z)
- Don't freeze: Finetune encoders for better Self-Supervised HAR [5.008235182488304]
We show how a simple change - not freezing the representation - leads to substantial performance gains across pretext tasks.
The improvement is found across all four investigated datasets and all four pretext tasks, and is proportional to the amount of labelled data.
arXiv Detail & Related papers (2023-07-03T17:23:34Z)
- Behavior Retrieval: Few-Shot Imitation Learning by Querying Unlabeled Datasets [73.2096288987301]
We propose a simple approach that uses a small amount of downstream expert data to selectively query relevant behaviors from an offline, unlabeled dataset.
We observe that our method learns to query only the transitions relevant to the task, filtering out sub-optimal or task-irrelevant data.
Our simple querying approach outperforms more complex goal-conditioned methods by 20% across simulated and real robotic manipulation tasks from images.
arXiv Detail & Related papers (2023-04-18T05:42:53Z)
- STUNT: Few-shot Tabular Learning with Self-generated Tasks from Unlabeled Tables [64.0903766169603]
We propose a framework for few-shot semi-supervised learning, coined Self-generated Tasks from UNlabeled Tables (STUNT).
Our key idea is to self-generate diverse few-shot tasks by treating randomly chosen columns as a target label.
We then employ a meta-learning scheme to learn generalizable knowledge with the constructed tasks.
arXiv Detail & Related papers (2023-03-02T02:37:54Z)
- Learning Instructions with Unlabeled Data for Zero-Shot Cross-Task Generalization [68.91386402390403]
We propose Unlabeled Data Augmented Instruction Tuning (UDIT) to take better advantage of the instructions during instruction learning.
We conduct extensive experiments to show UDIT's effectiveness in various scenarios of tasks and datasets.
arXiv Detail & Related papers (2022-10-17T15:25:24Z)
- An Embarrassingly Simple Approach to Semi-Supervised Few-Shot Learning [58.59343434538218]
We propose a simple but quite effective approach to predict accurate negative pseudo-labels of unlabeled data from an indirect learning perspective.
Our approach can be implemented in just a few lines of code using only off-the-shelf operations.
arXiv Detail & Related papers (2022-09-28T02:11:34Z)
- Budget-aware Few-shot Learning via Graph Convolutional Network [56.41899553037247]
This paper tackles the problem of few-shot learning, which aims to learn new visual concepts from a few examples.
A common problem setting in few-shot classification assumes a random sampling strategy for acquiring data labels.
We introduce a new budget-aware few-shot learning problem that aims to learn novel object categories.
arXiv Detail & Related papers (2022-01-07T02:46:35Z)
- Investigating a Baseline Of Self Supervised Learning Towards Reducing Labeling Costs For Image Classification [0.0]
The study uses the kaggle.com cats-vs-dogs dataset, MNIST, and Fashion-MNIST to investigate the self-supervised learning task.
Results show that the pretext process in self-supervised learning improves accuracy by around 15% on the downstream classification task.
arXiv Detail & Related papers (2021-08-17T06:43:05Z)
- Learning to Rank for Active Learning: A Listwise Approach [36.72443179449176]
Active learning emerged as an alternative to alleviate the effort of labeling huge amounts of data for data-hungry applications.
In this work, we rethink the structure of the loss prediction module, using a simple but effective listwise approach.
Experimental results on four datasets demonstrate that our method outperforms recent state-of-the-art active learning approaches for both image classification and regression tasks.
arXiv Detail & Related papers (2020-07-31T21:05:16Z)
- Uncertainty-aware Self-training for Text Classification with Few Labels [54.13279574908808]
We study self-training as one of the earliest semi-supervised learning approaches to reduce the annotation bottleneck.
We propose an approach to improve self-training by incorporating uncertainty estimates of the underlying neural network.
We show that our methods, leveraging only 20-30 labeled samples per class per task for training and validation, can perform within 3% of fully supervised pre-trained language models.
arXiv Detail & Related papers (2020-06-27T08:13:58Z)
- Task-Aware Variational Adversarial Active Learning [42.334671410592065]
We propose task-aware variational adversarial AL (TA-VAAL) that modifies task-agnostic VAAL.
Our proposed TA-VAAL outperforms state-of-the-art methods on various benchmark datasets for classification with balanced / imbalanced labels.
arXiv Detail & Related papers (2020-02-11T22:00:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.