Don't Wait, Just Weight: Improving Unsupervised Representations by
Learning Goal-Driven Instance Weights
- URL: http://arxiv.org/abs/2006.12360v1
- Date: Mon, 22 Jun 2020 15:59:32 GMT
- Title: Don't Wait, Just Weight: Improving Unsupervised Representations by
Learning Goal-Driven Instance Weights
- Authors: Linus Ericsson, Henry Gouk and Timothy M. Hospedales
- Abstract summary: Self-supervised learning techniques can boost performance by learning useful representations from unlabelled data.
We show that by learning Bayesian instance weights for the unlabelled data, we can improve the downstream classification accuracy.
Our method, BetaDataWeighter, is evaluated using the popular self-supervised rotation prediction task on STL-10 and Visual Decathlon.
- Score: 92.16372657233394
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the absence of large labelled datasets, self-supervised learning
techniques can boost performance by learning useful representations from
unlabelled data, which is often more readily available. However, there is often
a domain shift between the unlabelled collection and the downstream target
problem data. We show that by learning Bayesian instance weights for the
unlabelled data, we can improve the downstream classification accuracy by
prioritising the most useful instances. Additionally, we show that the training
time can be reduced by discarding unnecessary datapoints. Our method,
BetaDataWeighter, is evaluated using the popular self-supervised rotation
prediction task on STL-10 and Visual Decathlon. We compare to related instance
weighting schemes, both hand-designed heuristics and meta-learning, as well as
conventional self-supervised learning. BetaDataWeighter achieves both the
highest average accuracy and rank across datasets, and on STL-10 it prunes up
to 78% of unlabelled images without significant loss in accuracy, corresponding
to over 50% reduction in training time.
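The abstract describes the method only at a high level, so the following is a hedged sketch of the core idea: a per-instance Beta-distributed weight, drawn with the reparameterisation trick, scales the self-supervised rotation-prediction loss of each unlabelled image and is learned jointly with the encoder. The network `RotationNet`, the parameters `alpha_raw`/`beta_raw` and the training step are illustrative assumptions, not the authors' implementation, and the sketch omits the goal-driven part of the objective that keeps the weights from collapsing to zero.

```python
# Illustrative sketch (not the authors' code): per-instance Beta-distributed
# weights on a rotation-prediction self-supervised loss, learned jointly with
# the encoder. All names and hyperparameters are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RotationNet(nn.Module):
    """Small encoder plus a 4-way head for 0/90/180/270 degree rotation prediction."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, 4)

    def forward(self, x):
        return self.head(self.encoder(x))

def rotate_batch(images):
    """Return the four rotated copies of each image and their rotation labels."""
    rotated = torch.cat([torch.rot90(images, k, dims=(2, 3)) for k in range(4)], dim=0)
    labels = torch.arange(4).repeat_interleave(images.size(0))
    return rotated, labels

n_unlabelled = 100_000  # one Beta(alpha_i, beta_i) weight per unlabelled image
alpha_raw = nn.Parameter(torch.zeros(n_unlabelled))
beta_raw = nn.Parameter(torch.zeros(n_unlabelled))

model = RotationNet()
optimizer = torch.optim.Adam(list(model.parameters()) + [alpha_raw, beta_raw], lr=1e-3)

def training_step(images, indices):
    """images: a batch of unlabelled images; indices: their positions in the dataset."""
    x, y = rotate_batch(images)
    per_rotation = F.cross_entropy(model(x), y, reduction="none")
    per_image = per_rotation.view(4, images.size(0)).mean(dim=0)
    # Sample a weight in (0, 1) for each image from its Beta distribution;
    # rsample() keeps the draw differentiable w.r.t. alpha_raw / beta_raw.
    weights = torch.distributions.Beta(
        F.softplus(alpha_raw[indices]) + 1e-4,
        F.softplus(beta_raw[indices]) + 1e-4,
    ).rsample()
    loss = (weights * per_image).mean()
    # NOTE: on its own this objective would drive all weights to zero; the
    # paper's goal-driven Bayesian objective (not shown) is what steers the
    # weights towards instances that help the downstream task.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this reading, instances whose expected weight stays near zero are the natural candidates for the pruning the abstract reports (up to 78% of STL-10 discarded without a significant accuracy loss).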
Related papers
- Incremental Self-training for Semi-supervised Learning [56.57057576885672]
Incremental Self-training (IST) is simple yet effective and fits existing self-training-based semi-supervised learning methods.
We verify the proposed IST on five datasets and two types of backbone, effectively improving the recognition accuracy and learning speed.
arXiv Detail & Related papers (2024-04-14T05:02:00Z)
- Boosting Semi-Supervised Learning by bridging high and low-confidence predictions [4.18804572788063]
Pseudo-labeling is a crucial technique in semi-supervised learning (SSL).
We propose a new method called ReFixMatch, which aims to utilize all of the unlabeled data during training; a sketch of the confidence-thresholded pseudo-labeling baseline it builds on follows this entry.
arXiv Detail & Related papers (2023-08-15T00:27:18Z)
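For context on the entry above, this is a minimal sketch of the confidence-thresholded pseudo-labeling used by FixMatch-style methods, the baseline ReFixMatch aims to extend; the function name and threshold are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of confidence-thresholded pseudo-labeling (FixMatch-style
# baseline); not the ReFixMatch implementation.
import torch
import torch.nn.functional as F

def pseudo_label_loss(model, weak_views, strong_views, threshold=0.95):
    """weak_views / strong_views: two augmentations of the same unlabeled batch."""
    with torch.no_grad():
        probs = torch.softmax(model(weak_views), dim=1)
        confidence, pseudo_labels = probs.max(dim=1)   # hard pseudo-labels
        mask = (confidence >= threshold).float()       # keep only confident ones
    logits = model(strong_views)
    per_example = F.cross_entropy(logits, pseudo_labels, reduction="none")
    # Low-confidence predictions are simply masked out here; recovering a
    # training signal from them is the gap ReFixMatch targets.
    return (mask * per_example).mean()
```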
- Debiased Pseudo Labeling in Self-Training [77.83549261035277]
Deep neural networks achieve remarkable performance on a wide range of tasks with the aid of large-scale labeled datasets.
To mitigate the requirement for labeled data, self-training is widely used in both academia and industry by pseudo labeling on readily available unlabeled data.
We propose Debiased, in which the generation and utilization of pseudo labels are decoupled by two independent heads (illustrated in the sketch after this entry).
arXiv Detail & Related papers (2022-02-15T02:14:33Z)
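A rough illustration of the decoupling described in the Debiased entry above: one head is trained only on clean labels and generates pseudo labels, while a second head consumes them, so pseudo-label noise cannot feed back into the head that produced it. All names are hypothetical; this is not the authors' code.

```python
# Illustrative two-head decoupling of pseudo-label generation and utilization;
# hypothetical names, not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoHeadClassifier(nn.Module):
    """Shared backbone with separate heads for generating and using pseudo labels."""
    def __init__(self, backbone, feature_dim, num_classes):
        super().__init__()
        self.backbone = backbone
        self.head_generate = nn.Linear(feature_dim, num_classes)  # makes pseudo labels
        self.head_utilize = nn.Linear(feature_dim, num_classes)   # is trained on them

    def forward(self, x):
        features = self.backbone(x)
        return self.head_generate(features), self.head_utilize(features)

def semi_supervised_loss(model, x_labeled, y_labeled, x_unlabeled):
    gen_l, use_l = model(x_labeled)
    gen_u, use_u = model(x_unlabeled)
    # The generating head only ever sees clean labels ...
    loss_generate = F.cross_entropy(gen_l, y_labeled)
    # ... and its detached predictions supervise the other head, so errors in
    # the pseudo labels do not bias the head that generates them.
    pseudo = gen_u.detach().argmax(dim=1)
    loss_utilize = F.cross_entropy(use_l, y_labeled) + F.cross_entropy(use_u, pseudo)
    return loss_generate + loss_utilize
```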
- Investigating a Baseline Of Self Supervised Learning Towards Reducing Labeling Costs For Image Classification [0.0]
The study uses the kaggle.com cats-vs-dogs dataset, MNIST and Fashion-MNIST to investigate the self-supervised learning task.
Results show that the pretext process in self-supervised learning improves accuracy by around 15% in the downstream classification task.
arXiv Detail & Related papers (2021-08-17T06:43:05Z)
- Semi-Supervised Learning for Sparsely-Labeled Sequential Data: Application to Healthcare Video Processing [0.8312466807725921]
We propose a semi-supervised machine learning training strategy to improve event detection performance on sequential data.
Our method uses noisy guesses of the events' end times to train event detection models.
We show that our strategy outperforms conservative estimates by 12 points of mean average precision for MNIST, and 3.5 points for CIFAR.
arXiv Detail & Related papers (2020-11-28T09:54:44Z)
- Uncertainty-aware Self-training for Text Classification with Few Labels [54.13279574908808]
We study self-training as one of the earliest semi-supervised learning approaches to reduce the annotation bottleneck.
We propose an approach to improve self-training by incorporating uncertainty estimates of the underlying neural network (see the sketch after this entry).
We show that our methods, leveraging only 20-30 labeled samples per class per task for training and validation, can perform within 3% of fully supervised pre-trained language models.
arXiv Detail & Related papers (2020-06-27T08:13:58Z)
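The uncertainty-aware self-training entry above does not spell out how the uncertainty is obtained; a common choice is Monte Carlo dropout, so the sketch below selects pseudo-labeled examples by low predictive entropy. It is an assumption-laden illustration, not the paper's procedure, and all names are hypothetical.

```python
# Hedged sketch: picking pseudo-labeled examples by Monte Carlo dropout
# uncertainty, one common way to make self-training "uncertainty-aware".
import torch

def mc_dropout_predict(model, inputs, n_passes=10):
    """Keep dropout active and average several stochastic forward passes."""
    model.train()  # train() mode keeps dropout layers switched on
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(inputs), dim=1) for _ in range(n_passes)]
        ).mean(dim=0)
    # Predictive entropy as a simple uncertainty score (lower = more certain).
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1)
    return probs.argmax(dim=1), entropy

def select_confident_pseudo_labels(model, x_unlabeled, keep_fraction=0.3):
    """Keep only the lowest-uncertainty fraction for the next self-training round."""
    pseudo_labels, entropy = mc_dropout_predict(model, x_unlabeled)
    k = max(1, int(keep_fraction * x_unlabeled.size(0)))
    keep = torch.argsort(entropy)[:k]
    return x_unlabeled[keep], pseudo_labels[keep]
```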
- Big Self-Supervised Models are Strong Semi-Supervised Learners [116.00752519907725]
We show that this approach is surprisingly effective for semi-supervised learning on ImageNet.
A key ingredient of our approach is the use of big (deep and wide) networks during pretraining and fine-tuning.
We find that the fewer the labels, the more this approach (task-agnostic use of unlabeled data) benefits from a bigger network.
arXiv Detail & Related papers (2020-06-17T17:48:22Z) - Improving Semantic Segmentation via Self-Training [75.07114899941095]
We show that we can obtain state-of-the-art results using a semi-supervised approach, specifically a self-training paradigm.
We first train a teacher model on labeled data, and then generate pseudo labels on a large set of unlabeled data.
Our robust training framework can digest human-annotated and pseudo labels jointly and achieve top performance on the Cityscapes, CamVid and KITTI datasets (a sketch of this teacher-student pipeline follows this entry).
arXiv Detail & Related papers (2020-04-30T17:09:17Z)
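A minimal sketch of the teacher-student pipeline the segmentation entry above describes: train a teacher on labeled data, produce pseudo masks for unlabeled images, then train on both jointly with a per-pixel loss. Names, loaders and the loss choice are illustrative assumptions, not the authors' implementation.

```python
# Minimal teacher -> pseudo-label -> joint-training sketch for segmentation;
# hypothetical names, not the paper's code.
import torch
import torch.nn.functional as F

def generate_pseudo_masks(teacher, unlabeled_loader, device="cpu"):
    """Run a trained teacher over unlabeled images and keep its argmax masks."""
    teacher.eval()
    pseudo = []
    with torch.no_grad():
        for images in unlabeled_loader:
            logits = teacher(images.to(device))          # (B, C, H, W)
            pseudo.append((images, logits.argmax(dim=1).cpu()))
    return pseudo

def joint_training_step(student, optimizer, labeled_batch, pseudo_batch, device="cpu"):
    """One update that digests human-annotated and pseudo-labeled pixels together."""
    (x_l, y_l), (x_p, y_p) = labeled_batch, pseudo_batch
    images = torch.cat([x_l, x_p]).to(device)
    targets = torch.cat([y_l, y_p]).to(device)
    loss = F.cross_entropy(student(images), targets)     # per-pixel cross-entropy
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```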