How Useful is Self-Supervised Pretraining for Visual Tasks?
- URL: http://arxiv.org/abs/2003.14323v1
- Date: Tue, 31 Mar 2020 16:03:22 GMT
- Title: How Useful is Self-Supervised Pretraining for Visual Tasks?
- Authors: Alejandro Newell, Jia Deng
- Abstract summary: We evaluate various self-supervised algorithms across a comprehensive array of synthetic datasets and downstream tasks.
Our experiments offer insights into how the utility of self-supervision changes as the number of available labels grows.
- Score: 133.1984299177874
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances have spurred incredible progress in self-supervised
pretraining for vision. We investigate what factors may play a role in the
utility of these pretraining methods for practitioners. To do this, we evaluate
various self-supervised algorithms across a comprehensive array of synthetic
datasets and downstream tasks. We prepare a suite of synthetic data that
enables an endless supply of annotated images as well as full control over
dataset difficulty. Our experiments offer insights into how the utility of
self-supervision changes as the number of available labels grows as well as how
the utility changes as a function of the downstream task and the properties of
the training data. We also find that linear evaluation does not correlate with
finetuning performance. Code and data are available at
\href{https://www.github.com/princeton-vl/selfstudy}{github.com/princeton-vl/selfstudy}.
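The abstract's final point contrasts two standard evaluation protocols. Below is a minimal sketch of the two, assuming a generic PyTorch backbone that maps images to feature vectors; the function and argument names are illustrative placeholders, not the authors' released selfstudy code.

```python
# Sketch of the two evaluation protocols compared in the paper.
# `backbone`, `feat_dim`, `num_classes`, and `train_loader` are hypothetical
# placeholders, not part of the released codebase.
import torch
import torch.nn as nn

def evaluate_protocol(backbone, feat_dim, num_classes, train_loader, finetune=False):
    head = nn.Linear(feat_dim, num_classes)
    if finetune:
        # Finetuning: update the pretrained backbone and the head jointly.
        params = list(backbone.parameters()) + list(head.parameters())
    else:
        # Linear evaluation: freeze the pretrained backbone, train only the head.
        for p in backbone.parameters():
            p.requires_grad = False
        params = list(head.parameters())
    optimizer = torch.optim.SGD(params, lr=0.01, momentum=0.9)
    criterion = nn.CrossEntropyLoss()
    backbone.train(finetune)
    for images, labels in train_loader:
        feats = backbone(images) if finetune else backbone(images).detach()
        loss = criterion(head(feats), labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return backbone, head
```

The paper's observation is that rankings of pretraining methods under the frozen-backbone (linear) protocol need not match their rankings after full finetuning.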
Related papers
- Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration [54.8229698058649]
We study how unlabeled prior trajectory data can be leveraged to learn efficient exploration strategies.
Our method SUPE (Skills from Unlabeled Prior data for Exploration) demonstrates that a careful combination of these ideas compounds their benefits.
We empirically show that SUPE reliably outperforms prior strategies, successfully solving a suite of long-horizon, sparse-reward tasks.
arXiv Detail & Related papers (2024-10-23T17:58:45Z)
- An Experimental Comparison Of Multi-view Self-supervised Methods For Music Tagging [6.363158395541767]
Self-supervised learning has emerged as a powerful way to pre-train generalizable machine learning models on large amounts of unlabeled data.
In this study, we investigate and compare the performance of new self-supervised methods for music tagging.
arXiv Detail & Related papers (2024-04-14T07:56:08Z)
- What Makes Pre-Trained Visual Representations Successful for Robust Manipulation? [57.92924256181857]
We find that visual representations designed for manipulation and control tasks do not necessarily generalize under subtle changes in lighting and scene texture.
We find that emergent segmentation ability is a strong predictor of out-of-distribution generalization among ViT models.
arXiv Detail & Related papers (2023-11-03T18:09:08Z)
- Adversarial Auto-Augment with Label Preservation: A Representation Learning Principle Guided Approach [95.74102207187545]
We show that a prior-free autonomous data augmentation's objective can be derived from a representation learning principle.
We then propose a practical surrogate to the objective that can be efficiently optimized and integrated seamlessly into existing methods.
arXiv Detail & Related papers (2022-11-02T02:02:51Z)
- Where Should I Spend My FLOPS? Efficiency Evaluations of Visual Pre-training Methods [29.141145775835106]
Given a fixed FLOP budget, what are the best datasets, models, and (self-supervised) training methods for obtaining high accuracy on representative visual tasks?
We examine five large-scale datasets (JFT-300M, ALIGN, ImageNet-1K, ImageNet-21K, and COCO) and six pre-training methods (CLIP, DINO, SimCLR, BYOL, Masked Autoencoding, and supervised).
Our results call into question the commonly-held assumption that self-supervised methods inherently scale to large, uncurated data.
arXiv Detail & Related papers (2022-09-30T17:04:55Z)
- Self-Supervised Visual Representation Learning Using Lightweight Architectures [0.0]
In self-supervised learning, a model is trained to solve a pretext task, using a data set whose annotations are created by a machine.
We critically examine the most notable pretext tasks to extract features from image data.
We study the performance of various self-supervised techniques keeping all other parameters uniform.
arXiv Detail & Related papers (2021-10-21T14:13:10Z)
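The entry above describes pretext tasks whose annotations are created by a machine. As a concrete illustration, here is a hedged sketch of rotation prediction, one commonly studied pretext task; it is generic and not tied to any specific paper's implementation, and `encoder` and `rotation_head` are hypothetical modules.

```python
# Sketch of a rotation-prediction pretext task: the label is the rotation the
# machine applied to the image, so no human annotation is needed.
import torch
import torch.nn as nn

def make_rotation_batch(images):
    """Rotate each (C, H, W) image by 0/90/180/270 degrees; the rotation index is the label."""
    rotations = torch.randint(0, 4, (images.size(0),))
    rotated = torch.stack([torch.rot90(img, k=int(k), dims=(1, 2))
                           for img, k in zip(images, rotations)])
    return rotated, rotations

def pretext_step(encoder, rotation_head, images, optimizer):
    rotated, labels = make_rotation_batch(images)
    logits = rotation_head(encoder(rotated))  # encoder is assumed to return (N, feat_dim)
    loss = nn.functional.cross_entropy(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```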
"Online" continual learning enables evaluating both information retention and online learning efficacy.
In online continual learning, each incoming small batch of data is first used for testing and then added to the training set, making the problem truly online.
We introduce a new benchmark for online continual visual learning that exhibits large scale and natural distribution shifts.
arXiv Detail & Related papers (2021-08-20T06:17:20Z)
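The test-then-train protocol described in the entry above can be made concrete with a short sketch; `model`, `evaluate`, and `train_step` are hypothetical placeholders for whatever learner and metrics are used.

```python
# Sketch of online continual learning: each incoming batch is evaluated first,
# and only then added to the training data. Names are illustrative only.
def online_continual_loop(model, stream, evaluate, train_step):
    seen_batches = []
    online_accuracy = []
    for batch in stream:                                 # batches arrive under distribution shift
        online_accuracy.append(evaluate(model, batch))   # test on the batch before training on it
        seen_batches.append(batch)                       # then add it to the training set
        train_step(model, batch)                         # update the model (optionally replaying seen_batches)
    return online_accuracy
```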
- Improving Few-Shot Learning with Auxiliary Self-Supervised Pretext Tasks [0.0]
Recent work on few-shot learning shows that the quality of learned representations plays an important role in few-shot classification performance.
On the other hand, the goal of self-supervised learning is to recover useful semantic information of the data without the use of class labels.
We exploit the complementarity of both paradigms via a multi-task framework where we leverage recent self-supervised methods as auxiliary tasks.
arXiv Detail & Related papers (2021-01-24T23:21:43Z)
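The multi-task framework described in the entry above amounts to adding an auxiliary self-supervised loss to the supervised few-shot objective. The sketch below assumes generic `supervised_loss` and `ssl_loss` callables and an illustrative weight `alpha`; none of these names or values come from the paper.

```python
# Sketch of a multi-task objective: supervised few-shot loss plus an auxiliary
# self-supervised loss computed on the same batch.
def multitask_step(encoder, classifier, ssl_head, batch, optimizer,
                   supervised_loss, ssl_loss, alpha=0.5):
    images, labels = batch
    feats = encoder(images)
    loss_sup = supervised_loss(classifier(feats), labels)  # few-shot classification loss
    loss_ssl = ssl_loss(ssl_head, encoder, images)          # e.g. rotation or contrastive auxiliary task
    loss = loss_sup + alpha * loss_ssl
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```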
- Interactive Weak Supervision: Learning Useful Heuristics for Data Labeling [19.24454872492008]
Weak supervision offers a promising alternative for producing labeled datasets without ground truth labels.
We develop the first framework for interactive weak supervision, in which a method proposes heuristics and learns from user feedback.
Our experiments demonstrate that only a small amount of user feedback is needed to train models that achieve highly competitive test set performance.
arXiv Detail & Related papers (2020-12-11T00:10:38Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
Because training on such an enlarged dataset is costly, we further propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)