How Useful is Self-Supervised Pretraining for Visual Tasks?
- URL: http://arxiv.org/abs/2003.14323v1
- Date: Tue, 31 Mar 2020 16:03:22 GMT
- Title: How Useful is Self-Supervised Pretraining for Visual Tasks?
- Authors: Alejandro Newell, Jia Deng
- Abstract summary: We evaluate various self-supervised algorithms across a comprehensive array of synthetic datasets and downstream tasks.
Our experiments offer insights into how the utility of self-supervision changes as the number of available labels grows.
- Score: 133.1984299177874
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances have spurred incredible progress in self-supervised
pretraining for vision. We investigate what factors may play a role in the
utility of these pretraining methods for practitioners. To do this, we evaluate
various self-supervised algorithms across a comprehensive array of synthetic
datasets and downstream tasks. We prepare a suite of synthetic data that
enables an endless supply of annotated images as well as full control over
dataset difficulty. Our experiments offer insights into how the utility of
self-supervision changes as the number of available labels grows as well as how
the utility changes as a function of the downstream task and the properties of
the training data. We also find that linear evaluation does not correlate with
finetuning performance. Code and data are available at
\href{https://www.github.com/princeton-vl/selfstudy}{github.com/princeton-vl/selfstudy}.
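The abstract contrasts two standard evaluation protocols for pretrained encoders: linear evaluation (freeze the encoder, train only a linear head) and full finetuning (update encoder and head together). The following is a minimal self-contained sketch of that distinction on toy data, not the paper's actual experimental setup; the one-layer tanh "encoder", the synthetic labels, and all hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples, 10-dim inputs, binary labels (a stand-in for a
# real downstream task; not the paper's synthetic datasets).
X = rng.normal(size=(200, 10))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)

def encoder(X, W):
    """A one-layer 'pretrained' encoder; frozen under linear evaluation."""
    return np.tanh(X @ W)

def evaluate(X, y, W_enc, finetune, steps=300, lr=0.1):
    """Train a logistic-regression head on encoder features.

    finetune=False: linear evaluation (encoder weights stay frozen).
    finetune=True:  both the encoder and the head are updated.
    """
    W = W_enc.copy()
    w = np.zeros(W.shape[1])
    for _ in range(steps):
        H = encoder(X, W)                    # features
        p = 1.0 / (1.0 + np.exp(-(H @ w)))   # sigmoid head
        g = p - y                            # dLoss/dlogits
        w -= lr * H.T @ g / len(y)           # always update the head
        if finetune:                         # optionally update the encoder
            dH = np.outer(g, w) * (1 - H**2) # backprop through tanh
            W -= lr * X.T @ dH / len(y)
    H = encoder(X, W)
    p = 1.0 / (1.0 + np.exp(-(H @ w)))
    return np.mean((p > 0.5) == (y > 0.5))   # downstream accuracy

W_enc = rng.normal(size=(10, 8)) * 0.5       # "pretrained" weights
acc_linear = evaluate(X, y, W_enc, finetune=False)
acc_finetune = evaluate(X, y, W_enc, finetune=True)
```

The paper's point is that rankings of pretraining methods under the first protocol need not match rankings under the second, so `acc_linear` is not a reliable proxy for `acc_finetune`.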
Related papers
- An Experimental Comparison Of Multi-view Self-supervised Methods For Music Tagging [6.363158395541767]
Self-supervised learning has emerged as a powerful way to pre-train generalizable machine learning models on large amounts of unlabeled data.
In this study, we investigate and compare the performance of new self-supervised methods for music tagging.
arXiv Detail & Related papers (2024-04-14T07:56:08Z)
- Exploring Learning Complexity for Downstream Data Pruning [9.526877053855998]
We propose to treat the learning complexity (LC) as the scoring function for classification and regression tasks.
For the instruction fine-tuning of large language models, our method achieves state-of-the-art performance with stable convergence.
arXiv Detail & Related papers (2024-02-08T02:29:33Z)
- What Makes Pre-Trained Visual Representations Successful for Robust Manipulation? [57.92924256181857]
We find that visual representations designed for manipulation and control tasks do not necessarily generalize under subtle changes in lighting and scene texture.
We find that emergent segmentation ability is a strong predictor of out-of-distribution generalization among ViT models.
arXiv Detail & Related papers (2023-11-03T18:09:08Z)
- SPOT: Scalable 3D Pre-training via Occupancy Prediction for Learning Transferable 3D Representations [76.45009891152178]
The pretraining-finetuning approach can alleviate the labeling burden by fine-tuning a pre-trained backbone across various downstream datasets and tasks.
We show, for the first time, that general representation learning can be achieved through the task of occupancy prediction.
Our findings will facilitate the understanding of LiDAR points and pave the way for future advancements in LiDAR pre-training.
arXiv Detail & Related papers (2023-09-19T11:13:01Z)
- Adversarial Auto-Augment with Label Preservation: A Representation Learning Principle Guided Approach [95.74102207187545]
We show that a prior-free autonomous data augmentation's objective can be derived from a representation learning principle.
We then propose a practical surrogate to the objective that can be efficiently optimized and integrated seamlessly into existing methods.
arXiv Detail & Related papers (2022-11-02T02:02:51Z)
- Self-Supervised Visual Representation Learning Using Lightweight Architectures [0.0]
In self-supervised learning, a model is trained to solve a pretext task, using a data set whose annotations are created by a machine.
We critically examine the most notable pretext tasks to extract features from image data.
We study the performance of various self-supervised techniques keeping all other parameters uniform.
arXiv Detail & Related papers (2021-10-21T14:13:10Z)
- Online Continual Learning with Natural Distribution Shifts: An Empirical Study with Visual Data [101.6195176510611]
"Online" continual learning enables evaluating both information retention and online learning efficacy.
In online continual learning, each incoming small batch of data is first used for testing and then added to the training set, making the problem truly online.
We introduce a new benchmark for online continual visual learning that exhibits large scale and natural distribution shifts.
arXiv Detail & Related papers (2021-08-20T06:17:20Z)
- Improving Few-Shot Learning with Auxiliary Self-Supervised Pretext Tasks [0.0]
Recent work on few-shot learning shows that quality of learned representations plays an important role in few-shot classification performance.
On the other hand, the goal of self-supervised learning is to recover useful semantic information of the data without the use of class labels.
We exploit the complementarity of both paradigms via a multi-task framework where we leverage recent self-supervised methods as auxiliary tasks.
arXiv Detail & Related papers (2021-01-24T23:21:43Z)
- Interactive Weak Supervision: Learning Useful Heuristics for Data Labeling [19.24454872492008]
Weak supervision offers a promising alternative for producing labeled datasets without ground truth labels.
We develop the first framework for interactive weak supervision, in which a method proposes heuristics and learns from user feedback.
Our experiments demonstrate that only a small amount of feedback is needed to train models that achieve highly competitive test set performance.
arXiv Detail & Related papers (2020-12-11T00:10:38Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
To tackle this, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.