Rethinking Self-Supervised Learning: Small is Beautiful
- URL: http://arxiv.org/abs/2103.13559v1
- Date: Thu, 25 Mar 2021 01:48:52 GMT
- Title: Rethinking Self-Supervised Learning: Small is Beautiful
- Authors: Yun-Hao Cao and Jianxin Wu
- Abstract summary: We propose scaled-down self-supervised learning (S3L), which includes three parts: small resolution, small architecture, and small data.
On a diverse set of datasets, S3L consistently achieves higher accuracy with much less training cost than the previous SSL learning paradigm.
- Score: 30.809693803413445
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised learning (SSL), in particular contrastive learning, has made
great progress in recent years. However, a common theme in these methods is
that they inherit the learning paradigm from the supervised deep learning
scenario. Current SSL methods are often pretrained for many epochs on
large-scale datasets using high resolution images, which brings heavy
computational cost and lacks flexibility. In this paper, we demonstrate that
the learning paradigm for SSL should be different from that of supervised
learning, and that the information encoded by the contrastive loss is expected
to be much less than that encoded in the labels of supervised learning via the
cross-entropy loss. Hence, we propose scaled-down self-supervised learning
(S3L), which includes three parts: small resolution, small architecture, and
small data. On a diverse set of datasets, SSL methods, and backbone
architectures, S3L consistently achieves higher accuracy with much less
training cost than the previous SSL learning paradigm. Furthermore, we show
that even without a large pretraining dataset, S3L can achieve impressive
results on small data alone. Our code has been made publicly available at
https://github.com/CupidJay/Scaled-down-self-supervised-learning.
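As a rough illustration of the scaled-down recipe, the sketch below runs one
SimCLR-style contrastive training step with a reduced input resolution, a small
ResNet-18 backbone, and a tiny placeholder batch standing in for a small
unlabeled subset. It is not the authors' released code (linked above); the
specific resolution, backbone, loss, and hyperparameters are illustrative
assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models, transforms

SMALL_RES = 112  # assumption: half of the usual 224x224 pretraining resolution

# "Small architecture": ResNet-18 instead of a wider or deeper backbone.
backbone = models.resnet18(weights=None)
feat_dim = backbone.fc.in_features
backbone.fc = nn.Identity()                       # keep features, drop the classifier
projector = nn.Linear(feat_dim, 128)              # small projection head

# "Small resolution": crops are taken at SMALL_RES instead of 224.
augment = transforms.Compose([
    transforms.RandomResizedCrop(SMALL_RES),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(0.4, 0.4, 0.4, 0.1),
])

def nt_xent(z1, z2, tau=0.2):
    """Standard NT-Xent (SimCLR) contrastive loss over two augmented views."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)
    sim = z @ z.t() / tau
    sim.fill_diagonal_(float("-inf"))             # mask self-similarity
    n = z1.size(0)
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

optimizer = torch.optim.SGD(
    list(backbone.parameters()) + list(projector.parameters()),
    lr=0.05, momentum=0.9, weight_decay=1e-4,
)

# "Small data": a random placeholder batch stands in for a small unlabeled subset.
images = torch.rand(32, 3, 160, 160)
view1 = torch.stack([augment(img) for img in images])
view2 = torch.stack([augment(img) for img in images])

z1 = projector(backbone(view1))
z2 = projector(backbone(view2))
loss = nt_xent(z1, z2)
loss.backward()
optimizer.step()
print(f"contrastive loss: {loss.item():.4f}")

In this sketch, lowering SMALL_RES, choosing a lighter backbone, and shrinking
the data are the only changes relative to a standard contrastive pipeline,
which is the sense in which the recipe reduces training cost.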
Related papers
- A Closer Look at Benchmarking Self-Supervised Pre-training with Image Classification [51.35500308126506]
Self-supervised learning (SSL) is a machine learning approach where the data itself provides supervision, eliminating the need for external labels.
We study how classification-based evaluation protocols for SSL correlate and how well they predict downstream performance on different dataset types.
arXiv Detail & Related papers (2024-07-16T23:17:36Z)
- Visual Self-supervised Learning Scheme for Dense Prediction Tasks on X-ray Images [3.782392436834913]
Self-supervised learning (SSL) has led to considerable progress in natural language processing (NLP).
The incorporation of contrastive learning into existing visual SSL models has likewise led to considerable progress, often surpassing supervised counterparts.
Here, we focus on dense prediction tasks, using security inspection X-ray images to evaluate our proposed model, Segment Localization (SegLoc).
Based upon the Instance localization (InsLoc) model, SegLoc addresses one of the key challenges of contrastive learning, i.e., false negative pairs of query embeddings.
arXiv Detail & Related papers (2023-10-12T15:42:17Z)
- Data-Efficient Contrastive Self-supervised Learning: Most Beneficial Examples for Supervised Learning Contribute the Least [14.516008359896421]
Self-supervised learning (SSL) learns high-quality representations from large pools of unlabeled training data.
As datasets grow larger, it becomes crucial to identify the examples that contribute the most to learning such representations.
We prove that examples that contribute the most to contrastive SSL are those that have the most similar augmentations to other examples.
arXiv Detail & Related papers (2023-02-18T00:15:06Z)
- Evaluating Self-Supervised Learning via Risk Decomposition [100.73914689472507]
Self-supervised learning (SSL) pipelines differ in many design choices such as the architecture, augmentations, or pretraining data.
Comparing them with a single downstream metric does not provide much insight into why or when a model is better, nor how to improve it.
We propose an SSL risk decomposition, which generalizes the classical supervised approximation-estimation decomposition.
arXiv Detail & Related papers (2023-02-06T19:09:00Z)
- A Survey on Self-supervised Learning: Algorithms, Applications, and Future Trends [82.64268080902742]
Self-supervised learning (SSL) aims to learn discriminative features from unlabeled data without relying on human-annotated labels.
SSL has garnered significant attention recently, leading to the development of numerous related algorithms.
This paper presents a review of diverse SSL methods, encompassing algorithmic aspects, application domains, three key trends, and open research questions.
arXiv Detail & Related papers (2023-01-13T14:41:05Z)
- DoubleMatch: Improving Semi-Supervised Learning with Self-Supervision [16.757456364034798]
Semi-supervised learning (SSL) is becoming increasingly popular.
We propose a new SSL algorithm, DoubleMatch, which combines the pseudo-labeling technique with a self-supervised loss (a rough sketch of this combination appears after this list).
We show that this method achieves state-of-the-art accuracies on multiple benchmark datasets while also reducing training times compared to existing SSL methods.
arXiv Detail & Related papers (2022-05-11T15:43:48Z)
- DATA: Domain-Aware and Task-Aware Pre-training [94.62676913928831]
We present DATA, a simple yet effective neural architecture search (NAS) approach specialized for self-supervised learning (SSL).
Our method achieves promising results across a wide range of computation costs on downstream tasks, including image classification, object detection and semantic segmentation.
arXiv Detail & Related papers (2022-03-17T02:38:49Z)
- Self-supervised Learning is More Robust to Dataset Imbalance [65.84339596595383]
We investigate self-supervised learning under dataset imbalance.
Off-the-shelf self-supervised representations are already more robust to class imbalance than supervised representations.
We devise a re-weighted regularization technique that consistently improves the SSL representation quality on imbalanced datasets.
arXiv Detail & Related papers (2021-10-11T06:29:56Z)
- Solo-learn: A Library of Self-supervised Methods for Visual Representation Learning [83.02597612195966]
solo-learn is a library of self-supervised methods for visual representation learning.
Implemented in Python using PyTorch and PyTorch Lightning, the library fits both research and industry needs.
arXiv Detail & Related papers (2021-08-03T22:19:55Z)
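For the DoubleMatch entry above, the sketch below is a rough, hypothetical
illustration of combining FixMatch-style pseudo-labeling with a self-supervised
feature-consistency loss. The network, confidence threshold, loss weights, and
the cosine consistency term are assumptions chosen for illustration, not the
paper's implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyNet(nn.Module):
    """Tiny encoder + classifier used only for illustration."""
    def __init__(self, num_classes=10, feat_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Flatten(), nn.Linear(3 * 32 * 32, feat_dim), nn.ReLU()
        )
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        feats = self.encoder(x)
        return self.classifier(feats), feats

model = TinyNet()
tau, w_pseudo, w_self = 0.95, 1.0, 0.5   # assumed confidence threshold and loss weights

# Dummy batches stand in for labeled data and weak/strong views of unlabeled data.
x_lab, y_lab = torch.rand(8, 3, 32, 32), torch.randint(0, 10, (8,))
x_weak, x_strong = torch.rand(32, 3, 32, 32), torch.rand(32, 3, 32, 32)

# Supervised cross-entropy on the labeled batch.
logits_lab, _ = model(x_lab)
sup_loss = F.cross_entropy(logits_lab, y_lab)

# Pseudo-labeling: predictions on weak views become targets for strong views,
# kept only where the model is confident.
with torch.no_grad():
    logits_weak, feats_weak = model(x_weak)
    conf, pseudo_y = logits_weak.softmax(dim=1).max(dim=1)
    mask = (conf >= tau).float()

logits_strong, feats_strong = model(x_strong)
pseudo_loss = (F.cross_entropy(logits_strong, pseudo_y, reduction="none") * mask).mean()

# Self-supervised term: feature consistency between weak and strong views,
# applied to all unlabeled examples regardless of confidence.
self_loss = 1 - F.cosine_similarity(feats_strong, feats_weak, dim=1).mean()

loss = sup_loss + w_pseudo * pseudo_loss + w_self * self_loss
loss.backward()
print(f"total loss: {loss.item():.4f}")

The point of the combination in this sketch is that the self-supervised term
provides a training signal on every unlabeled example, including those whose
pseudo-labels fall below the confidence threshold.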