DATA: Domain-Aware and Task-Aware Pre-training
- URL: http://arxiv.org/abs/2203.09041v1
- Date: Thu, 17 Mar 2022 02:38:49 GMT
- Title: DATA: Domain-Aware and Task-Aware Pre-training
- Authors: Qing Chang, Junran Peng, Lingxi Xie, Jiajun Sun, Haoran Yin, Qi Tian,
Zhaoxiang Zhang
- Abstract summary: We present DATA, a simple yet effective NAS approach specialized for self-supervised learning (SSL).
Our method achieves promising results across a wide range of computation costs on downstream tasks, including image classification, object detection and semantic segmentation.
- Score: 94.62676913928831
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The paradigm of training models on massive unlabeled data through
self-supervised learning (SSL) and fine-tuning on many downstream tasks has
become a recent trend. However, due to the high training costs and the lack of
awareness of downstream usage, most self-supervised learning methods cannot
accommodate the diversity of downstream scenarios, which span various data
domains, different vision tasks, and latency constraints on models. Neural
architecture search (NAS) is a widely acknowledged way to overcome these
issues, but applying NAS to SSL seems impossible, as no labels or metrics are
available for judging model selection. In this paper, we present DATA, a
simple yet effective NAS approach specialized for SSL that provides
Domain-Aware and Task-Aware pre-training. Specifically, we (i) train a
supernet, which can be viewed as a set of millions of networks covering a wide
range of model scales, without any labels, and (ii) propose a flexible
searching mechanism compatible with SSL that finds networks of different
computation costs for various downstream vision tasks and data domains without
any explicit metric. Instantiated with MoCo v2, our method achieves promising
results across a wide range of computation costs on downstream tasks,
including image classification, object detection, and semantic segmentation.
DATA is orthogonal to most existing SSL methods and endows them with the
ability to be customized to downstream needs. Extensive experiments on other
SSL methods demonstrate the generalizability of the proposed approach. Code
is released at https://github.com/GAIA-vision/GAIA-ssl
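The two-step recipe above (label-free supernet pretraining, then metric-free sub-network search) can be made concrete with a minimal, hypothetical PyTorch sketch. Everything below is illustrative rather than the GAIA-ssl API: SlimmableEncoder stands in for the real search space, the contrastive step follows the generic MoCo v2 formulation, and label_free_score is one plausible label-free proxy (feature agreement between a sub-network and the full supernet on unlabeled target-domain data); the paper's actual search criterion may differ.

    import copy
    import random
    import torch
    import torch.nn.functional as F

    class SlimmableEncoder(torch.nn.Module):
        """Toy width-slimmable supernet: one set of weights covers many
        sub-networks, selected by slicing each layer's output channels."""
        def __init__(self, in_dim=256, max_width=512, depth=4):
            super().__init__()
            self.max_width, self.depth = max_width, depth
            self.layers = torch.nn.ModuleList(
                torch.nn.Linear(in_dim if i == 0 else max_width, max_width)
                for i in range(depth))

        def forward(self, x, widths):
            for layer, w in zip(self.layers, widths):
                # Use only the first `w` output channels of this layer...
                x = F.relu(F.linear(x, layer.weight[:w], layer.bias[:w]))
                # ...then zero-pad so the next full-width layer still fits.
                x = F.pad(x, (0, self.max_width - w))
            return x

    def moco_style_step(student, teacher, queue, x_q, x_k, widths, tau=0.2):
        """One MoCo v2-style contrastive step on a randomly sampled
        sub-network; `queue` holds (K, max_width) momentum-encoded negatives."""
        q = F.normalize(student(x_q, widths), dim=1)
        with torch.no_grad():  # momentum encoder gets no gradients
            k = F.normalize(teacher(x_k, [teacher.max_width] * teacher.depth), dim=1)
        l_pos = (q * k).sum(dim=1, keepdim=True)   # (N, 1) positive logits
        l_neg = q @ queue.t()                      # (N, K) negative logits
        logits = torch.cat([l_pos, l_neg], dim=1) / tau
        labels = torch.zeros(len(q), dtype=torch.long, device=q.device)
        return F.cross_entropy(logits, labels)     # positives sit at index 0

    @torch.no_grad()
    def label_free_score(supernet, widths, target_batches):
        """Rank a candidate by the cosine agreement between its features and
        the full supernet's features on unlabeled target-domain batches."""
        full = [supernet.max_width] * supernet.depth
        sims = [(F.normalize(supernet(x, widths), dim=1)
                 * F.normalize(supernet(x, full), dim=1)).sum(dim=1).mean()
                for x in target_batches]
        return torch.stack(sims).mean().item()

    # Usage sketch: pretrain with random sub-networks, then score candidates.
    student = SlimmableEncoder()
    teacher = copy.deepcopy(student)               # momentum copy of the supernet
    queue = F.normalize(torch.randn(4096, 512), dim=1)
    x_q, x_k = torch.randn(8, 256), torch.randn(8, 256)
    widths = [random.choice([128, 256, 384, 512]) for _ in range(student.depth)]
    loss = moco_style_step(student, teacher, queue, x_q, x_k, widths)
    score = label_free_score(student, widths, [torch.randn(8, 256)])
    print(f"loss={loss.item():.3f}  search-score={score:.3f}")

In practice the encoder would be a convolutional or transformer backbone, and the search would sweep sampled candidates under a FLOPs or latency budget, keeping the best-scoring sub-network per budget and per target domain.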
Related papers
- A Closer Look at Benchmarking Self-Supervised Pre-training with Image Classification [51.35500308126506]
Self-supervised learning (SSL) is a machine learning approach where the data itself provides supervision, eliminating the need for external labels.
We study how classification-based evaluation protocols for SSL correlate and how well they predict downstream performance on different dataset types.
arXiv Detail & Related papers (2024-07-16T23:17:36Z)
- Self-supervised visual learning in the low-data regime: a comparative evaluation [40.27083924454058]
Self-Supervised Learning (SSL) is a robust training methodology for contemporary Deep Neural Networks (DNNs).
This work introduces a taxonomy of modern visual SSL methods, accompanied by detailed explanations and insights regarding the main categories of approaches.
For domain-specific downstream tasks, in-domain low-data SSL pretraining outperforms the common approach of large-scale pretraining.
arXiv Detail & Related papers (2024-04-26T07:23:14Z)
- On Pretraining Data Diversity for Self-Supervised Learning [57.91495006862553]
We explore the impact of training with more diverse datasets on the performance of self-supervised learning (SSL) under a fixed computational budget.
Our findings consistently demonstrate that increasing pretraining data diversity enhances SSL performance, albeit only when the distribution distance to the downstream data is minimal.
arXiv Detail & Related papers (2024-03-20T17:59:58Z)
- A Survey on Self-supervised Learning: Algorithms, Applications, and Future Trends [82.64268080902742]
Self-supervised learning (SSL) aims to learn discriminative features from unlabeled data without relying on human-annotated labels.
SSL has garnered significant attention recently, leading to the development of numerous related algorithms.
This paper presents a review of diverse SSL methods, encompassing algorithmic aspects, application domains, three key trends, and open research questions.
arXiv Detail & Related papers (2023-01-13T14:41:05Z)
- Effective Self-supervised Pre-training on Low-compute Networks without Distillation [6.530011859253459]
Reported performance of self-supervised learning has trailed behind standard supervised pre-training by a large margin.
Most prior works attribute this poor performance to the capacity bottleneck of the low-compute networks.
We take a closer look at the detrimental factors causing these practical limitations, and ask whether they are intrinsic to the self-supervised low-compute setting.
arXiv Detail & Related papers (2022-10-06T10:38:07Z)
- OpenLDN: Learning to Discover Novel Classes for Open-World Semi-Supervised Learning [110.40285771431687]
Semi-supervised learning (SSL) is one of the dominant approaches to address the annotation bottleneck of supervised learning.
Recent SSL methods can effectively leverage a large repository of unlabeled data to improve performance while relying on a small set of labeled data.
This work introduces OpenLDN that utilizes a pairwise similarity loss to discover novel classes.
arXiv Detail & Related papers (2022-07-05T18:51:05Z)
- Self-Supervised Learning of Graph Neural Networks: A Unified Review [50.71341657322391]
Self-supervised learning is emerging as a new paradigm for making use of large amounts of unlabeled samples.
We provide a unified review of different ways of training graph neural networks (GNNs) using SSL.
Our treatment of SSL methods for GNNs sheds light on the similarities and differences of various methods, setting the stage for developing new methods and algorithms.
arXiv Detail & Related papers (2021-02-22T03:43:45Z)
- Transfer Learning or Self-supervised Learning? A Tale of Two Pretraining Paradigms [36.04356511882304]
Self-supervised learning (SSL) has demonstrated promising results on a wide range of applications.
There is not yet a clear understanding of which properties of the data and tasks make one approach outperform the other.
arXiv Detail & Related papers (2020-06-19T05:21:00Z)