One Network Doesn't Rule Them All: Moving Beyond Handcrafted
Architectures in Self-Supervised Learning
- URL: http://arxiv.org/abs/2203.08130v1
- Date: Tue, 15 Mar 2022 17:54:57 GMT
- Title: One Network Doesn't Rule Them All: Moving Beyond Handcrafted
Architectures in Self-Supervised Learning
- Authors: Sharath Girish, Debadeepta Dey, Neel Joshi, Vibhav Vineet, Shital
Shah, Caio Cesar Teodoro Mendes, Abhinav Shrivastava, Yale Song
- Abstract summary: We show that a network architecture plays a significant role in self-supervised learning (SSL).
We conduct a study with over 100 variants of ResNet and MobileNet architectures and evaluate them across 11 downstream scenarios in the SSL setting.
We show that "self-supervised architectures" outperform popular handcrafted architectures while performing competitively with the larger and computationally heavy ResNet50.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The current literature on self-supervised learning (SSL) focuses on
developing learning objectives to train neural networks more effectively on
unlabeled data. The typical development process involves taking
well-established architectures, e.g., ResNet demonstrated on ImageNet, and
using them to evaluate newly developed objectives on downstream scenarios.
While convenient, this does not take into account the role of architectures,
which has been shown to be crucial in the supervised learning literature. In
this work, we establish extensive empirical evidence showing that a network
architecture plays a significant role in SSL. We conduct a large-scale study
with over 100 variants of ResNet and MobileNet architectures and evaluate them
across 11 downstream scenarios in the SSL setting. We show that there is no one
network that performs consistently well across the scenarios. Based on this, we
propose to learn not only network weights but also architecture topologies in
the SSL regime. We show that "self-supervised architectures" outperform popular
handcrafted architectures (ResNet18 and MobileNetV2) while performing
competitively with the larger and computationally heavy ResNet50 on major image
classification benchmarks (ImageNet-1K, iNat2021, and more). Our results
suggest that it is time to consider moving beyond handcrafted architectures in
SSL and start thinking about incorporating architecture search into
self-supervised learning objectives.
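As a purely illustrative aid, the sketch below shows one way architecture search could be coupled with a self-supervised objective, assuming a DARTS-style continuous relaxation over candidate operations and a SimSiam-style loss with stop-gradient. The exact formulation, operation set, and names used here (MixedOp, SearchableEncoder, etc.) are hypothetical and are not taken from the paper.

```python
# Hypothetical sketch: DARTS-style mixed operations whose architecture weights are
# learned jointly with a SimSiam-style self-supervised loss. Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """Weighted sum over candidate operations; the softmax weights over `alpha`
    are the architecture parameters being searched."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, 3, padding=1),                      # 3x3 conv
            nn.Conv2d(in_ch, out_ch, 5, padding=2),                      # 5x5 conv
            nn.Sequential(nn.MaxPool2d(3, 1, 1),
                          nn.Conv2d(in_ch, out_ch, 1)),                  # pool + 1x1 conv
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))            # arch params

    def forward(self, x):
        w = torch.softmax(self.alpha, dim=0)
        return sum(wi * op(x) for wi, op in zip(w, self.ops))

class SearchableEncoder(nn.Module):
    """Tiny searchable backbone plus SimSiam-style projector/predictor heads."""
    def __init__(self, depth=4, width=64, feat_dim=128):
        super().__init__()
        blocks, in_ch = [], 3
        for _ in range(depth):
            blocks += [MixedOp(in_ch, width), nn.BatchNorm2d(width), nn.ReLU()]
            in_ch = width
        self.backbone = nn.Sequential(*blocks, nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.projector = nn.Linear(width, feat_dim)
        self.predictor = nn.Linear(feat_dim, feat_dim)

    def forward(self, x):
        z = self.projector(self.backbone(x))
        return z, self.predictor(z)

def simsiam_loss(p, z):
    # Negative cosine similarity with a stop-gradient on the target branch.
    return -F.cosine_similarity(p, z.detach(), dim=-1).mean()

# Two augmented views of the same batch drive both the network weights and the
# architecture parameters (updated together here purely for brevity).
model = SearchableEncoder()
opt = torch.optim.SGD(model.parameters(), lr=0.05)
view1, view2 = torch.randn(8, 3, 32, 32), torch.randn(8, 3, 32, 32)
z1, p1 = model(view1)
z2, p2 = model(view2)
loss = 0.5 * (simsiam_loss(p1, z2) + simsiam_loss(p2, z1))
opt.zero_grad()
loss.backward()
opt.step()
```

In practice the architecture parameters (alpha) and the network weights are usually updated on separate data splits in a bilevel scheme, rather than jointly as in this toy loop.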
Related papers
- NASiam: Efficient Representation Learning using Neural Architecture
Search for Siamese Networks [76.8112416450677]
Siamese networks are among the most popular methods for self-supervised visual representation learning (SSL).
NASiam is a novel approach that, for the first time, uses differentiable NAS to improve the multilayer perceptron projector and predictor (encoder/predictor pair) of Siamese networks.
NASiam reaches competitive performance on both small-scale (i.e., CIFAR-10/CIFAR-100) and large-scale (i.e., ImageNet) image classification datasets while costing only a few GPU hours.
arXiv Detail & Related papers (2023-01-31T19:48:37Z)
- Hybrid BYOL-ViT: Efficient approach to deal with small Datasets [0.0]
In this paper, we investigate how self-supervision with strong and sufficient augmentation of unlabeled data can effectively train the first layers of a neural network.
We show that the low-level features derived from a self-supervised architecture can improve the robustness and the overall performance of this emergent architecture.
arXiv Detail & Related papers (2021-11-08T21:44:31Z)
- Self-Denoising Neural Networks for Few Shot Learning [66.38505903102373]
We present a new training scheme that adds noise at multiple stages of an existing neural architecture while simultaneously learning to be robust to this added noise.
This architecture, which we call a Self-Denoising Neural Network (SDNN), can be applied easily to most modern convolutional neural architectures.
arXiv Detail & Related papers (2021-10-26T03:28:36Z)
- Efficient Neural Architecture Search with Performance Prediction [0.0]
We use neural architecture search (NAS) to find the best network architecture for the task at hand.
Existing NAS algorithms generally evaluate the fitness of a new architecture by fully training it from scratch.
An end-to-end offline performance predictor is proposed to accelerate the evaluation of sampled architectures (a minimal sketch of this idea appears after this list).
arXiv Detail & Related papers (2021-08-04T05:44:16Z)
- Pretraining Neural Architecture Search Controllers with Locality-based Self-Supervised Learning [0.0]
We propose a pretraining scheme that can be applied to controller-based NAS.
Our method, a locality-based self-supervised classification task, leverages the structural similarity of network architectures to obtain good architecture representations.
arXiv Detail & Related papers (2021-03-15T06:30:36Z)
- D2RL: Deep Dense Architectures in Reinforcement Learning [47.67475810050311]
We take inspiration from successful architectural choices in computer vision and generative modelling.
We investigate the use of deeper networks and dense connections for reinforcement learning on a variety of simulated robotic learning benchmark environments.
arXiv Detail & Related papers (2020-10-19T01:27:07Z)
- A Semi-Supervised Assessor of Neural Architectures [157.76189339451565]
We employ an auto-encoder to discover meaningful representations of neural architectures.
A graph convolutional neural network is introduced to predict the performance of architectures.
arXiv Detail & Related papers (2020-05-14T09:02:33Z)
- Stage-Wise Neural Architecture Search [65.03109178056937]
Modern convolutional networks such as ResNet and NASNet have achieved state-of-the-art results in many computer vision applications.
These networks consist of stages, which are sets of layers that operate on representations in the same resolution.
It has been demonstrated that increasing the number of layers in each stage improves the prediction ability of the network.
However, the resulting architecture becomes computationally expensive in terms of floating point operations, memory requirements and inference time.
arXiv Detail & Related papers (2020-04-23T14:16:39Z)
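Several entries above, notably "Efficient Neural Architecture Search with Performance Prediction" and "A Semi-Supervised Assessor of Neural Architectures", rely on scoring candidate architectures without fully training them. The sketch below illustrates that general idea only, under the assumption that an architecture can be summarized by a few hyperparameters and that a small regressor is fit offline to previously measured accuracies; it is not the predictor proposed in either paper, and all names are hypothetical.

```python
# Hypothetical sketch: an offline performance predictor that scores candidate
# architectures from a fixed-length encoding of their hyperparameters, so new
# candidates can be ranked without training each one from scratch.
import torch
import torch.nn as nn

def encode_architecture(arch):
    """Turn a dict of architecture hyperparameters into a feature vector."""
    return torch.tensor([
        float(arch["depth"]),
        float(arch["width"]),
        float(arch["kernel_size"]),
        float(arch["expansion_ratio"]),
    ])

class PerformancePredictor(nn.Module):
    """Small MLP regressor: architecture encoding -> predicted accuracy."""
    def __init__(self, in_dim=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

# Offline training on architectures whose accuracy has already been measured
# (the numbers below are made up for illustration).
history = [
    ({"depth": 18, "width": 64, "kernel_size": 3, "expansion_ratio": 1.0}, 0.69),
    ({"depth": 34, "width": 64, "kernel_size": 3, "expansion_ratio": 1.0}, 0.72),
    ({"depth": 18, "width": 32, "kernel_size": 5, "expansion_ratio": 4.0}, 0.66),
]
X = torch.stack([encode_architecture(a) for a, _ in history])
y = torch.tensor([acc for _, acc in history])

predictor = PerformancePredictor()
opt = torch.optim.Adam(predictor.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(predictor(X), y)
    loss.backward()
    opt.step()

# Rank a new candidate cheaply instead of training it from scratch.
candidate = {"depth": 50, "width": 64, "kernel_size": 3, "expansion_ratio": 4.0}
with torch.no_grad():
    print(predictor(encode_architecture(candidate).unsqueeze(0)).item())
```

Real predictors are trained on far more (architecture, accuracy) pairs and typically use richer encodings, e.g., the graph convolutional encoder mentioned in the semi-supervised assessor entry above.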