One Network Doesn't Rule Them All: Moving Beyond Handcrafted
Architectures in Self-Supervised Learning
- URL: http://arxiv.org/abs/2203.08130v1
- Date: Tue, 15 Mar 2022 17:54:57 GMT
- Title: One Network Doesn't Rule Them All: Moving Beyond Handcrafted
Architectures in Self-Supervised Learning
- Authors: Sharath Girish, Debadeepta Dey, Neel Joshi, Vibhav Vineet, Shital
Shah, Caio Cesar Teodoro Mendes, Abhinav Shrivastava, Yale Song
- Abstract summary: We show that a network architecture plays a significant role in self-supervised learning (SSL).
We conduct a study with over 100 variants of ResNet and MobileNet architectures and evaluate them across 11 downstream scenarios in the SSL setting.
We show that "self-supervised architectures" outperform popular handcrafted architectures while performing competitively with the larger and computationally heavy ResNet50.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The current literature on self-supervised learning (SSL) focuses on
developing learning objectives to train neural networks more effectively on
unlabeled data. The typical development process involves taking
well-established architectures, e.g., ResNet demonstrated on ImageNet, and
using them to evaluate newly developed objectives on downstream scenarios.
While convenient, this does not take into account the role of architectures,
which has been shown to be crucial in the supervised learning literature. In
this work, we establish extensive empirical evidence showing that a network
architecture plays a significant role in SSL. We conduct a large-scale study
with over 100 variants of ResNet and MobileNet architectures and evaluate them
across 11 downstream scenarios in the SSL setting. We show that there is no one
network that performs consistently well across the scenarios. Based on this, we
propose to learn not only network weights but also architecture topologies in
the SSL regime. We show that "self-supervised architectures" outperform popular
handcrafted architectures (ResNet18 and MobileNetV2) while performing
competitively with the larger and computationally heavy ResNet50 on major image
classification benchmarks (ImageNet-1K, iNat2021, and more). Our results
suggest that it is time to consider moving beyond handcrafted architectures in
SSL and start thinking about incorporating architecture search into
self-supervised learning objectives.
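As a purely illustrative aid, the sketch below shows one way architecture search could be coupled with a self-supervised objective, assuming a DARTS-style continuous relaxation over candidate operations and a SimSiam-style loss with stop-gradient. The exact formulation, operation set, and names used here (MixedOp, SearchableEncoder, etc.) are hypothetical and are not taken from the paper.

```python
# Hypothetical sketch: DARTS-style mixed operations whose architecture weights are
# learned jointly with a SimSiam-style self-supervised loss. Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """Weighted sum over candidate operations; the softmax weights over `alpha`
    are the architecture parameters being searched."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, 3, padding=1),                      # 3x3 conv
            nn.Conv2d(in_ch, out_ch, 5, padding=2),                      # 5x5 conv
            nn.Sequential(nn.MaxPool2d(3, 1, 1),
                          nn.Conv2d(in_ch, out_ch, 1)),                  # pool + 1x1 conv
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))            # arch params

    def forward(self, x):
        w = torch.softmax(self.alpha, dim=0)
        return sum(wi * op(x) for wi, op in zip(w, self.ops))

class SearchableEncoder(nn.Module):
    """Tiny searchable backbone plus SimSiam-style projector/predictor heads."""
    def __init__(self, depth=4, width=64, feat_dim=128):
        super().__init__()
        blocks, in_ch = [], 3
        for _ in range(depth):
            blocks += [MixedOp(in_ch, width), nn.BatchNorm2d(width), nn.ReLU()]
            in_ch = width
        self.backbone = nn.Sequential(*blocks, nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.projector = nn.Linear(width, feat_dim)
        self.predictor = nn.Linear(feat_dim, feat_dim)

    def forward(self, x):
        z = self.projector(self.backbone(x))
        return z, self.predictor(z)

def simsiam_loss(p, z):
    # Negative cosine similarity with a stop-gradient on the target branch.
    return -F.cosine_similarity(p, z.detach(), dim=-1).mean()

# Two augmented views of the same batch drive both the network weights and the
# architecture parameters (updated together here purely for brevity).
model = SearchableEncoder()
opt = torch.optim.SGD(model.parameters(), lr=0.05)
view1, view2 = torch.randn(8, 3, 32, 32), torch.randn(8, 3, 32, 32)
z1, p1 = model(view1)
z2, p2 = model(view2)
loss = 0.5 * (simsiam_loss(p1, z2) + simsiam_loss(p2, z1))
opt.zero_grad()
loss.backward()
opt.step()
```

In practice the architecture parameters (alpha) and the network weights are usually updated on separate data splits in a bilevel scheme, rather than jointly as in this toy loop.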
Related papers
- NASiam: Efficient Representation Learning using Neural Architecture
Search for Siamese Networks [76.8112416450677]
Siamese networks are among the most popular methods for self-supervised visual representation learning (SSL).
NASiam is a novel approach that, for the first time, uses differentiable NAS to improve the multilayer perceptron projector and predictor (encoder/predictor pair) of Siamese networks.
NASiam reaches competitive performance on both small-scale (i.e., CIFAR-10/CIFAR-100) and large-scale (i.e., ImageNet) image classification datasets while costing only a few GPU hours.
arXiv Detail & Related papers (2023-01-31T19:48:37Z)
- Hybrid BYOL-ViT: Efficient approach to deal with small Datasets [0.0]
In this paper, we investigate how self-supervision with strong and sufficient augmentation of unlabeled data can effectively train the first layers of a neural network.
We show that the low-level features derived from a self-supervised architecture can improve the robustness and the overall performance of this emergent architecture.
arXiv Detail & Related papers (2021-11-08T21:44:31Z)
- Self-Denoising Neural Networks for Few Shot Learning [66.38505903102373]
We present a new training scheme that adds noise at multiple stages of an existing neural architecture while simultaneously learning to be robust to this added noise.
This architecture, which we call a Self-Denoising Neural Network (SDNN), can be applied easily to most modern convolutional neural architectures.
arXiv Detail & Related papers (2021-10-26T03:28:36Z)
- Efficient Neural Architecture Search with Performance Prediction [0.0]
We use neural architecture search (NAS) to find the best network architecture for the task at hand.
Existing NAS algorithms generally evaluate the fitness of a new architecture by fully training it from scratch.
An end-to-end offline performance predictor is proposed to accelerate the evaluation of sampled architectures (a minimal sketch of this idea appears after this list).
arXiv Detail & Related papers (2021-08-04T05:44:16Z)
- Pretraining Neural Architecture Search Controllers with Locality-based Self-Supervised Learning [0.0]
We propose a pretraining scheme that can be applied to controller-based NAS.
Our method, a locality-based self-supervised classification task, leverages the structural similarity of network architectures to obtain good architecture representations.
arXiv Detail & Related papers (2021-03-15T06:30:36Z)
- D2RL: Deep Dense Architectures in Reinforcement Learning [47.67475810050311]
We take inspiration from successful architectural choices in computer vision and generative modelling.
We investigate the use of deeper networks and dense connections for reinforcement learning on a variety of simulated robotic learning benchmark environments.
arXiv Detail & Related papers (2020-10-19T01:27:07Z)
- A Semi-Supervised Assessor of Neural Architectures [157.76189339451565]
We employ an auto-encoder to discover meaningful representations of neural architectures.
A graph convolutional neural network is introduced to predict the performance of architectures.
arXiv Detail & Related papers (2020-05-14T09:02:33Z)
- Stage-Wise Neural Architecture Search [65.03109178056937]
Modern convolutional networks such as ResNet and NASNet have achieved state-of-the-art results in many computer vision applications.
These networks consist of stages, which are sets of layers that operate on representations in the same resolution.
It has been demonstrated that increasing the number of layers in each stage improves the prediction ability of the network.
However, the resulting architecture becomes computationally expensive in terms of floating point operations, memory requirements and inference time.
arXiv Detail & Related papers (2020-04-23T14:16:39Z)
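Several entries above, notably "Efficient Neural Architecture Search with Performance Prediction" and "A Semi-Supervised Assessor of Neural Architectures", rely on scoring candidate architectures without fully training them. The sketch below illustrates that general idea only, under the assumption that an architecture can be summarized by a few hyperparameters and that a small regressor is fit offline to previously measured accuracies; it is not the predictor proposed in either paper, and all names are hypothetical.

```python
# Hypothetical sketch: an offline performance predictor that scores candidate
# architectures from a fixed-length encoding of their hyperparameters, so new
# candidates can be ranked without training each one from scratch.
import torch
import torch.nn as nn

def encode_architecture(arch):
    """Turn a dict of architecture hyperparameters into a feature vector."""
    return torch.tensor([
        float(arch["depth"]),
        float(arch["width"]),
        float(arch["kernel_size"]),
        float(arch["expansion_ratio"]),
    ])

class PerformancePredictor(nn.Module):
    """Small MLP regressor: architecture encoding -> predicted accuracy."""
    def __init__(self, in_dim=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

# Offline training on architectures whose accuracy has already been measured
# (the numbers below are made up for illustration).
history = [
    ({"depth": 18, "width": 64, "kernel_size": 3, "expansion_ratio": 1.0}, 0.69),
    ({"depth": 34, "width": 64, "kernel_size": 3, "expansion_ratio": 1.0}, 0.72),
    ({"depth": 18, "width": 32, "kernel_size": 5, "expansion_ratio": 4.0}, 0.66),
]
X = torch.stack([encode_architecture(a) for a, _ in history])
y = torch.tensor([acc for _, acc in history])

predictor = PerformancePredictor()
opt = torch.optim.Adam(predictor.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(predictor(X), y)
    loss.backward()
    opt.step()

# Rank a new candidate cheaply instead of training it from scratch.
candidate = {"depth": 50, "width": 64, "kernel_size": 3, "expansion_ratio": 4.0}
with torch.no_grad():
    print(predictor(encode_architecture(candidate).unsqueeze(0)).item())
```

Real predictors are trained on far more (architecture, accuracy) pairs and typically use richer encodings, e.g., the graph convolutional encoder mentioned in the semi-supervised assessor entry above.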