Cross-Architectural Positive Pairs improve the effectiveness of
Self-Supervised Learning
- URL: http://arxiv.org/abs/2301.12025v1
- Date: Fri, 27 Jan 2023 23:27:24 GMT
- Title: Cross-Architectural Positive Pairs improve the effectiveness of
Self-Supervised Learning
- Authors: Pranav Singh and Jacopo Cirrone
- Abstract summary: Cross Architectural - Self Supervision (CASS) is a novel self-supervised learning approach that leverages a Transformer and a CNN simultaneously.
We show that, compared with existing state-of-the-art self-supervised approaches, CASS-trained CNNs and Transformers across four diverse datasets gained an average of 3.8% with 1% labeled data.
We also show that CASS is much more robust to changes in batch size and training epochs than existing state-of-the-art self-supervised learning approaches.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing self-supervised techniques have extreme computational requirements
and suffer a substantial drop in performance with a reduction in batch size or
pretraining epochs. This paper presents Cross Architectural - Self Supervision
(CASS), a novel self-supervised learning approach that leverages a Transformer
and a CNN simultaneously. Compared to existing state-of-the-art
self-supervised learning approaches, we empirically show that CASS-trained CNNs
and Transformers across four diverse datasets gained an average of 3.8% with 1%
labeled data, 5.9% with 10% labeled data, and 10.13% with 100% labeled data
while taking 69% less time. We also show that CASS is much more robust to
changes in batch size and training epochs than existing state-of-the-art
self-supervised learning approaches. We have open-sourced our code at
https://github.com/pranavsinghps1/CASS.
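To make the cross-architectural idea concrete, the following is a minimal sketch (in PyTorch) of treating a CNN's and a Transformer's embeddings of the same image as the positive pair. The specific backbones, the 128-dimensional output heads, and the cosine-similarity objective are illustrative assumptions; the exact projection heads and loss used by CASS are defined in the repository linked above.

    import torch
    import torch.nn.functional as F
    from torchvision.models import resnet50, vit_b_16

    cnn = resnet50(num_classes=128)      # CNN branch (embedding head assumed)
    vit = vit_b_16(num_classes=128)      # Transformer branch (embedding head assumed)
    optimizer = torch.optim.Adam(
        list(cnn.parameters()) + list(vit.parameters()), lr=1e-4)

    def cross_arch_loss(z_cnn, z_vit):
        # The two architectures' views of the same image form the positive
        # pair: pull their normalized embeddings together.
        z_cnn = F.normalize(z_cnn, dim=-1)
        z_vit = F.normalize(z_vit, dim=-1)
        return (1.0 - (z_cnn * z_vit).sum(dim=-1)).mean()

    images = torch.randn(8, 3, 224, 224)  # stand-in for an unlabeled batch
    loss = cross_arch_loss(cnn(images), vit(images))
    loss.backward()
    optimizer.step()

Because the positive pair comes from two architectures seeing the same image rather than from a large set of contrasting samples, a sketch like this does not depend on a large batch of negatives, which is one plausible reading of the reported robustness to batch size.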
Related papers
- Local Masking Meets Progressive Freezing: Crafting Efficient Vision Transformers for Self-Supervised Learning [0.0]
We present an innovative approach to self-supervised learning for Vision Transformers (ViTs).
This method focuses on enhancing the efficiency and speed of initial layer training in ViTs.
Our approach employs a novel multi-scale reconstruction process that fosters efficient learning in initial layers.
arXiv Detail & Related papers (2023-12-02T11:10:09Z)
- Efficient Representation Learning for Healthcare with Cross-Architectural Self-Supervision [5.439020425819001]
We present Cross Architectural - Self Supervision (CASS) in response to this challenge.
We show that CASS-trained CNNs and Transformers outperform existing self-supervised learning methods across four diverse healthcare datasets.
We also demonstrate that CASS is considerably more robust to variations in batch size and pretraining epochs, making it a suitable candidate for machine learning in healthcare applications.
arXiv Detail & Related papers (2023-08-19T15:57:19Z)
- Boosting Visual-Language Models by Exploiting Hard Samples [126.35125029639168]
HELIP is a cost-effective strategy tailored to enhance the performance of existing CLIP models.
Our method allows for effortless integration with existing models' training pipelines.
On comprehensive benchmarks, HELIP consistently boosts existing models to achieve leading performance.
arXiv Detail & Related papers (2023-05-09T07:00:17Z)
- Data-Efficient Augmentation for Training Neural Networks [15.870155099135538]
We propose a rigorous technique to select subsets of data points that, when augmented, closely capture the training dynamics of full data augmentation.
Our method achieves 6.3x speedup on CIFAR10 and 2.2x speedup on SVHN, and outperforms the baselines by up to 10% across various subset sizes.
arXiv Detail & Related papers (2022-10-15T19:32:20Z)
- CASS: Cross Architectural Self-Supervision for Medical Image Analysis [0.0]
Cross Architectural Self-Supervision is a novel self-supervised learning approach that leverages Transformers and CNNs simultaneously.
Compared to existing state-of-the-art self-supervised learning approaches, we empirically show that CASS-trained CNNs and Transformers gained an average of 8.5% with 100% labelled data.
arXiv Detail & Related papers (2022-06-08T21:25:15Z)
- Learning Rate Curriculum [75.98230528486401]
We propose a novel curriculum learning approach termed Learning Rate Curriculum (LeRaC).
LeRaC uses a different learning rate for each layer of a neural network to create a data-agnostic curriculum during the initial training epochs (a minimal sketch of this idea appears after this list).
We compare our approach with Curriculum by Smoothing (CBS), a state-of-the-art data-agnostic curriculum learning approach.
arXiv Detail & Related papers (2022-05-18T18:57:36Z)
- Jigsaw Clustering for Unsupervised Visual Representation Learning [68.09280490213399]
We propose a new jigsaw clustering pretext task in this paper.
Our method makes use of information from both intra- and inter-images.
It is even comparable to contrastive learning methods when only half of the training batches are used.
arXiv Detail & Related papers (2021-04-01T08:09:26Z)
- SEED: Self-supervised Distillation For Visual Representation [34.63488756535054]
We propose a new learning paradigm, named SElf-SupErvised Distillation (SEED), which leverages a larger pretrained network (as Teacher) to transfer its representational knowledge into a smaller architecture (as Student) in a self-supervised fashion (see the distillation sketch after this list).
We show that SEED dramatically boosts the performance of small networks on downstream tasks.
arXiv Detail & Related papers (2021-01-12T20:04:50Z)
- CoMatch: Semi-supervised Learning with Contrastive Graph Regularization [86.84486065798735]
CoMatch is a new semi-supervised learning method that unifies dominant approaches.
It achieves state-of-the-art performance on multiple datasets.
arXiv Detail & Related papers (2020-11-23T02:54:57Z)
- Uncertainty-aware Self-training for Text Classification with Few Labels [54.13279574908808]
We study self-training as one of the earliest semi-supervised learning approaches to reduce the annotation bottleneck.
We propose an approach to improve self-training by incorporating uncertainty estimates of the underlying neural network.
We show that our methods, leveraging only 20-30 labeled samples per class per task for training and for validation, can perform within 3% of fully supervised pre-trained language models.
arXiv Detail & Related papers (2020-06-27T08:13:58Z)
- Don't Wait, Just Weight: Improving Unsupervised Representations by Learning Goal-Driven Instance Weights [92.16372657233394]
Self-supervised learning techniques can boost performance by learning useful representations from unlabelled data.
We show that by learning Bayesian instance weights for the unlabelled data, we can improve the downstream classification accuracy.
Our method, BetaDataWeighter, is evaluated using the popular self-supervised rotation prediction task on STL-10 and Visual Decathlon.
arXiv Detail & Related papers (2020-06-22T15:59:32Z)
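For the Learning Rate Curriculum (LeRaC) entry above, the following is a minimal sketch of a per-layer learning-rate curriculum built from PyTorch parameter groups. The depth-wise starting rates, the warm-up length, and the linear convergence to a shared rate are assumptions for illustration; the paper specifies its own initialization and schedule.

    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(),
                          nn.Linear(64, 64), nn.ReLU(),
                          nn.Linear(64, 10))
    base_lr, warmup_epochs = 1e-2, 5

    # One optimizer parameter group per parameterized layer: the first layer
    # starts at the shared base rate, deeper layers start lower (assumed rule).
    groups, depth = [], 0
    for layer in model:
        if not any(p.requires_grad for p in layer.parameters()):
            continue  # skip parameter-free layers such as ReLU
        lr0 = base_lr * (0.5 ** depth)
        groups.append({"params": layer.parameters(), "lr": lr0,
                       "start_lr": lr0, "target_lr": base_lr})
        depth += 1
    optimizer = torch.optim.SGD(groups, lr=base_lr, momentum=0.9)

    def set_curriculum_lr(epoch):
        # Move each group's rate from its starting value toward the shared
        # target during the first warmup_epochs; afterwards all rates match.
        t = min(epoch / warmup_epochs, 1.0)
        for g in optimizer.param_groups:
            g["lr"] = (1.0 - t) * g["start_lr"] + t * g["target_lr"]

In a training loop, set_curriculum_lr(epoch) would be called once at the start of each epoch before the usual forward/backward steps.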
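For the SEED entry above, here is a minimal sketch of self-supervised distillation: a smaller student is trained to match a frozen teacher's similarity distribution over a set of anchor embeddings. The backbones, the random stand-in queue, and the temperature are assumptions; SEED maintains its own queue and settings as described in that paper.

    import torch
    import torch.nn.functional as F
    from torchvision.models import resnet18, resnet50

    teacher = resnet50(num_classes=128).eval()  # assumed already SSL-pretrained
    student = resnet18(num_classes=128)         # smaller network to distill into
    for p in teacher.parameters():
        p.requires_grad_(False)

    queue = F.normalize(torch.randn(4096, 128), dim=-1)  # stand-in anchor queue
    tau = 0.07                                            # assumed temperature

    def distillation_loss(images):
        with torch.no_grad():
            z_t = F.normalize(teacher(images), dim=-1)
        z_s = F.normalize(student(images), dim=-1)
        # Similarities of each image to every queue entry, as distributions;
        # the student is trained to reproduce the teacher's distribution.
        p_t = F.softmax(z_t @ queue.T / tau, dim=-1)
        log_p_s = F.log_softmax(z_s @ queue.T / tau, dim=-1)
        return -(p_t * log_p_s).sum(dim=-1).mean()

    optimizer = torch.optim.SGD(student.parameters(), lr=0.03, momentum=0.9)
    loss = distillation_loss(torch.randn(8, 3, 224, 224))
    loss.backward()
    optimizer.step()

Only the student receives gradients; the frozen teacher acts purely as a fixed target.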
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.