Related papers: A Survey of the Self Supervised Learning Mechanisms for Vision Transformers

URL: http://arxiv.org/abs/2408.17059v5
Date: Tue, 10 Jun 2025 05:53:17 GMT
Title: A Survey of the Self Supervised Learning Mechanisms for Vision Transformers
Authors: Asifullah Khan, Anabia Sohail, Mustansar Fiaz, Mehdi Hassan, Tariq Habib Afridi, Sibghat Ullah Marwat, Farzeen Munir, Safdar Ali, Hannan Naseem, Muhammad Zaigham Zaheer, Kamran Ali, Tangina Sultana, Ziaurrehman Tanoli, Naeem Akhter
Abstract summary: Vision Transformers (ViTs) have recently demonstrated remarkable performance in computer vision tasks. In response to this challenge, self-supervised learning (SSL) has emerged as a promising paradigm. We propose a comprehensive taxonomy to classify SSL techniques based on their representations and pre-training tasks.
Score: 5.152455218955949
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Vision Transformers (ViTs) have recently demonstrated remarkable performance in computer vision tasks. However, their parameter-intensive nature and reliance on large amounts of data for effective performance have shifted the focus from traditional human-annotated labels to unsupervised learning and pretraining strategies that uncover hidden structures within the data. In response to this challenge, self-supervised learning (SSL) has emerged as a promising paradigm. SSL leverages inherent relationships within the data itself as a form of supervision, eliminating the need for manual labeling and offering a more scalable and resource-efficient alternative for model training. Given these advantages, it is imperative to explore the integration of SSL techniques with ViTs, particularly in scenarios with limited labeled data. Inspired by this evolving trend, this survey aims to systematically review SSL mechanisms tailored for ViTs. We propose a comprehensive taxonomy to classify SSL techniques based on their representations and pre-training tasks. Additionally, we discuss the motivations behind SSL, review prominent pre-training tasks, and highlight advancements and challenges in this field. Furthermore, we conduct a comparative analysis of various SSL methods designed for ViTs, evaluating their strengths, limitations, and applicability to different scenarios.

Related papers

Revisiting semi-supervised learning in the era of foundation models [28.414667991336067]
Semi-supervised learning (SSL) leverages abundant unlabeled data alongside limited labeled data to enhance learning. We develop new SSL benchmark datasets where frozen vision foundation models (VFMs) underperform and systematically evaluate representative SSL methods. We make a surprising observation: parameter-efficient fine-tuning (PEFT) using only labeled data often matches SSL performance, even without leveraging unlabeled data. To overcome the notorious issue of noisy pseudo-labels, we propose ensembling multiple PEFT approaches and VFM backbones to produce more robust pseudo-labels.
arXiv Detail & Related papers (2025-03-12T18:01:10Z)
What Makes CLIP More Robust to Long-Tailed Pre-Training Data? A Controlled Study for Transferable Insights [67.72413262980272]
Severe data imbalance naturally exists among web-scale vision-language datasets. We find CLIP pre-trained thereupon exhibits notable robustness to the data imbalance compared to supervised learning. The robustness and discriminability of CLIP improve with more descriptive language supervision, larger data scale, and broader open-world concepts.
arXiv Detail & Related papers (2024-05-31T17:57:24Z)
Self-supervised visual learning in the low-data regime: a comparative evaluation [40.27083924454058]
Self-Supervised Learning (SSL) is a robust training methodology for contemporary Deep Neural Networks (DNNs). This work introduces a taxonomy of modern visual SSL methods, accompanied by detailed explanations and insights regarding the main categories of approaches. For domain-specific downstream tasks, in-domain low-data SSL pretraining outperforms the common approach of large-scale pretraining.
arXiv Detail & Related papers (2024-04-26T07:23:14Z)
Can We Break Free from Strong Data Augmentations in Self-Supervised Learning? [18.83003310612038]
Self-supervised learning (SSL) has emerged as a promising solution for addressing the challenge of limited labeled data in deep neural networks (DNNs). We explore SSL behavior across a spectrum of augmentations, revealing their crucial role in shaping SSL model performance and learning mechanisms. We propose a novel learning approach that integrates prior knowledge, with the aim of curtailing the need for extensive data augmentations.
arXiv Detail & Related papers (2024-04-15T12:53:48Z)
A Survey on Self-Supervised Learning for Non-Sequential Tabular Data [15.796140543132196]
Self-supervised learning (SSL) has been incorporated into many state-of-the-art models in various domains. This survey aims to systematically review and summarize the recent progress and challenges of SSL for non-sequential tabular data (SSL4NS-TD). We first present a formal definition of NS-TD and clarify its correlation to related studies. Then, these approaches are categorized into three groups: predictive learning, contrastive learning, and hybrid learning, with the motivations and strengths of representative methods in each direction.
arXiv Detail & Related papers (2024-02-02T08:17:41Z)
Evaluating Fairness in Self-supervised and Supervised Models for Sequential Data [10.626503137418636]
Self-supervised learning (SSL) has become the de facto training paradigm of large models. This study explores the impact of pre-training and fine-tuning strategies on fairness.
arXiv Detail & Related papers (2024-01-03T09:31:43Z)
Improving Representation Learning for Histopathologic Images with Cluster Constraints [31.426157660880673]
Self-supervised learning (SSL) pretraining strategies are emerging as a viable alternative. We introduce an SSL framework for transferable representation learning and semantically meaningful clustering. Our approach outperforms common SSL methods in downstream classification and clustering tasks.
arXiv Detail & Related papers (2023-10-18T21:20:44Z)
Self-Supervision for Tackling Unsupervised Anomaly Detection: Pitfalls and Opportunities [50.231837687221685]
Self-supervised learning (SSL) has transformed machine learning and its many real world applications. Unsupervised anomaly detection (AD) has also capitalized on SSL, by self-generating pseudo-anomalies.
arXiv Detail & Related papers (2023-08-28T07:55:01Z)
Self-Supervised Learning for Time Series Analysis: Taxonomy, Progress, and Prospects [84.6945070729684]
Self-supervised learning (SSL) has recently achieved impressive performance on various time series tasks. This article reviews current state-of-the-art SSL methods for time series data.
arXiv Detail & Related papers (2023-06-16T18:23:10Z)
Reverse Engineering Self-Supervised Learning [17.720366509919167]
Self-supervised learning (SSL) is a powerful tool in machine learning. This paper presents an in-depth empirical analysis of SSL-trained representations.
arXiv Detail & Related papers (2023-05-24T23:15:28Z)
Explaining, Analyzing, and Probing Representations of Self-Supervised Learning Models for Sensor-based Human Activity Recognition [2.2082422928825136]
Self-supervised learning (SSL) frameworks have been extensively applied to sensor-based Human Activity Recognition (HAR). In this paper, we aim to analyze deep representations of two recent SSL frameworks, namely SimCLR and VICReg.
arXiv Detail & Related papers (2023-04-14T07:53:59Z)
Understanding and Improving the Role of Projection Head in Self-Supervised Learning [77.59320917894043]
Self-supervised learning (SSL) aims to produce useful feature representations without access to human-labeled data annotations. Current contrastive learning approaches append a parametrized projection head to the end of some backbone network to optimize the InfoNCE objective. This raises a fundamental question: Why is a learnable projection head required if we are to discard it after training?
arXiv Detail & Related papers (2022-12-22T05:42:54Z)
Does Decentralized Learning with Non-IID Unlabeled Data Benefit from Self Supervision? [51.00034621304361]
We study decentralized learning with unlabeled data through the lens of self-supervised learning (SSL). We study the effectiveness of contrastive learning algorithms under decentralized learning settings.
arXiv Detail & Related papers (2022-10-20T01:32:41Z)
Semi-Supervised and Unsupervised Deep Visual Learning: A Survey [76.2650734930974]
Semi-supervised learning and unsupervised learning offer promising paradigms to learn from an abundance of unlabeled visual data. We review the recent advanced deep learning algorithms on semi-supervised learning (SSL) and unsupervised learning (UL) for visual recognition from a unified perspective.
arXiv Detail & Related papers (2022-08-24T04:26:21Z)
DATA: Domain-Aware and Task-Aware Pre-training [94.62676913928831]
We present DATA, a simple yet effective NAS approach specialized for self-supervised learning (SSL). Our method achieves promising results across a wide range of computation costs on downstream tasks, including image classification, object detection and semantic segmentation.
arXiv Detail & Related papers (2022-03-17T02:38:49Z)
Self-supervised Learning is More Robust to Dataset Imbalance [65.84339596595383]
We investigate self-supervised learning under dataset imbalance. Off-the-shelf self-supervised representations are already more robust to class imbalance than supervised representations. We devise a re-weighted regularization technique that consistently improves the SSL representation quality on imbalanced datasets.
arXiv Detail & Related papers (2021-10-11T06:29:56Z)
Graph-based Semi-supervised Learning: A Comprehensive Review [51.26862262550445]
Semi-supervised learning (SSL) has tremendous value in practice due to its ability to utilize both labeled data and unlabelled data. An important class of SSL methods is to naturally represent data as graphs, which corresponds to graph-based semi-supervised learning (GSSL) methods. GSSL methods have demonstrated their advantages in various domains due to their uniqueness of structure, the universality of applications, and their scalability to large scale data.
arXiv Detail & Related papers (2021-02-26T05:11:09Z)
On Data-Augmentation and Consistency-Based Semi-Supervised Learning [77.57285768500225]
Recently proposed consistency-based Semi-Supervised Learning (SSL) methods have advanced the state of the art in several SSL tasks. Despite these advances, the understanding of these methods is still relatively limited.
arXiv Detail & Related papers (2021-01-18T10:12:31Z)
