On the Importance of Hyperparameters and Data Augmentation for
Self-Supervised Learning
- URL: http://arxiv.org/abs/2207.07875v1
- Date: Sat, 16 Jul 2022 08:31:11 GMT
- Title: On the Importance of Hyperparameters and Data Augmentation for
Self-Supervised Learning
- Authors: Diane Wagner, Fabio Ferreira, Danny Stoll, Robin Tibor Schirrmeister,
Samuel Müller, Frank Hutter
- Abstract summary: Self-Supervised Learning (SSL) has become a very active area of Deep Learning research where it is heavily used as a pre-training method for classification and other tasks.
Here, we show that, indeed, the choice of hyperparameters and data augmentation strategies can have a dramatic impact on performance.
We introduce a new automated data augmentation algorithm, GroupAugment, which considers groups of augmentations and optimizes the sampling across groups.
- Score: 32.53142486214591
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Self-Supervised Learning (SSL) has become a very active area of Deep Learning
research where it is heavily used as a pre-training method for classification
and other tasks. However, the rapid pace of advancements in this area comes at
a price: training pipelines vary significantly across papers, which presents a
potentially crucial confounding factor. Here, we show that, indeed, the choice
of hyperparameters and data augmentation strategies can have a dramatic impact
on performance. To shed light on these neglected factors and help maximize the
power of SSL, we hyperparameterize these components and optimize them with
Bayesian optimization, showing improvements across multiple datasets for the
SimSiam SSL approach. Realizing the importance of data augmentations for SSL,
we also introduce a new automated data augmentation algorithm, GroupAugment,
which considers groups of augmentations and optimizes the sampling across
groups. In contrast to algorithms designed for supervised learning,
GroupAugment achieved consistently high linear evaluation accuracy across all
datasets we considered. Overall, our results indicate the importance and likely
underestimated role of data augmentation for SSL.
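The abstract names Bayesian optimization for the tuning step but not a specific implementation. Below is a minimal sketch of what such a loop could look like, using scikit-optimize's gp_minimize as one off-the-shelf BO backend; the search space and the train_simsiam_and_eval stand-in are illustrative assumptions, not the paper's actual configuration.

```python
# A minimal sketch of tuning SSL hyperparameters with Bayesian optimization.
# scikit-optimize is used here as one off-the-shelf BO backend; the search
# space and the stand-in objective are illustrative, not the paper's setup.
import math

from skopt import gp_minimize
from skopt.space import Categorical, Real

search_space = [
    Real(1e-4, 1e-1, prior="log-uniform", name="learning_rate"),
    Real(1e-6, 1e-3, prior="log-uniform", name="weight_decay"),
    Real(0.1, 1.0, name="jitter_strength"),            # augmentation strength
    Categorical([True, False], name="use_grayscale"),  # augmentation on/off
]

def train_simsiam_and_eval(lr, wd, jitter, grayscale):
    # Hypothetical stand-in: in practice this would pre-train SimSiam with
    # the given settings and return linear-evaluation accuracy.  A synthetic
    # score is returned here so the sketch runs end to end.
    score = 0.9 - 0.02 * (math.log10(lr) + 2.5) ** 2
    score -= 100.0 * wd + 0.05 * abs(jitter - 0.5) + (0.02 if not grayscale else 0.0)
    return score

def objective(params):
    lr, wd, jitter, grayscale = params
    # gp_minimize minimizes, so return the negative accuracy.
    return -train_simsiam_and_eval(lr, wd, jitter, grayscale)

result = gp_minimize(objective, search_space, n_calls=25, random_state=0)
print("best config:", result.x, "best accuracy:", -result.fun)
```

In a real run, each objective evaluation would be a full SSL pre-training plus linear evaluation, so the number of BO calls is the dominant cost.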
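GroupAugment is described above only at a high level: augmentations are organized into groups, and the sampling distribution over groups is optimized rather than fixed. The following sketch illustrates that two-stage sampling idea with torchvision transforms; the grouping, the probability values, and the GroupSampler class are all illustrative assumptions, not the paper's exact algorithm.

```python
# A sketch of group-based augmentation sampling: first sample a group, then
# an operation within it.  The grouping and probabilities are illustrative.
import random

from torchvision import transforms as T

AUG_GROUPS = {
    "geometric": [T.RandomResizedCrop(32), T.RandomHorizontalFlip(p=1.0)],
    "color":     [T.ColorJitter(0.4, 0.4, 0.4, 0.1), T.RandomGrayscale(p=1.0)],
    "blur":      [T.GaussianBlur(kernel_size=3)],
}

class GroupSampler:
    """Two-stage sampler: draw a group according to its probability, then a
    uniformly random augmentation within that group."""

    def __init__(self, group_probs):
        self.groups = list(AUG_GROUPS)
        self.weights = [group_probs[g] for g in self.groups]

    def __call__(self, img):
        group = random.choices(self.groups, weights=self.weights, k=1)[0]
        aug = random.choice(AUG_GROUPS[group])
        return aug(img)

# These probabilities are made up; in the spirit of the paper they would be
# treated as hyperparameters and optimized rather than hand-set.
augment = GroupSampler({"geometric": 0.5, "color": 0.3, "blur": 0.2})
```

Applied twice per image, such a sampler yields the two views a SimSiam-style model contrasts, and the group probabilities themselves can be exposed as hyperparameters to the same kind of Bayesian optimization loop.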
Related papers
- A Survey of the Self Supervised Learning Mechanisms for Vision Transformers [5.152455218955949]
The application of self supervised learning (SSL) in vision tasks has gained significant attention.
We develop a comprehensive taxonomy that systematically classifies SSL techniques.
We discuss the motivations behind SSL, review popular pre-training tasks, and highlight the challenges and advancements in this field.
arXiv Detail & Related papers (2024-08-30T07:38:28Z)
- SPEED: Scalable Preprocessing of EEG Data for Self-Supervised Learning [2.705542761685457]
We propose a Python-based EEG preprocessing pipeline optimized for self-supervised learning.
This optimization stabilizes self-supervised training and enhances performance on downstream tasks.
arXiv Detail & Related papers (2024-08-15T10:15:01Z)
- On Pretraining Data Diversity for Self-Supervised Learning [57.91495006862553]
We explore the impact of training with more diverse datasets on the performance of self-supervised learning (SSL) under a fixed computational budget.
Our findings consistently demonstrate that increasing pretraining data diversity enhances SSL performance, albeit only when the distribution distance to the downstream data is minimal.
arXiv Detail & Related papers (2024-03-20T17:59:58Z)
- Boosting Transformer's Robustness and Efficacy in PPG Signal Artifact Detection with Self-Supervised Learning [0.0]
This study addresses the underutilization of abundant unlabeled data by employing self-supervised learning (SSL) to extract latent features from this data.
Our experiments demonstrate that SSL significantly enhances the Transformer model's ability to learn representations.
This approach holds promise for broader applications in PICU environments, where annotated data is often limited.
arXiv Detail & Related papers (2024-01-02T04:00:48Z)
- Self-Supervision for Tackling Unsupervised Anomaly Detection: Pitfalls and Opportunities [50.231837687221685]
Self-supervised learning (SSL) has transformed machine learning and its many real world applications.
Unsupervised anomaly detection (AD) has also capitalized on SSL, by self-generating pseudo-anomalies.
arXiv Detail & Related papers (2023-08-28T07:55:01Z)
- To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis [50.31589712761807]
Large language models (LLMs) are notoriously token-hungry during pre-training, and high-quality text data on the web is approaching its scaling limit for LLMs.
We investigate the consequences of repeating pre-training data, revealing that the model is susceptible to overfitting.
We further examine the key factors contributing to multi-epoch degradation, finding that significant factors include dataset size, model parameters, and training objectives.
arXiv Detail & Related papers (2023-05-22T17:02:15Z)
- Time Series Contrastive Learning with Information-Aware Augmentations [57.45139904366001]
A key component of contrastive learning is to select appropriate augmentations imposing some priors to construct feasible positive samples.
How to find the desired augmentations of time series data that are meaningful for given contrastive learning tasks and datasets remains an open question.
We propose a new contrastive learning approach with information-aware augmentations, InfoTS, that adaptively selects optimal augmentations for time series representation learning.
arXiv Detail & Related papers (2023-03-21T15:02:50Z)
- Evolutionary Augmentation Policy Optimization for Self-supervised Learning [10.087678954934155]
Self-supervised learning is a machine learning paradigm for pretraining Deep Neural Networks (DNNs) without requiring manually labeled data.
In this paper, we study the contribution of augmentation operators to the performance of self-supervised learning algorithms.
arXiv Detail & Related papers (2023-03-02T21:16:53Z)
- Collaborative Intelligence Orchestration: Inconsistency-Based Fusion of Semi-Supervised Learning and Active Learning [60.26659373318915]
Active learning (AL) and semi-supervised learning (SSL) are two effective, but often isolated, means to alleviate the data-hungry problem.
We propose an innovative inconsistency-based virtual adversarial algorithm to further investigate SSL-AL's potential superiority.
Two real-world case studies visualize the practical industrial value of applying and deploying the proposed data sampling algorithm.
arXiv Detail & Related papers (2022-06-07T13:28:43Z)
- Provably Efficient Causal Reinforcement Learning with Confounded Observational Data [135.64775986546505]
We study how to incorporate the dataset (observational data) collected offline, which is often abundantly available in practice, to improve the sample efficiency in the online setting.
We propose the deconfounded optimistic value iteration (DOVI) algorithm, which incorporates the confounded observational data in a provably efficient manner.
arXiv Detail & Related papers (2020-06-22T14:49:33Z)