Understanding Representation Learnability of Nonlinear Self-Supervised
Learning
- URL: http://arxiv.org/abs/2401.03214v1
- Date: Sat, 6 Jan 2024 13:23:26 GMT
- Title: Understanding Representation Learnability of Nonlinear Self-Supervised
Learning
- Authors: Ruofeng Yang, Xiangyuan Li, Bo Jiang, Shuai Li
- Abstract summary: Self-supervised learning (SSL) has empirically shown its data representation learnability in many downstream tasks.
Our paper is the first to analyze the learning results of the nonlinear SSL model accurately.
- Score: 13.965135660149212
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised learning (SSL) has empirically shown its data representation
learnability in many downstream tasks. There are only a few theoretical works
on data representation learnability, and many of them focus on the final data
representation, treating the nonlinear neural network as a "black box".
However, the accurate learning results of neural networks are crucial for
describing the data distribution features learned by SSL models. Our paper is
the first to analyze the learning results of the nonlinear SSL model
accurately. We consider a toy data distribution that contains two features: the
label-related feature and the hidden feature. Unlike previous linear setting
work that depends on closed-form solutions, we use the gradient descent
algorithm to train a 1-layer nonlinear SSL model with a certain initialization
region and prove that the model converges to a local minimum. Furthermore,
in contrast to complex iterative analyses, we propose a new analysis
process that uses the exact version of the Inverse Function Theorem to accurately
describe the features learned by the local minimum. With this local minimum, we
prove that the nonlinear SSL model can capture the label-related feature and
hidden feature at the same time. In contrast, the nonlinear supervised learning
(SL) model can only learn the label-related feature. We also present the
learning processes and results of the nonlinear SSL and SL models via simulation
experiments.
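As a rough illustration of the simulation setting sketched in the abstract, here is a minimal, hypothetical numpy sketch: data carry a label-related feature and a hidden feature, a 1-layer ReLU model is trained by gradient descent under a simple triplet-style SSL objective and under a supervised square loss, and the learned weights are checked for alignment with the two feature directions. The data model, the triplet loss, the network width, and all hyperparameters are illustrative assumptions, not the paper's exact construction or proof setting.

```python
# Illustrative toy setting (not the authors' exact model): data contain a
# label-related direction e1 and a hidden direction e2; a 1-layer ReLU model is
# trained by gradient descent with an SSL-style loss and with a supervised loss.
import numpy as np

rng = np.random.default_rng(0)
d, n, width = 20, 1000, 8
e1, e2 = np.eye(d)[0], np.eye(d)[1]        # label-related / hidden feature directions

y = rng.choice([-1.0, 1.0], size=n)         # labels (drive feature e1)
s = rng.choice([-1.0, 1.0], size=n)         # hidden-feature sign, independent of y
X = np.outer(y, e1) + np.outer(s, e2) + 0.1 * rng.standard_normal((n, d))

def features(W, Z):
    """1-layer nonlinear model f(z) = relu(W z), plus the ReLU activation mask."""
    pre = Z @ W.T
    return np.maximum(pre, 0.0), (pre > 0).astype(float)

def ssl_grad(W, margin=1.0):
    """Gradient of a triplet-style SSL loss: pull a noisy view close, push another sample away."""
    Xp = X + 0.05 * rng.standard_normal(X.shape)      # positive: augmented view
    Xn = X[rng.permutation(n)]                        # negative: another sample
    H, M = features(W, X)
    Hp, Mp = features(W, Xp)
    Hn, Mn = features(W, Xn)
    dpos, dneg = H - Hp, H - Hn
    # hinge: only samples still violating the margin contribute to the gradient
    active = ((dpos ** 2).sum(1) - (dneg ** 2).sum(1) + margin > 0).astype(float)[:, None]
    g_pos = ((active * dpos) * M).T @ X - ((active * dpos) * Mp).T @ Xp
    g_neg = ((active * dneg) * M).T @ X - ((active * dneg) * Mn).T @ Xn
    return 2.0 * (g_pos - g_neg) / n

def sl_grad(W):
    """Gradient of a supervised square loss on the summed ReLU units."""
    H, M = features(W, X)
    err = H.sum(axis=1) - y
    return (err[:, None] * M).T @ X / n

def train(grad_fn, steps=500, lr=0.05):
    W = 0.05 * rng.standard_normal((width, d))        # small random initialization
    for _ in range(steps):
        W -= lr * grad_fn(W)
    return W

for name, grad_fn in [("SSL", ssl_grad), ("SL", sl_grad)]:
    W = train(grad_fn)
    a1, a2 = np.abs(W @ e1).sum(), np.abs(W @ e2).sum()
    print(f"{name}: alignment with label feature = {a1:.3f}, hidden feature = {a2:.3f}")
```

On a run like this one would hope to see the SSL weights pick up both directions while the SL weights concentrate on the label-related one, but this toy script only illustrates the setup; it is not the paper's proof or its exact experiment.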
Related papers
- A Closer Look at Benchmarking Self-Supervised Pre-training with Image Classification [51.35500308126506]
Self-supervised learning (SSL) is a machine learning approach where the data itself provides supervision, eliminating the need for external labels.
We study how classification-based evaluation protocols for SSL correlate and how well they predict downstream performance on different dataset types.
arXiv Detail & Related papers (2024-07-16T23:17:36Z)
- Progressive Feature Adjustment for Semi-supervised Learning from Pretrained Models [39.42802115580677]
Semi-supervised learning (SSL) can leverage both labeled and unlabeled data to build a predictive model.
Recent literature suggests that naively applying state-of-the-art SSL with a pretrained model fails to unleash the full potential of training data.
We propose to use pseudo-labels from the unlabelled data to update the feature extractor in a way that is less sensitive to incorrect labels.
arXiv Detail & Related papers (2023-09-09T01:57:14Z)
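As a loose illustration of the pseudo-labeling idea in the entry above, here is a generic, hypothetical sketch (not the paper's Progressive Feature Adjustment method): a stand-in classifier is fit on labeled embeddings, confident predictions on unlabeled data are kept as pseudo-labels, and the model is refit on the enlarged set. The synthetic data, the centroid classifier, and the confidence threshold are illustrative assumptions.

```python
# Generic pseudo-labeling loop (illustrative only): keep only confident pseudo-labels
# so the subsequent update is less sensitive to incorrect labels.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic embeddings standing in for features from a pretrained model.
X_lab = np.vstack([rng.normal(-1.0, 1.0, (50, 16)), rng.normal(1.0, 1.0, (50, 16))])
y_lab = np.array([0] * 50 + [1] * 50)
X_unl = np.vstack([rng.normal(-1.0, 1.0, (500, 16)), rng.normal(1.0, 1.0, (500, 16))])

def fit_centroids(X, y):
    """A stand-in classifier: one centroid per class in embedding space."""
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def predict_scores(centroids, X):
    """Negative distance to each centroid; larger means more confident for that class."""
    return -np.stack([np.linalg.norm(X - c, axis=1) for c in centroids], axis=1)

centroids = fit_centroids(X_lab, y_lab)
scores = predict_scores(centroids, X_unl)
pseudo_y = scores.argmax(axis=1)
margin = np.abs(scores[:, 1] - scores[:, 0])
keep = margin > 1.0                        # discard low-confidence pseudo-labels

X_all = np.vstack([X_lab, X_unl[keep]])
y_all = np.concatenate([y_lab, pseudo_y[keep]])
centroids = fit_centroids(X_all, y_all)    # refit the model on the enlarged set
print(f"kept {keep.sum()} of {len(X_unl)} pseudo-labels")
```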
- Representation Learning Dynamics of Self-Supervised Models [7.289672463326423]
Self-Supervised Learning (SSL) is an important paradigm for learning representations from unlabelled data.
We study the learning dynamics of SSL models, specifically representations obtained by minimising contrastive and non-contrastive losses.
We derive the exact learning dynamics of the SSL models trained using gradient descent on the Grassmannian manifold.
arXiv Detail & Related papers (2023-09-05T07:48:45Z)
- Robust Learning with Progressive Data Expansion Against Spurious Correlation [65.83104529677234]
We study the learning process of a two-layer nonlinear convolutional neural network in the presence of spurious features.
Our analysis suggests that imbalanced data groups and easily learnable spurious features can lead to the dominance of spurious features during the learning process.
We propose a new training algorithm called PDE that efficiently enhances the model's robustness for better worst-group performance.
arXiv Detail & Related papers (2023-06-08T05:44:06Z)
- Functional Nonlinear Learning [0.0]
We propose a functional nonlinear learning (FunNoL) method to represent multivariate functional data in a lower-dimensional feature space.
We show that FunNoL provides satisfactory curve classification and reconstruction regardless of data sparsity.
arXiv Detail & Related papers (2022-06-22T23:47:45Z)
- Understanding the Role of Nonlinearity in Training Dynamics of Contrastive Learning [37.27098255569438]
We study the role of nonlinearity in the training dynamics of contrastive learning (CL) on one and two-layer nonlinear networks.
We show that the presence of nonlinearity leads to many local optima even in the 1-layer setting.
In the 2-layer setting, we also discover global modulation: local patterns that are discriminative from the perspective of global-level patterns are prioritized during learning.
arXiv Detail & Related papers (2022-06-02T23:52:35Z)
- Self-supervised Learning is More Robust to Dataset Imbalance [65.84339596595383]
We investigate self-supervised learning under dataset imbalance.
Off-the-shelf self-supervised representations are already more robust to class imbalance than supervised representations.
We devise a re-weighted regularization technique that consistently improves the SSL representation quality on imbalanced datasets.
arXiv Detail & Related papers (2021-10-11T06:29:56Z)
- Self-Supervised Learning of Graph Neural Networks: A Unified Review [50.71341657322391]
Self-supervised learning is emerging as a new paradigm for making use of large amounts of unlabeled samples.
We provide a unified review of different ways of training graph neural networks (GNNs) using SSL.
Our treatment of SSL methods for GNNs sheds light on the similarities and differences of various methods, setting the stage for developing new methods and algorithms.
arXiv Detail & Related papers (2021-02-22T03:43:45Z)
- Understanding self-supervised Learning Dynamics without Contrastive Pairs [72.1743263777693]
Contrastive approaches to self-supervised learning (SSL) learn representations by minimizing the distance between two augmented views of the same data point.
Non-contrastive approaches such as BYOL and SimSiam show remarkable performance without negative pairs.
We study the nonlinear learning dynamics of non-contrastive SSL in simple linear networks.
arXiv Detail & Related papers (2021-02-12T22:57:28Z)
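As a loose illustration of the non-contrastive setting in the entry above, here is a minimal, hypothetical sketch of a SimSiam/BYOL-style update in a linear network: two augmented views, a linear predictor head on the online branch, and a stop-gradient on the target branch, with no negative pairs. The architecture and hyperparameters are illustrative assumptions; whether such dynamics collapse or not is exactly the kind of question the paper analyzes.

```python
# Non-contrastive (stop-gradient) update sketch in a linear network; illustrative only.
import numpy as np

rng = np.random.default_rng(0)
d, k, n, lr = 16, 4, 512, 0.1

X = rng.standard_normal((n, d))            # data
W = 0.1 * rng.standard_normal((k, d))      # linear encoder f(x) = W x
P = 0.1 * rng.standard_normal((k, k))      # linear predictor head

for _ in range(200):
    x1 = X + 0.1 * rng.standard_normal(X.shape)   # view 1 (online branch)
    x2 = X + 0.1 * rng.standard_normal(X.shape)   # view 2 (target branch)
    z1, z2 = x1 @ W.T, x2 @ W.T
    p1 = z1 @ P.T                                 # predictor on the online branch
    diff = p1 - z2                                # squared error against the stop-gradient target
    grad_P = diff.T @ z1 / n
    grad_W = (diff @ P).T @ x1 / n                # gradient flows only through the online branch
    P -= lr * grad_P
    W -= lr * grad_W
```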
- Graph Embedding with Data Uncertainty [113.39838145450007]
Spectral-based subspace learning is a common data preprocessing step in many machine learning pipelines.
Most subspace learning methods do not take into consideration possible measurement inaccuracies or artifacts that can lead to data with high uncertainty.
arXiv Detail & Related papers (2020-09-01T15:08:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.