Spectal Harmonics: Bridging Spectral Embedding and Matrix Completion in
Self-Supervised Learning
- URL: http://arxiv.org/abs/2305.19818v2
- Date: Mon, 30 Oct 2023 15:45:09 GMT
- Title: Spectal Harmonics: Bridging Spectral Embedding and Matrix Completion in
Self-Supervised Learning
- Authors: Marina Munkhoeva, Ivan Oseledets
- Abstract summary: Self-supervised methods received tremendous attention thanks to their seemingly approach to learning representations that respect the semantics of the data without any apparent supervision in the form of labels.
A growing body of literature is already being published in an attempt to build a coherent and theoretically grounded understanding of the workings of a zoo of losses used in modern self-supervised representation learning methods.
- Score: 6.5151694672131875
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised methods received tremendous attention thanks to their
seemingly heuristic approach to learning representations that respect the
semantics of the data without any apparent supervision in the form of labels. A
growing body of literature is already being published in an attempt to build a
coherent and theoretically grounded understanding of the workings of a zoo of
losses used in modern self-supervised representation learning methods. In this
paper, we attempt to provide an understanding from the perspective of a Laplace
operator and connect the inductive bias stemming from the augmentation process
to a low-rank matrix completion problem. To this end, we leverage the results
from low-rank matrix completion to provide theoretical analysis on the
convergence of modern SSL methods and a key property that affects their
downstream performance.
Related papers
- Connectivity Shapes Implicit Regularization in Matrix Factorization Models for Matrix Completion [2.8948274245812335]
We investigate the implicit regularization of matrix factorization for solving matrix completion problems.
We empirically discover that the connectivity of observed data plays a crucial role in the implicit bias.
Our work reveals the intricate interplay between data connectivity, training dynamics, and implicit regularization in matrix factorization models.
arXiv Detail & Related papers (2024-05-22T15:12:14Z) - The Common Stability Mechanism behind most Self-Supervised Learning
Approaches [64.40701218561921]
We provide a framework to explain the stability mechanism of different self-supervised learning techniques.
We discuss the working mechanism of contrastive techniques like SimCLR, non-contrastive techniques like BYOL, SWAV, SimSiam, Barlow Twins, and DINO.
We formulate different hypotheses and test them using the Imagenet100 dataset.
arXiv Detail & Related papers (2024-02-22T20:36:24Z) - A phase transition between positional and semantic learning in a solvable model of dot-product attention [30.96921029675713]
Morelinear model dot-product attention is studied as a non-dimensional self-attention layer with trainable and low-dimensional query and key data.
We show that either a positional attention mechanism (with tokens each other based on their respective positions) or a semantic attention mechanism (with tokens tied to each other based their meaning) or a transition from the former to the latter with increasing sample complexity.
arXiv Detail & Related papers (2024-02-06T11:13:54Z) - Sparsity-Guided Holistic Explanation for LLMs with Interpretable
Inference-Time Intervention [53.896974148579346]
Large Language Models (LLMs) have achieved unprecedented breakthroughs in various natural language processing domains.
The enigmatic black-box'' nature of LLMs remains a significant challenge for interpretability, hampering transparent and accountable applications.
We propose a novel methodology anchored in sparsity-guided techniques, aiming to provide a holistic interpretation of LLMs.
arXiv Detail & Related papers (2023-12-22T19:55:58Z) - Hierarchical Decomposition of Prompt-Based Continual Learning:
Rethinking Obscured Sub-optimality [55.88910947643436]
Self-supervised pre-training is essential for handling vast quantities of unlabeled data in practice.
HiDe-Prompt is an innovative approach that explicitly optimize the hierarchical components with an ensemble of task-specific prompts and statistics.
Our experiments demonstrate the superior performance of HiDe-Prompt and its robustness to pre-training paradigms in continual learning.
arXiv Detail & Related papers (2023-10-11T06:51:46Z) - Understanding Self-Supervised Learning of Speech Representation via
Invariance and Redundancy Reduction [0.45060992929802207]
Self-supervised learning (SSL) has emerged as a promising paradigm for learning flexible speech representations from unlabeled data.
This study provides an empirical analysis of Barlow Twins (BT), an SSL technique inspired by theories of redundancy reduction in human perception.
arXiv Detail & Related papers (2023-09-07T10:23:59Z) - Spectral Decomposition Representation for Reinforcement Learning [100.0424588013549]
We propose an alternative spectral method, Spectral Decomposition Representation (SPEDER), that extracts a state-action abstraction from the dynamics without inducing spurious dependence on the data collection policy.
A theoretical analysis establishes the sample efficiency of the proposed algorithm in both the online and offline settings.
An experimental investigation demonstrates superior performance over current state-of-the-art algorithms across several benchmarks.
arXiv Detail & Related papers (2022-08-19T19:01:30Z) - A Free Lunch from the Noise: Provable and Practical Exploration for
Representation Learning [55.048010996144036]
We show that under some noise assumption, we can obtain the linear spectral feature of its corresponding Markov transition operator in closed-form for free.
We propose Spectral Dynamics Embedding (SPEDE), which breaks the trade-off and completes optimistic exploration for representation learning by exploiting the structure of the noise.
arXiv Detail & Related papers (2021-11-22T19:24:57Z) - Towards the Generalization of Contrastive Self-Supervised Learning [11.889992921445849]
We present a theoretical explanation of how contrastive self-supervised pre-trained models generalize to downstream tasks.
We further explore SimCLR and Barlow Twins, which are two canonical contrastive self-supervised methods.
arXiv Detail & Related papers (2021-11-01T07:39:38Z) - Disambiguation of weak supervision with exponential convergence rates [88.99819200562784]
In supervised learning, data are annotated with incomplete yet discriminative information.
In this paper, we focus on partial labelling, an instance of weak supervision where, from a given input, we are given a set of potential targets.
We propose an empirical disambiguation algorithm to recover full supervision from weak supervision.
arXiv Detail & Related papers (2021-02-04T18:14:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.