Joint Embedding vs Reconstruction: Provable Benefits of Latent Space Prediction for Self Supervised Learning
- URL: http://arxiv.org/abs/2505.12477v1
- Date: Sun, 18 May 2025 15:54:55 GMT
- Title: Joint Embedding vs Reconstruction: Provable Benefits of Latent Space Prediction for Self Supervised Learning
- Authors: Hugues Van Assel, Mark Ibrahim, Tommaso Biancalani, Aviv Regev, Randall Balestriero
- Abstract summary: Reconstruction and joint embedding have emerged as two leading paradigms in Self Supervised Learning (SSL). Both approaches offer compelling advantages, yet practitioners lack clear guidelines for choosing between them.
- Score: 16.515613048905674
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reconstruction and joint embedding have emerged as two leading paradigms in Self Supervised Learning (SSL). Reconstruction methods focus on recovering the original sample from a different view in input space. On the other hand, joint embedding methods align the representations of different views in latent space. Both approaches offer compelling advantages, yet practitioners lack clear guidelines for choosing between them. In this work, we unveil the core mechanisms that distinguish each paradigm. By leveraging closed-form solutions for both approaches, we precisely characterize how the view generation process, e.g., data augmentation, impacts the learned representations. We then demonstrate that, unlike supervised learning, both SSL paradigms require a minimal alignment between augmentations and irrelevant features to achieve asymptotic optimality with increasing sample size. Our findings indicate that in scenarios where these irrelevant features have a large magnitude, joint embedding methods are preferable because they impose a strictly weaker alignment condition compared to reconstruction-based methods. These results not only clarify the trade-offs between the two paradigms but also substantiate the empirical success of joint embedding approaches on challenging real-world datasets.
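To make the distinction concrete, below is a minimal, illustrative sketch of the two training objectives the abstract contrasts: a reconstruction loss that predicts one view from the other in input space, and a joint-embedding loss that aligns the two views in latent space. The toy encoder/decoder, tensor shapes, and exact loss forms are assumptions for illustration, not the paper's closed-form setup.

```python
# Illustrative sketch only: toy versions of the two SSL paradigms compared above.
import torch
import torch.nn.functional as F

def reconstruction_loss(decoder, z1, x2):
    """Reconstruction paradigm: from the latent of view 1, recover view 2 in input space."""
    return F.mse_loss(decoder(z1), x2)

def joint_embedding_loss(z1, z2):
    """Joint-embedding paradigm: align the representations of the two views in latent space."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    return (2 - 2 * (z1 * z2).sum(dim=-1)).mean()  # MSE between L2-normalized embeddings

if __name__ == "__main__":
    x1, x2 = torch.randn(8, 32), torch.randn(8, 32)            # stand-ins for two augmented views
    encoder, decoder = torch.nn.Linear(32, 16), torch.nn.Linear(16, 32)
    z1, z2 = encoder(x1), encoder(x2)
    print(reconstruction_loss(decoder, z1, x2).item())
    print(joint_embedding_loss(z1, z2).item())
```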
Related papers
- Feature-Based vs. GAN-Based Learning from Demonstrations: When and Why [50.191655141020505]
This survey provides a comparative analysis of feature-based and GAN-based approaches to learning from demonstrations.
We argue that the dichotomy between feature-based and GAN-based methods is increasingly nuanced.
arXiv Detail & Related papers (2025-07-08T11:45:51Z)
- Few-Shot, No Problem: Descriptive Continual Relation Extraction [27.296604792388646]
Few-shot Continual Relation Extraction is a crucial challenge for enabling AI systems to identify and adapt to evolving relationships in real-world domains.
Traditional memory-based approaches often overfit to limited samples, failing to reinforce old knowledge.
We propose a novel retrieval-based solution, starting with a large language model to generate descriptions for each relation.
arXiv Detail & Related papers (2025-02-27T23:44:30Z)
- The Common Stability Mechanism behind most Self-Supervised Learning Approaches [64.40701218561921]
We provide a framework to explain the stability mechanism of different self-supervised learning techniques.
We discuss the working mechanisms of contrastive techniques like SimCLR and non-contrastive techniques like BYOL, SWAV, SimSiam, Barlow Twins, and DINO.
We formulate different hypotheses and test them using the Imagenet100 dataset.
arXiv Detail & Related papers (2024-02-22T20:36:24Z)
- A Probabilistic Model Behind Self-Supervised Learning [53.64989127914936]
In self-supervised learning (SSL), representations are learned via an auxiliary task without annotated labels.
We present a generative latent variable model for self-supervised learning.
We show that several families of discriminative SSL, including contrastive methods, induce a comparable distribution over representations.
arXiv Detail & Related papers (2024-02-02T13:31:17Z)
- Synergistic Anchored Contrastive Pre-training for Few-Shot Relation Extraction [4.7220779071424985]
Few-shot Relation Extraction (FSRE) aims to extract facts from a sparse set of labeled corpora.
Recent studies have shown promising results in FSRE by employing Pre-trained Language Models.
We introduce a novel synergistic anchored contrastive pre-training framework.
arXiv Detail & Related papers (2023-12-19T10:16:24Z)
- Flow Factorized Representation Learning [109.51947536586677]
We introduce a generative model which specifies a distinct set of latent probability paths that define different input transformations.
We show that our model achieves higher likelihoods on standard representation learning benchmarks while simultaneously being closer to approximately equivariant models.
arXiv Detail & Related papers (2023-09-22T20:15:37Z)
- Focalized Contrastive View-invariant Learning for Self-supervised Skeleton-based Action Recognition [16.412306012741354]
We propose a self-supervised framework called Focalized Contrastive View-invariant Learning (FoCoViL).
FoCoViL significantly suppresses the view-specific information on the representation space where the viewpoints are coarsely aligned.
It associates actions with common view-invariant properties and simultaneously separates the dissimilar ones.
arXiv Detail & Related papers (2023-04-03T10:12:30Z)
- Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning [53.68371566336254]
We argue that the key to better performance lies in meaningful latent modality structures instead of perfect modality alignment.
Specifically, we design 1) a deep feature separation loss for intra-modality regularization; 2) a Brownian-bridge loss for inter-modality regularization; and 3) a geometric consistency loss for both intra- and inter-modality regularization.
arXiv Detail & Related papers (2023-03-10T14:38:49Z)
- On the duality between contrastive and non-contrastive self-supervised learning [0.0]
Self-supervised learning can be divided into contrastive and non-contrastive approaches; a minimal sketch of both loss families is given just after this list.
We show how close the contrastive and non-contrastive families can be.
We also show the influence (or lack thereof) of design choices on downstream performance.
arXiv Detail & Related papers (2022-06-03T08:04:12Z)
- Contrastive and Non-Contrastive Self-Supervised Learning Recover Global and Local Spectral Embedding Methods [19.587273175563745]
Self-Supervised Learning (SSL) surmises that inputs and pairwise positive relationships are enough to learn meaningful representations.
This paper proposes a unifying framework under the helm of spectral manifold learning to address those limitations.
arXiv Detail & Related papers (2022-05-23T17:59:32Z)
- Concurrent Discrimination and Alignment for Self-Supervised Feature Learning [52.213140525321165]
Existing self-supervised learning methods learn by means of pretext tasks that either (1) discriminate, explicitly specifying which features should be separated, or (2) align, precisely indicating which features should be pulled together.
In this work, we combine the positive aspects of the discriminating and aligning methods, and design a hybrid method that addresses the above issue.
Our method explicitly specifies the repulsion and attraction mechanism respectively by discriminative predictive task and concurrently maximizing mutual information between paired views.
Our experiments on nine established benchmarks show that the proposed model consistently outperforms existing state-of-the-art results in self-supervised and transfer learning.
arXiv Detail & Related papers (2021-08-19T09:07:41Z)
- ReSSL: Relational Self-Supervised Learning with Weak Augmentation [68.47096022526927]
Self-supervised learning has achieved great success in learning visual representations without data annotations.
We introduce a novel relational SSL paradigm that learns representations by modeling the relationship between different instances.
Our proposed ReSSL significantly outperforms the previous state-of-the-art algorithms in terms of both performance and training efficiency.
arXiv Detail & Related papers (2021-07-20T06:53:07Z)
- Deep Clustering by Semantic Contrastive Learning [67.28140787010447]
We introduce a novel variant called Semantic Contrastive Learning (SCL).
It explores the characteristics of both conventional contrastive learning and deep clustering.
It can amplify the strengths of contrastive learning and deep clustering in a unified approach.
arXiv Detail & Related papers (2021-03-03T20:20:48Z)
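Several of the related papers above compare the contrastive and non-contrastive families of SSL objectives. As a rough, illustrative companion to those entries, here is a sketch of a SimCLR-style InfoNCE loss and a BYOL-style loss with a stop-gradient target; the function names, temperature value, and predictor convention are assumptions for illustration, not any paper's exact code.

```python
# Illustrative sketch only: the two loss families analyzed by several related papers.
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    """Contrastive (SimCLR-style): paired views are positives (diagonal), other samples in the batch are negatives."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.T / temperature                     # (N, N) cosine-similarity matrix
    targets = torch.arange(z1.size(0), device=z1.device)  # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

def byol_loss(p1, z2):
    """Non-contrastive (BYOL-style): a predictor output p1 regresses the target embedding z2;
    the stop-gradient (detach) on the target branch is what avoids collapse in practice."""
    p1, z2 = F.normalize(p1, dim=-1), F.normalize(z2.detach(), dim=-1)
    return (2 - 2 * (p1 * z2).sum(dim=-1)).mean()

# Usage: z1, z2 = encoder(view1), encoder(view2); p1 = predictor(z1)
# loss = info_nce(z1, z2)  or  byol_loss(p1, z2)
```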
This list is automatically generated from the titles and abstracts of the papers on this site.