The Geometry of Self-supervised Learning Models and its Impact on
Transfer Learning
- URL: http://arxiv.org/abs/2209.08622v1
- Date: Sun, 18 Sep 2022 18:15:38 GMT
- Title: The Geometry of Self-supervised Learning Models and its Impact on
Transfer Learning
- Authors: Romain Cosentino, Sarath Shekkizhar, Mahdi Soltanolkotabi, Salman
Avestimehr, Antonio Ortega
- Abstract summary: Self-supervised learning (SSL) has emerged as a desirable paradigm in computer vision.
We propose a data-driven geometric strategy to analyze different SSL models using local neighborhoods in the feature space induced by each.
- Score: 62.601681746034956
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised learning (SSL) has emerged as a desirable paradigm in
computer vision due to the inability of supervised models to learn
representations that can generalize in domains with limited labels. The recent
popularity of SSL has led to the development of several models that make use of
diverse training strategies, architectures, and data augmentation policies with
no existing unified framework to study or assess their effectiveness in
transfer learning. We propose a data-driven geometric strategy to analyze
different SSL models using local neighborhoods in the feature space induced by
each. Unlike existing approaches that consider mathematical approximations of
the parameters, individual components, or optimization landscape, our work aims
to explore the geometric properties of the representation manifolds learned by
SSL models. Our proposed manifold graph metrics (MGMs) provide insights into
the geometric similarities and differences between available SSL models, their
invariances with respect to specific augmentations, and their performances on
transfer learning tasks. Our key findings are two fold: (i) contrary to popular
belief, the geometry of SSL models is not tied to its training paradigm
(contrastive, non-contrastive, and cluster-based); (ii) we can predict the
transfer learning capability for a specific model based on the geometric
properties of its semantic and augmentation manifolds.
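As a concrete illustration of the kind of local-neighborhood analysis the abstract describes, the sketch below builds a k-nearest-neighbor graph over features extracted by an SSL backbone and computes a simple per-class neighborhood statistic. This is a minimal sketch under assumed choices (cosine similarity, plain k-NN, a neighborhood-purity statistic); the paper's actual manifold graph metrics and graph construction may differ, and the `extract_features` helper is hypothetical.

```python
import numpy as np

def knn_graph(features, k=10):
    """Build a symmetric k-NN adjacency matrix from L2-normalized features."""
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = z @ z.T                      # cosine similarity matrix
    np.fill_diagonal(sim, -np.inf)     # exclude self-edges
    nbrs = np.argsort(-sim, axis=1)[:, :k]
    n = len(z)
    adj = np.zeros((n, n), dtype=bool)
    rows = np.repeat(np.arange(n), k)
    adj[rows, nbrs.ravel()] = True
    return adj | adj.T                 # symmetrize

def neighborhood_purity(adj, labels):
    """Fraction of graph neighbors sharing a node's label -- a simple
    stand-in for a 'semantic manifold' statistic, not the paper's MGMs."""
    same = labels[:, None] == labels[None, :]
    per_node = (adj & same).sum(1) / np.maximum(adj.sum(1), 1)
    return per_node.mean()

# Usage (hypothetical helper): features of shape (N, d) from any SSL backbone.
# features, labels = extract_features(ssl_model, dataset)
# adj = knn_graph(features, k=15)
# print("mean neighborhood purity:", neighborhood_purity(adj, labels))
```

Statistics of this kind can be compared across SSL models (or across augmentation policies) to probe how the induced feature-space neighborhoods relate to downstream transfer behavior.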
Related papers
- Fine-tuning large language models for domain adaptation: Exploration of training strategies, scaling, model merging and synergistic capabilities [4.389938747401259]
This work explores the effects of fine-tuning strategies on Large Language Models (LLMs) in domains such as materials science and engineering.
We find that the merging of multiple fine-tuned models can lead to the emergence of capabilities that surpass the individual contributions of the parent models.
arXiv Detail & Related papers (2024-09-05T11:49:53Z)
- Examining Changes in Internal Representations of Continual Learning Models Through Tensor Decomposition [5.01338577379149]
Continual learning (CL) has spurred the development of several methods aimed at consolidating previous knowledge across sequential learning.
We propose a novel representation-based evaluation framework for CL models.
arXiv Detail & Related papers (2024-05-06T07:52:44Z)
- Can Generative Models Improve Self-Supervised Representation Learning? [0.7999703756441756]
We introduce a novel framework that enriches the self-supervised learning paradigm by utilizing generative models to produce semantically consistent image augmentations.
Our results show that our framework significantly enhances the quality of learned visual representations by up to 10% Top-1 accuracy in downstream tasks.
arXiv Detail & Related papers (2024-03-09T17:17:07Z)
- A Probabilistic Model Behind Self-Supervised Learning [53.64989127914936]
In self-supervised learning (SSL), representations are learned via an auxiliary task without annotated labels.
We present a generative latent variable model for self-supervised learning.
We show that several families of discriminative SSL, including contrastive methods, induce a comparable distribution over representations.
arXiv Detail & Related papers (2024-02-02T13:31:17Z)
- Improving Self-supervised Molecular Representation Learning using Persistent Homology [6.263470141349622]
Self-supervised learning (SSL) has great potential for molecular representation learning.
In this paper, we study SSL based on persistent homology (PH), a mathematical tool for modeling topological features of data that persist across multiple scales.
arXiv Detail & Related papers (2023-11-29T02:58:30Z)
- Enhancing Representations through Heterogeneous Self-Supervised Learning [61.40674648939691]
We propose Heterogeneous Self-Supervised Learning (HSSL), which enforces a base model to learn from an auxiliary head whose architecture differs from that of the base model.
HSSL endows the base model with new representation-learning characteristics without structural changes.
The HSSL is compatible with various self-supervised methods, achieving superior performances on various downstream tasks.
arXiv Detail & Related papers (2023-10-08T10:44:05Z)
- UniDiff: Advancing Vision-Language Models with Generative and Discriminative Learning [86.91893533388628]
This paper presents UniDiff, a unified multi-modal model that integrates image-text contrastive learning (ITC), text-conditioned image synthesis learning (IS), and reciprocal semantic consistency modeling (RSC).
UniDiff demonstrates versatility in both multi-modal understanding and generative tasks.
arXiv Detail & Related papers (2023-06-01T15:39:38Z)
- Weak Augmentation Guided Relational Self-Supervised Learning [80.0680103295137]
We introduce a novel relational self-supervised learning (ReSSL) framework that learns representations by modeling the relationship between different instances.
Our proposed method employs a sharpened distribution of pairwise similarities among different instances as the relation metric (a rough sketch of this idea appears after this list).
Experimental results show that our proposed ReSSL substantially outperforms the state-of-the-art methods across different network architectures.
arXiv Detail & Related papers (2022-03-16T16:14:19Z)
- Self-Supervised Learning of Graph Neural Networks: A Unified Review [50.71341657322391]
Self-supervised learning is emerging as a new paradigm for making use of large amounts of unlabeled samples.
We provide a unified review of different ways of training graph neural networks (GNNs) using SSL.
Our treatment of SSL methods for GNNs sheds light on the similarities and differences of various methods, setting the stage for developing new methods and algorithms.
arXiv Detail & Related papers (2021-02-22T03:43:45Z)
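The ReSSL entry above mentions using a sharpened distribution of pairwise similarities as a relation metric. Below is a minimal NumPy sketch of that general idea: the similarity distribution of a weakly augmented view (teacher, low temperature) supervises that of a strongly augmented view (student) via cross-entropy against a set of comparison instances. The temperatures, the bank of comparison embeddings, and the exact loss form are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def relation_loss(z_weak, z_strong, bank, t_teacher=0.04, t_student=0.1):
    """Cross-entropy between the sharpened weak-view similarity distribution
    (teacher) and the strong-view distribution (student), both computed
    against a set of other instances ('bank')."""
    def normalize(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    def softmax(logits):
        logits = logits - logits.max(axis=1, keepdims=True)
        e = np.exp(logits)
        return e / e.sum(axis=1, keepdims=True)

    zw, zs, b = normalize(z_weak), normalize(z_strong), normalize(bank)
    p_teacher = softmax(zw @ b.T / t_teacher)   # sharpened target relation
    p_student = softmax(zs @ b.T / t_student)   # prediction to be aligned
    return -(p_teacher * np.log(p_student + 1e-12)).sum(axis=1).mean()
```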
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.