Self-Supervised Learning with Gaussian Processes
- URL: http://arxiv.org/abs/2512.09322v1
- Date: Wed, 10 Dec 2025 05:10:40 GMT
- Title: Self-Supervised Learning with Gaussian Processes
- Authors: Yunshan Duan, Sinead Williamson,
- Abstract summary: Self supervised learning (SSL) is a machine learning paradigm where models learn to understand the underlying structure of data without explicit supervision from labeled samples.<n>To ensure smoothness of the representation space, most SSL methods rely on the ability to generate pairs of observations that are similar to a given instance.<n>We show that GPSSL is closely related to both kernel PCA and VICReg, a popular neural network-based SSL method, but unlike both allows for posterior uncertainties that can be propagated to downstream tasks.
- Score: 0.9058737915650011
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Self supervised learning (SSL) is a machine learning paradigm where models learn to understand the underlying structure of data without explicit supervision from labeled samples. The acquired representations from SSL have demonstrated useful for many downstream tasks including clustering, and linear classification, etc. To ensure smoothness of the representation space, most SSL methods rely on the ability to generate pairs of observations that are similar to a given instance. However, generating these pairs may be challenging for many types of data. Moreover, these methods lack consideration of uncertainty quantification and can perform poorly in out-of-sample prediction settings. To address these limitations, we propose Gaussian process self supervised learning (GPSSL), a novel approach that utilizes Gaussian processes (GP) models on representation learning. GP priors are imposed on the representations, and we obtain a generalized Bayesian posterior minimizing a loss function that encourages informative representations. The covariance function inherent in GPs naturally pulls representations of similar units together, serving as an alternative to using explicitly defined positive samples. We show that GPSSL is closely related to both kernel PCA and VICReg, a popular neural network-based SSL method, but unlike both allows for posterior uncertainties that can be propagated to downstream tasks. Experiments on various datasets, considering classification and regression tasks, demonstrate that GPSSL outperforms traditional methods in terms of accuracy, uncertainty quantification, and error control.
Related papers
- In-Context Function Learning in Large Language Models [19.618773481188626]
Large language models (LLMs) can learn from a few demonstrations provided at inference time.<n>We study this in-context learning phenomenon through the lens of Gaussian Processes (GPs)<n>We find that LLM learning curves are strongly influenced by the function-generating kernels and approach the GP lower bound as the number of demonstrations increases.
arXiv Detail & Related papers (2026-02-12T12:09:48Z) - A Closer Look at Benchmarking Self-Supervised Pre-training with Image Classification [51.35500308126506]
Self-supervised learning (SSL) is a machine learning approach where the data itself provides supervision, eliminating the need for external labels.
We study how classification-based evaluation protocols for SSL correlate and how well they predict downstream performance on different dataset types.
arXiv Detail & Related papers (2024-07-16T23:17:36Z) - Understanding Representation Learnability of Nonlinear Self-Supervised
Learning [13.965135660149212]
Self-supervised learning (SSL) has empirically shown its data representation learnability in many downstream tasks.
Our paper is the first to analyze the learning results of the nonlinear SSL model accurately.
arXiv Detail & Related papers (2024-01-06T13:23:26Z) - Semi-Supervised Class-Agnostic Motion Prediction with Pseudo Label
Regeneration and BEVMix [59.55173022987071]
We study the potential of semi-supervised learning for class-agnostic motion prediction.
Our framework adopts a consistency-based self-training paradigm, enabling the model to learn from unlabeled data.
Our method exhibits comparable performance to weakly and some fully supervised methods.
arXiv Detail & Related papers (2023-12-13T09:32:50Z) - Bridging the Projection Gap: Overcoming Projection Bias Through Parameterized Distance Learning [9.26015904497319]
Generalized zero-shot learning (GZSL) aims to recognize samples from both seen and unseen classes using only seen class samples for training.
GZSL methods are prone to bias towards seen classes during inference due to the projection function being learned from seen classes.
We address this projection bias by proposing to learn a parameterized Mahalanobis distance metric for robust inference.
arXiv Detail & Related papers (2023-09-04T06:41:29Z) - Benchmark for Uncertainty & Robustness in Self-Supervised Learning [0.0]
Self-Supervised Learning is crucial for real-world applications, especially in data-hungry domains such as healthcare and self-driving cars.
In this paper, we explore variants of SSL methods, including Jigsaw Puzzles, Context, Rotation, Geometric Transformations Prediction for vision, as well as BERT and GPT for language tasks.
Our goal is to create a benchmark with outputs from experiments, providing a starting point for new SSL methods in Reliable Machine Learning.
arXiv Detail & Related papers (2022-12-23T15:46:23Z) - Self-Supervised PPG Representation Learning Shows High Inter-Subject
Variability [3.8036939971290007]
We propose a Self-Supervised Learning (SSL) method with a pretext task of signal reconstruction to learn an informative generalized PPG representation.
Results show that in a very limited label data setting (10 samples per class or less), using SSL is beneficial.
SSL may pave the way for the broader use of machine learning models on PPG data in label-scarce regimes.
arXiv Detail & Related papers (2022-12-07T19:02:45Z) - Improving Self-Supervised Learning by Characterizing Idealized
Representations [155.1457170539049]
We prove necessary and sufficient conditions for any task invariant to given data augmentations.
For contrastive learning, our framework prescribes simple but significant improvements to previous methods.
For non-contrastive learning, we use our framework to derive a simple and novel objective.
arXiv Detail & Related papers (2022-09-13T18:01:03Z) - Self-Supervised Learning of Graph Neural Networks: A Unified Review [50.71341657322391]
Self-supervised learning is emerging as a new paradigm for making use of large amounts of unlabeled samples.
We provide a unified review of different ways of training graph neural networks (GNNs) using SSL.
Our treatment of SSL methods for GNNs sheds light on the similarities and differences of various methods, setting the stage for developing new methods and algorithms.
arXiv Detail & Related papers (2021-02-22T03:43:45Z) - On Data-Augmentation and Consistency-Based Semi-Supervised Learning [77.57285768500225]
Recently proposed consistency-based Semi-Supervised Learning (SSL) methods have advanced the state of the art in several SSL tasks.
Despite these advances, the understanding of these methods is still relatively limited.
arXiv Detail & Related papers (2021-01-18T10:12:31Z) - Prototypical Contrastive Learning of Unsupervised Representations [171.3046900127166]
Prototypical Contrastive Learning (PCL) is an unsupervised representation learning method.
PCL implicitly encodes semantic structures of the data into the learned embedding space.
PCL outperforms state-of-the-art instance-wise contrastive learning methods on multiple benchmarks.
arXiv Detail & Related papers (2020-05-11T09:53:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.