A Bayesian Unification of Self-Supervised Clustering and Energy-Based
Models
- URL: http://arxiv.org/abs/2401.00873v2
- Date: Mon, 4 Mar 2024 09:24:35 GMT
- Title: A Bayesian Unification of Self-Supervised Clustering and Energy-Based
Models
- Authors: Emanuele Sansone and Robin Manhaeve
- Abstract summary: We perform a Bayesian analysis of state-of-the-art self-supervised learning objectives.
We show that our objective function outperforms existing self-supervised learning strategies.
We also demonstrate that GEDI can be integrated into a neuro-symbolic framework.
- Score: 11.007541337967027
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Self-supervised learning is a popular and powerful method for utilizing large
amounts of unlabeled data, for which a wide variety of training objectives have
been proposed in the literature. In this study, we perform a Bayesian analysis
of state-of-the-art self-supervised learning objectives, elucidating the
underlying probabilistic graphical models in each class and presenting a
standardized methodology for their derivation from first principles. The
analysis also indicates a natural means of integrating self-supervised learning
with likelihood-based generative models. We instantiate this concept within the
realm of cluster-based self-supervised learning and energy models, introducing
a novel lower bound which is proven to reliably penalize the most important
failure modes. Furthermore, this newly proposed lower bound enables the
training of a standard backbone architecture without the necessity for
asymmetric elements such as stop gradients, momentum encoders, or specialized
clustering layers - typically introduced to avoid learning trivial solutions.
Our theoretical findings are substantiated through experiments on synthetic and
real-world data, including SVHN, CIFAR10, and CIFAR100, thus showing that our
objective function allows us to outperform existing self-supervised learning
strategies in terms of clustering, generation and out-of-distribution detection
performance by a wide margin. We also demonstrate that GEDI (GEnerative DIscriminative training) can be integrated
into a neuro-symbolic framework to mitigate the reasoning shortcut problem and
to learn higher quality symbolic representations thanks to the enhanced
classification performance.
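To ground the abstract's description, here is a minimal, illustrative sketch of the three ingredients it names: an invariance term from cluster-based self-supervised learning, an anti-collapse term that replaces stop gradients and momentum encoders, and a JEM-style energy term that lets the same backbone double as a likelihood-based generative model. This is an assumed reconstruction, not the paper's exact lower bound; all names, weightings, and the use of SGLD-drawn negative samples are our assumptions.

```python
# Illustrative sketch of a GEDI-style joint objective (an assumption, not the
# paper's exact bound): cluster logits from a plain backbone provide both a
# discriminative clustering loss and a JEM-style energy for generative training.
import torch
import torch.nn.functional as F

def gedi_style_loss(logits_a, logits_b, neg_logits, eps=1e-8):
    """
    logits_a, logits_b: cluster logits for two augmented views of a data batch.
    neg_logits: cluster logits for negative samples (e.g. drawn via SGLD),
                used to estimate the EBM normalization term.
    """
    # (1) Invariance: cluster posteriors of the two views should agree.
    p_a = F.softmax(logits_a, dim=1)
    invariance = -(p_a * F.log_softmax(logits_b, dim=1)).sum(dim=1).mean()

    # (2) Anti-collapse: push the batch-averaged posterior towards uniform,
    # penalizing the trivial solution where all points share one cluster.
    marginal = p_a.mean(dim=0)
    uniformity = (marginal * (marginal + eps).log()).sum()  # negative entropy

    # (3) Generative EBM term: read E(x) = -logsumexp_y f(x)[y] as an energy,
    # lowering it on data and raising it on negative samples (CD-style).
    e_data = -torch.logsumexp(logits_a, dim=1)
    e_neg = -torch.logsumexp(neg_logits, dim=1)
    generative = e_data.mean() - e_neg.mean()

    return invariance + uniformity + generative

# Usage with a hypothetical backbone net producing K cluster logits:
# loss = gedi_style_loss(net(aug(x)), net(aug(x)), net(sgld_samples))
```

Because all three terms act on the plain cluster logits of a symmetric backbone, nothing in the sketch needs asymmetric machinery: the uniformity and energy terms already penalize the trivial solutions that stop gradients and momentum encoders are usually introduced to avoid.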
Related papers
- Exploring the Precise Dynamics of Single-Layer GAN Models: Leveraging Multi-Feature Discriminators for High-Dimensional Subspace Learning [0.0]
We study the training dynamics of a single-layer GAN model from the perspective of subspace learning.
By bridging our analysis to the realm of subspace learning, we systematically compare the efficacy of GAN-based methods against conventional approaches.
arXiv Detail & Related papers (2024-11-01T10:21:12Z) - GCC: Generative Calibration Clustering [55.44944397168619]
We propose a novel Generative Calibration Clustering (GCC) method to incorporate feature learning and augmentation into the clustering procedure.
First, we develop a discriminative feature alignment mechanism to discover intrinsic relationships between real and generated samples.
Second, we design a self-supervised metric-learning scheme to generate more reliable cluster assignments.
arXiv Detail & Related papers (2024-04-14T01:51:11Z) - A Probabilistic Model Behind Self-Supervised Learning [53.64989127914936]
In self-supervised learning (SSL), representations are learned via an auxiliary task without annotated labels.
We present a generative latent variable model for self-supervised learning.
We show that several families of discriminative SSL methods, including contrastive methods, induce comparable distributions over representations.
arXiv Detail & Related papers (2024-02-02T13:31:17Z) - A Novel Neural-symbolic System under Statistical Relational Learning [50.747658038910565]
We propose a general bi-level probabilistic graphical reasoning framework called GBPGR.
In GBPGR, the results of symbolic reasoning are utilized to refine and correct the predictions made by the deep learning models.
Our approach achieves high performance and exhibits effective generalization in both transductive and inductive tasks.
arXiv Detail & Related papers (2023-09-16T09:15:37Z) - Semi-supervised learning made simple with self-supervised clustering [65.98152950607707]
Self-supervised learning models have been shown to learn rich visual representations without requiring human annotations.
We propose a conceptually simple yet empirically powerful approach to turn clustering-based self-supervised methods into semi-supervised learners.
arXiv Detail & Related papers (2023-06-13T01:09:18Z) - Universal Domain Adaptation from Foundation Models: A Baseline Study [58.51162198585434]
We conduct empirical studies of state-of-the-art UniDA methods using foundation models.
We introduce CLIP distillation, a parameter-free method specifically designed to distill target knowledge from CLIP models.
Although simple, our method outperforms previous approaches in most benchmark tasks.
arXiv Detail & Related papers (2023-05-18T16:28:29Z) - Learning Symbolic Representations Through Joint GEnerative and
DIscriminative Training [3.6804038214708563]
GEDI is a Bayesian framework that combines self-supervised learning objectives with likelihood-based generative models.
We demonstrate that GEDI outperforms existing self-supervised learning strategies in terms of clustering performance by a significant margin.
arXiv Detail & Related papers (2023-04-22T09:35:51Z) - GEDI: GEnerative and DIscriminative Training for Self-Supervised
Learning [3.6804038214708563]
We study state-of-the-art self-supervised learning objectives and propose a unified formulation based on likelihood learning.
We refer to this combined framework as GEDI, which stands for GEnerative and DIscriminative training.
We show that GEDI outperforms existing self-supervised learning strategies in terms of clustering performance by a wide margin.
arXiv Detail & Related papers (2022-12-27T09:33:50Z) - Mitigating Forgetting in Online Continual Learning via Contrasting
Semantically Distinct Augmentations [22.289830907729705]
Online continual learning (OCL) aims to enable a model to learn from a non-stationary data stream, continuously acquiring new knowledge while retaining what it has already learnt.
The main challenge is the "catastrophic forgetting" issue: the inability to retain previously learnt knowledge while learning new tasks.
arXiv Detail & Related papers (2022-11-10T05:29:43Z) - A Unified Contrastive Energy-based Model for Understanding the
Generative Ability of Adversarial Training [64.71254710803368]
Adversarial Training (AT) is an effective approach to enhance the robustness of deep neural networks.
We demystify this phenomenon by developing a unified probabilistic framework called Contrastive Energy-based Models (CEM).
From this framework, we derive principled adversarial learning and sampling methods (see the classifier-as-EBM identity after this list).
arXiv Detail & Related papers (2022-03-25T05:33:34Z)