GUESS: Generative Uncertainty Ensemble for Self Supervision
- URL: http://arxiv.org/abs/2412.02896v1
- Date: Tue, 03 Dec 2024 22:59:23 GMT
- Title: GUESS: Generative Uncertainty Ensemble for Self Supervision
- Authors: Salman Mohamadi, Gianfranco Doretto, Donald A. Adjeroh,
- Abstract summary: Self-supervised learning (SSL) frameworks consist of pretext task, and loss function aiming to learn useful general features from unlabeled data.<n>Inattentive or deterministic enforcement of the invariance to any kind of data augmentation is generally not only inefficient, but also potentially detrimental to performance on the downstream tasks.<n>We introduce a new approach GUESS, a pseudo-whitening framework, composed of controlled uncertainty injection, a new architecture, and a new loss function.
- Score: 6.963971634605796
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Self-supervised learning (SSL) frameworks consist of pretext task, and loss function aiming to learn useful general features from unlabeled data. The basic idea of most SSL baselines revolves around enforcing the invariance to a variety of data augmentations via the loss function. However, one main issue is that, inattentive or deterministic enforcement of the invariance to any kind of data augmentation is generally not only inefficient, but also potentially detrimental to performance on the downstream tasks. In this work, we investigate the issue from the viewpoint of uncertainty in invariance representation. Uncertainty representation is fairly under-explored in the design of SSL architectures as well as loss functions. We incorporate uncertainty representation in both loss function as well as architecture design aiming for more data-dependent invariance enforcement. The former is represented in the form of data-derived uncertainty in SSL loss function resulting in a generative-discriminative loss function. The latter is achieved by feeding slightly different distorted versions of samples to the ensemble aiming for learning better and more robust representation. Specifically, building upon the recent methods that use hard and soft whitening (a.k.a redundancy reduction), we introduce a new approach GUESS, a pseudo-whitening framework, composed of controlled uncertainty injection, a new architecture, and a new loss function. We include detailed results and ablation analysis establishing GUESS as a new baseline.
Related papers
- EnsLoss: Stochastic Calibrated Loss Ensembles for Preventing Overfitting in Classification [1.3778851745408134]
We propose a novel ensemble method, namely EnsLoss, to combine loss functions within the Empirical risk minimization framework.
We first transform the CC conditions of losses into loss-derivatives, thereby bypassing the need for explicit loss functions.
We theoretically establish the statistical consistency of our approach and provide insights into its benefits.
arXiv Detail & Related papers (2024-09-02T02:40:42Z) - Smooth Pseudo-Labeling [4.1569253650826195]
A fruitful method in Semi-Supervised Learning (SSL) is Pseudo-Labeling (PL)
PL suffers from the important drawback that the associated loss function has discontinuities in its derivatives, which cause instabilities in performance when labels are very scarce.
We introduce a new benchmark, where labeled images are selected randomly from the whole dataset, without imposing representation of each class proportional to its frequency in the dataset.
arXiv Detail & Related papers (2024-05-23T08:33:07Z) - Enhancing Consistency and Mitigating Bias: A Data Replay Approach for
Incremental Learning [100.7407460674153]
Deep learning systems are prone to catastrophic forgetting when learning from a sequence of tasks.
To mitigate the problem, a line of methods propose to replay the data of experienced tasks when learning new tasks.
However, it is not expected in practice considering the memory constraint or data privacy issue.
As a replacement, data-free data replay methods are proposed by inverting samples from the classification model.
arXiv Detail & Related papers (2024-01-12T12:51:12Z) - Adaptive Negative Evidential Deep Learning for Open-set Semi-supervised Learning [69.81438976273866]
Open-set semi-supervised learning (Open-set SSL) considers a more practical scenario, where unlabeled data and test data contain new categories (outliers) not observed in labeled data (inliers)
We introduce evidential deep learning (EDL) as an outlier detector to quantify different types of uncertainty, and design different uncertainty metrics for self-training and inference.
We propose a novel adaptive negative optimization strategy, making EDL more tailored to the unlabeled dataset containing both inliers and outliers.
arXiv Detail & Related papers (2023-03-21T09:07:15Z) - Understanding Collapse in Non-Contrastive Learning [122.2499276246997]
We show that SimSiam representations undergo partial dimensional collapse if the model is too small relative to the dataset size.
We propose a metric to measure the degree of this collapse and show that it can be used to forecast the downstream task performance without any fine-tuning or labels.
arXiv Detail & Related papers (2022-09-29T17:59:55Z) - Distributionally Robust Semi-Supervised Learning Over Graphs [68.29280230284712]
Semi-supervised learning (SSL) over graph-structured data emerges in many network science applications.
To efficiently manage learning over graphs, variants of graph neural networks (GNNs) have been developed recently.
Despite their success in practice, most of existing methods are unable to handle graphs with uncertain nodal attributes.
Challenges also arise due to distributional uncertainties associated with data acquired by noisy measurements.
A distributionally robust learning framework is developed, where the objective is to train models that exhibit quantifiable robustness against perturbations.
arXiv Detail & Related papers (2021-10-20T14:23:54Z) - Ensemble of Loss Functions to Improve Generalizability of Deep Metric
Learning methods [0.609170287691728]
We propose novel approaches to combine different losses built on top of a shared deep feature extractor.
We evaluate our methods on some popular datasets from the machine vision domain in conventional Zero-Shot-Learning (ZSL) settings.
arXiv Detail & Related papers (2021-07-02T15:19:46Z) - Continual Learning for Fake Audio Detection [62.54860236190694]
This paper proposes detecting fake without forgetting, a continual-learning-based method, to make the model learn new spoofing attacks incrementally.
Experiments are conducted on the ASVspoof 2019 dataset.
arXiv Detail & Related papers (2021-04-15T07:57:05Z) - Second-Moment Loss: A Novel Regression Objective for Improved
Uncertainties [7.766663822644739]
Quantification of uncertainty is one of the most promising approaches to establish safe machine learning.
One of the most commonly used approaches so far is Monte Carlo dropout, which is computationally cheap and easy to apply in practice.
We propose a new objective, referred to as second-moment loss ( UCI), to address this issue.
arXiv Detail & Related papers (2020-12-23T14:17:33Z) - Adaptive Weighted Discriminator for Training Generative Adversarial
Networks [11.68198403603969]
We introduce a new family of discriminator loss functions that adopts a weighted sum of real and fake parts.
Our method can be potentially applied to any discriminator model with a loss that is a sum of the real and fake parts.
arXiv Detail & Related papers (2020-12-05T23:55:42Z) - Understanding Self-supervised Learning with Dual Deep Networks [74.92916579635336]
We propose a novel framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks.
We prove that in each SGD update of SimCLR with various loss functions, the weights at each layer are updated by a emphcovariance operator.
To further study what role the covariance operator plays and which features are learned in such a process, we model data generation and augmentation processes through a emphhierarchical latent tree model (HLTM)
arXiv Detail & Related papers (2020-10-01T17:51:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.