Probabilistic Self-supervised Learning via Scoring Rules Minimization
- URL: http://arxiv.org/abs/2309.02048v1
- Date: Tue, 5 Sep 2023 08:48:25 GMT
- Title: Probabilistic Self-supervised Learning via Scoring Rules Minimization
- Authors: Amirhossein Vahidi, Simon Schoßer, Lisa Wimmer, Yawei Li, Bernd Bischl, Eyke Hüllermeier, Mina Rezaei
- Abstract summary: We propose a novel probabilistic self-supervised learning method via Scoring Rule Minimization (ProSMIN) to enhance representation quality and mitigate collapsing representations.
Our method achieves superior accuracy and calibration, surpassing the self-supervised baseline in a wide range of experiments on large-scale datasets.
- Score: 19.347097627898876
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In this paper, we propose a novel probabilistic self-supervised learning method via
Scoring Rule Minimization (ProSMIN), which leverages the power of probabilistic
models to enhance representation quality and mitigate collapsing
representations. Our proposed approach involves two neural networks: the online
network and the target network, which collaborate and learn the diverse
distribution of representations from each other through knowledge distillation.
By presenting the input samples in two augmented formats, the online network is
trained to predict the target network representation of the same sample under a
different augmented view. The two networks are trained via our new loss
function based on proper scoring rules. We provide a theoretical justification
for ProSMIN's convergence, demonstrating the strict propriety of its modified
scoring rule. This insight validates the method's optimization process and
contributes to its robustness and effectiveness in improving representation
quality. We evaluate our probabilistic model on various downstream tasks, such
as in-distribution generalization, out-of-distribution detection, dataset
corruption, low-shot learning, and transfer learning. Our method achieves
superior accuracy and calibration, surpassing the self-supervised baseline in a
wide range of experiments on large-scale datasets such as ImageNet-O and
ImageNet-C, demonstrating its scalability and real-world applicability.
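To make the training procedure concrete, below is a minimal, hypothetical sketch of the two-network setup described above: an online encoder is trained to predict the target encoder's representation of the other augmented view of the same sample, scored with an energy-score-style proper scoring rule, while the target is updated by knowledge distillation through an exponential moving average. The names (Encoder, energy_score, training_step), the EMA update, and the choice of the energy score as the scoring rule are illustrative assumptions, not the paper's exact architecture or its modified rule.
```python
# A minimal sketch of the two-network, scoring-rule-based training step,
# assuming: a toy MLP encoder over pre-extracted features, the (strictly
# proper) energy score as a stand-in for the paper's modified scoring rule,
# and an EMA update as the knowledge-distillation mechanism. All names and
# hyperparameters here are illustrative, not the authors' implementation.
import torch
import torch.nn as nn


class Encoder(nn.Module):
    """Backbone + head producing samples from a predictive distribution
    over representations (a simple Gaussian here, for illustration)."""

    def __init__(self, dim_in=2048, dim_out=256, n_samples=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim_in, 512), nn.ReLU(), nn.Linear(512, dim_out)
        )
        self.log_sigma = nn.Parameter(torch.zeros(dim_out))
        self.n_samples = n_samples

    def forward(self, x):
        mu = self.net(x)                              # (B, D) mean representation
        eps = torch.randn(self.n_samples, *mu.shape)  # (S, B, D) noise
        return mu + eps * self.log_sigma.exp()        # S samples per input


def energy_score(samples, target):
    """Sample-based energy score: E||X - y|| - 0.5 * E||X - X'||.
    Strictly proper for multivariate predictive distributions."""
    term1 = (samples - target.unsqueeze(0)).norm(dim=-1).mean(0)
    term2 = (samples.unsqueeze(0) - samples.unsqueeze(1)).norm(dim=-1).mean(dim=(0, 1))
    return (term1 - 0.5 * term2).mean()


online, target = Encoder(), Encoder()
target.load_state_dict(online.state_dict())
for p in target.parameters():
    p.requires_grad_(False)
opt = torch.optim.SGD(online.parameters(), lr=0.05)


def training_step(view_a, view_b, momentum=0.996):
    """One step: the online network predicts the target network's
    representation of the other augmented view of the same sample."""
    with torch.no_grad():
        t_b = target(view_b).mean(0)          # (B, D) target representation
    loss = energy_score(online(view_a), t_b)  # proper-scoring-rule loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    # Knowledge distillation: the target slowly follows the online network.
    with torch.no_grad():
        for p_t, p_o in zip(target.parameters(), online.parameters()):
            p_t.mul_(momentum).add_(p_o, alpha=1 - momentum)
    return loss.item()
```
Because a strictly proper scoring rule is minimized in expectation only by the true predictive distribution, the loss rewards matching whole distributions of representations rather than single points, which, per the abstract, is what underpins ProSMIN's convergence argument and its resistance to collapsed representations.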
Related papers
- On Discriminative Probabilistic Modeling for Self-Supervised Representation Learning [85.75164588939185]
We study the discriminative probabilistic modeling problem on a continuous domain for (multimodal) self-supervised representation learning.
We conduct generalization error analysis to reveal the limitation of current InfoNCE-based contrastive loss for self-supervised representation learning.
arXiv Detail & Related papers (2024-10-11T18:02:46Z) - Improving Network Interpretability via Explanation Consistency Evaluation [56.14036428778861]
We propose a framework that acquires more explainable activation heatmaps and simultaneously increase the model performance.
Specifically, our framework introduces a new metric, i.e., explanation consistency, to reweight the training samples adaptively in model learning.
Our framework then promotes model learning by paying closer attention to those training samples with a high difference in explanations.
arXiv Detail & Related papers (2024-08-08T17:20:08Z) - Cross-Inferential Networks for Source-free Unsupervised Domain
Adaptation [17.718392065388503]
We propose to explore a new method called cross-inferential networks (CIN).
Our main idea is that, when we adapt the network model to predict the sample labels from encoded features, we use these prediction results to construct new training samples with derived labels.
Our experimental results on benchmark datasets demonstrate that our proposed CIN approach can significantly improve the performance of source-free UDA.
arXiv Detail & Related papers (2023-06-29T14:04:24Z) - Modeling Uncertain Feature Representation for Domain Generalization [49.129544670700525]
We show that our method consistently improves the network generalization ability on multiple vision tasks.
Our methods are simple yet effective and can be readily integrated into networks without additional trainable parameters or loss constraints.
arXiv Detail & Related papers (2023-01-16T14:25:02Z) - Contrastive Learning for Fair Representations [50.95604482330149]
Trained classification models can unintentionally lead to biased representations and predictions.
Existing debiasing methods for classification models, such as adversarial training, are often expensive to train and difficult to optimise.
We propose a method for mitigating bias by incorporating contrastive learning, in which instances sharing the same class label are encouraged to have similar representations.
arXiv Detail & Related papers (2021-09-22T10:47:51Z) - Adversarial Training Reduces Information and Improves Transferability [81.59364510580738]
Recent results show that features of adversarially trained networks for classification, in addition to being robust, enable desirable properties such as invertibility.
We show that adversarial training can improve linear transferability to new tasks, which gives rise to a new trade-off between the transferability of representations and accuracy on the source task.
arXiv Detail & Related papers (2020-07-22T08:30:16Z) - Calibrated Adversarial Refinement for Stochastic Semantic Segmentation [5.849736173068868]
We present a strategy for learning a calibrated predictive distribution over semantic maps, where the probability associated with each prediction reflects its ground truth correctness likelihood.
We demonstrate the versatility and robustness of the approach by achieving state-of-the-art results on the multigrader LIDC dataset and on a modified Cityscapes dataset with injected ambiguities.
We show that the core design can be adapted to other tasks requiring learning a calibrated predictive distribution by experimenting on a toy regression dataset.
arXiv Detail & Related papers (2020-06-23T16:39:59Z) - Learning Diverse Representations for Fast Adaptation to Distribution
Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z)