Probabilistic Contrastive Learning with Explicit Concentration on the Hypersphere
- URL: http://arxiv.org/abs/2405.16460v1
- Date: Sun, 26 May 2024 07:08:13 GMT
- Title: Probabilistic Contrastive Learning with Explicit Concentration on the Hypersphere
- Authors: Hongwei Bran Li, Cheng Ouyang, Tamaz Amiranashvili, Matthew S. Rosen, Bjoern Menze, Juan Eugenio Iglesias
- Abstract summary: This paper introduces a new perspective on incorporating uncertainty into contrastive learning by embedding representations within a spherical space.
We leverage the concentration parameter, kappa, as a direct, interpretable measure to quantify uncertainty explicitly.
- Score: 3.572499139455308
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Self-supervised contrastive learning has predominantly adopted deterministic methods, which are not suited for environments characterized by uncertainty and noise. This paper introduces a new perspective on incorporating uncertainty into contrastive learning by embedding representations within a spherical space, inspired by the von Mises-Fisher distribution (vMF). We introduce an unnormalized form of vMF and leverage the concentration parameter, kappa, as a direct, interpretable measure to quantify uncertainty explicitly. This approach not only provides a probabilistic interpretation of the embedding space but also offers a method to calibrate model confidence against varying levels of data corruption and characteristics. Our empirical results demonstrate that the estimated concentration parameter correlates strongly with the degree of unforeseen data corruption encountered at test time, enables failure analysis, and enhances existing out-of-distribution detection methods.
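For reference, the von Mises-Fisher density on the unit hypersphere is proportional to exp(kappa * mu^T x), where mu is a unit mean direction and kappa >= 0 is the concentration. The sketch below illustrates the "unnormalized vMF" idea by dropping the normalizing constant and reading kappa off the norm of a raw embedding. That parameterization, along with the function names `split_direction_and_kappa` and `unnormalized_vmf_log_score`, is an assumption made for illustration; the abstract does not specify how the authors obtain kappa.

```python
import numpy as np

def split_direction_and_kappa(raw, eps=1e-12):
    """Split raw embeddings into unit directions (mu) and their norms.

    Treating the norm as the concentration kappa is an assumption of this
    sketch, not a detail confirmed by the abstract.
    """
    raw = np.asarray(raw, dtype=float)
    kappa = np.linalg.norm(raw, axis=-1)
    mu = raw / np.maximum(kappa, eps)[..., None]
    return mu, kappa

def unnormalized_vmf_log_score(x, mu, kappa):
    """Return kappa * mu^T x, the vMF log-density without its normalizer."""
    return kappa * np.sum(mu * x, axis=-1)

# Two raw embeddings pointing the same way but with different norms.
raw = np.array([[3.0, 4.0], [0.3, 0.4]])
mu, kappa = split_direction_and_kappa(raw)

# Score a unit-norm query that lies exactly on the shared mean direction.
query = np.array([0.6, 0.8])
scores = unnormalized_vmf_log_score(query, mu, kappa)
print(kappa, scores)
```

Under this assumed parameterization, both embeddings share the same direction, so only kappa separates them: the larger-norm embedding is the more concentrated (more confident) one, and a small kappa would flag an uncertain input.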
Related papers
- Combining Statistical Depth and Fermat Distance for Uncertainty Quantification [3.3975558777609915]
We measure the out-of-domain uncertainty in the predictions of neural networks using a statistical notion called "Lens Depth" (LD) combined with Fermat distance.
The proposed method gives excellent qualitative results on toy datasets and can give competitive or better uncertainty estimation on standard deep learning datasets.
arXiv Detail & Related papers (2024-04-12T13:54:21Z)
- One step closer to unbiased aleatoric uncertainty estimation [71.55174353766289]
We propose a new estimation method by actively de-noising the observed data.
By conducting a broad range of experiments, we demonstrate that our proposed approach provides a much closer approximation to the actual data uncertainty than the standard method.
arXiv Detail & Related papers (2023-12-16T14:59:11Z)
- Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling [69.83976050879318]
In large language models (LLMs), identifying sources of uncertainty is an important step toward improving reliability, trustworthiness, and interpretability.
In this paper, we introduce an uncertainty decomposition framework for LLMs, called input clarification ensembling.
Our approach generates a set of clarifications for the input, feeds them into an LLM, and ensembles the corresponding predictions.
arXiv Detail & Related papers (2023-11-15T05:58:35Z)
- Quantification of Predictive Uncertainty via Inference-Time Sampling [57.749601811982096]
We propose a post-hoc sampling strategy for estimating predictive uncertainty accounting for data ambiguity.
The method can generate different plausible outputs for a given input and does not assume parametric forms of predictive distributions.
arXiv Detail & Related papers (2023-08-03T12:43:21Z)
- A Data-Driven Measure of Relative Uncertainty for Misclassification Detection [25.947610541430013]
We introduce a data-driven measure of uncertainty relative to an observer for misclassification detection.
By learning patterns in the distribution of soft-predictions, our uncertainty measure can identify misclassified samples.
We demonstrate empirical improvements over multiple image classification tasks, outperforming state-of-the-art misclassification detection methods.
arXiv Detail & Related papers (2023-06-02T17:32:03Z)
- Integrating Uncertainty into Neural Network-based Speech Enhancement [27.868722093985006]
Supervised masking approaches in the time-frequency domain aim to employ deep neural networks to estimate a multiplicative mask to extract clean speech.
This leads to a single estimate for each input without any guarantees or measures of reliability.
We study the benefits of modeling uncertainty in clean speech estimation.
arXiv Detail & Related papers (2023-05-15T15:55:12Z)
- Decomposing Representations for Deterministic Uncertainty Estimation [34.11413246048065]
We show that current feature density based uncertainty estimators cannot perform well consistently across different OoD detection settings.
We propose to decompose the learned representations and integrate the uncertainties estimated on them separately.
arXiv Detail & Related papers (2021-12-01T22:12:01Z)
- Dense Uncertainty Estimation via an Ensemble-based Conditional Latent Variable Model [68.34559610536614]
We argue that the aleatoric uncertainty is an inherent attribute of the data and can only be correctly estimated with an unbiased oracle model.
We propose a new sampling and selection strategy at train time to approximate the oracle model for aleatoric uncertainty estimation.
Our results show that our solution achieves both accurate deterministic results and reliable uncertainty estimation.
arXiv Detail & Related papers (2021-11-22T08:54:10Z)
- DEUP: Direct Epistemic Uncertainty Prediction [56.087230230128185]
Epistemic uncertainty is part of out-of-sample prediction error due to the lack of knowledge of the learner.
We propose a principled approach for directly estimating epistemic uncertainty by learning to predict generalization error and subtracting an estimate of aleatoric uncertainty.
arXiv Detail & Related papers (2021-02-16T23:50:35Z)
- The Hidden Uncertainty in a Neural Network's Activations [105.4223982696279]
The distribution of a neural network's latent representations has been successfully used to detect out-of-distribution (OOD) data.
This work investigates whether this distribution correlates with a model's epistemic uncertainty, thus indicating its ability to generalise to novel inputs.
arXiv Detail & Related papers (2020-12-05T17:30:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.