Probabilistic Contrastive Loss for Self-Supervised Learning
- URL: http://arxiv.org/abs/2112.01642v1
- Date: Thu, 2 Dec 2021 23:41:52 GMT
- Title: Probabilistic Contrastive Loss for Self-Supervised Learning
- Authors: Shen Li, Jianqing Xu, Bryan Hooi
- Abstract summary: This paper proposes a probabilistic contrastive loss function for self-supervised learning.
Some intriguing properties of the proposed loss function are empirically demonstrated and agree with human-like predictions.
- Score: 25.097498223895016
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper proposes a probabilistic contrastive loss function for
self-supervised learning. The well-known contrastive loss is deterministic and
involves a temperature hyperparameter that scales the inner product between two
normed feature embeddings. By reinterpreting the temperature hyperparameter as
a quantity related to the radius of the hypersphere, we derive a new loss
function that involves a confidence measure which quantifies uncertainty in a
mathematically grounded manner. Some intriguing properties of the proposed
loss function are empirically demonstrated and agree with human-like
predictions. We believe the present work brings a new perspective to the
area of contrastive learning.
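For reference, below is a minimal sketch of the standard deterministic contrastive (InfoNCE / NT-Xent) loss that the abstract builds on, in which a temperature hyperparameter scales the inner products of L2-normalised embeddings. It shows only this baseline; the paper's probabilistic variant, which reinterprets the temperature via the hypersphere radius and introduces a per-sample confidence measure, is not reproduced here, and all function and variable names are illustrative.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.1):
    """Deterministic contrastive (NT-Xent) loss for two augmented views.

    z1, z2: (N, d) feature embeddings of the two views of the same N samples.
    The embeddings are L2-normalised, so the temperature scales the inner
    product between normed features, as described in the abstract.
    """
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)            # (2N, d)
    logits = z @ z.t() / temperature          # (2N, 2N) scaled cosine similarities
    logits.fill_diagonal_(float("-inf"))      # a sample is never its own positive
    n = z1.shape[0]
    # The positive of view-1 sample i is view-2 sample i (index i + N), and vice versa.
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)])
    return F.cross_entropy(logits, targets)
```

With z1 = encoder(aug1(x)) and z2 = encoder(aug2(x)), this loss pulls the two views of each sample together on the unit hypersphere and pushes apart all other pairs; a lower temperature sharpens the contrast, which is the hyperparameter the paper reinterprets probabilistically.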
Related papers
- Dual-Head Knowledge Distillation: Enhancing Logits Utilization with an Auxiliary Head [38.898038672237746]
We introduce a logit-level loss function as a supplement to the widely used probability-level loss function.
We find that naively amalgamating the newly introduced logit-level loss with the previous probability-level loss leads to performance degeneration.
We propose a novel method called dual-head knowledge distillation, which partitions the linear classifier into two classification heads responsible for different losses.
arXiv Detail & Related papers (2024-11-13T12:33:04Z)
- Large Margin Discriminative Loss for Classification [3.3975558777609915]
We introduce a novel discriminative loss function with a large margin in the context of deep learning.
This loss boosts the discriminative power of neural nets, represented by intra-class compactness and inter-class separability.
arXiv Detail & Related papers (2024-05-28T18:10:45Z)
- Uncertainty-boosted Robust Video Activity Anticipation [72.14155465769201]
Video activity anticipation aims to predict what will happen in the future, embracing a broad application prospect ranging from robot vision to autonomous driving.
Despite recent progress, the data uncertainty issue, reflected in the content evolution process and the dynamic correlation among event labels, has largely been ignored.
We propose an uncertainty-boosted robust video activity anticipation framework, which generates uncertainty values to indicate the credibility of the anticipation results.
arXiv Detail & Related papers (2024-04-29T12:31:38Z)
- Evidential Deep Learning: Enhancing Predictive Uncertainty Estimation for Earth System Science Applications [0.32302664881848275]
Evidential deep learning is a technique that extends parametric deep learning to higher-order distributions.
This study compares the uncertainty derived from evidential neural networks to those obtained from ensembles.
We show evidential deep learning models attaining predictive accuracy rivaling standard methods, while robustly quantifying both sources of uncertainty.
arXiv Detail & Related papers (2023-09-22T23:04:51Z)
- Causal inference for the expected number of recurrent events in the presence of a terminal event [0.0]
We study causal inference and efficient estimation for the expected number of recurrent events in the presence of a terminal event.
No absolute continuity assumption is made on the underlying probability distributions of failure, censoring, or the observed data.
arXiv Detail & Related papers (2023-06-28T21:31:25Z)
- On the Joint Interaction of Models, Data, and Features [82.60073661644435]
We introduce a new tool, the interaction tensor, for empirically analyzing the interaction between data and model through features.
Based on these observations, we propose a conceptual framework for feature learning.
Under this framework, the expected accuracy for a single hypothesis and agreement for a pair of hypotheses can both be derived in closed-form.
arXiv Detail & Related papers (2023-06-07T21:35:26Z)
- On Second-Order Scoring Rules for Epistemic Uncertainty Quantification [8.298716599039501]
We show that there seems to be no loss function that provides an incentive for a second-order learner to faithfully represent its uncertainty.
As a main mathematical tool to prove this result, we introduce the generalised notion of second-order scoring rules.
arXiv Detail & Related papers (2023-01-30T08:59:45Z)
- Monotonicity and Double Descent in Uncertainty Estimation with Gaussian Processes [52.92110730286403]
It is commonly believed that the marginal likelihood should be reminiscent of cross-validation metrics and that both should deteriorate with larger input dimensions.
We prove that by tuning hyperparameters, the performance, as measured by the marginal likelihood, improves monotonically with the input dimension.
We also prove that cross-validation metrics exhibit qualitatively different behavior that is characteristic of double descent.
arXiv Detail & Related papers (2022-10-14T08:09:33Z)
- Data-Driven Influence Functions for Optimization-Based Causal Inference [105.5385525290466]
We study a constructive algorithm that approximates Gateaux derivatives for statistical functionals by finite differencing.
We study the case where probability distributions are not known a priori but need to be estimated from data.
arXiv Detail & Related papers (2022-08-29T16:16:22Z)
- On the Difficulty of Epistemic Uncertainty Quantification in Machine Learning: The Case of Direct Uncertainty Estimation through Loss Minimisation [8.298716599039501]
Uncertainty quantification has received increasing attention in machine learning.
Epistemic uncertainty refers to the learner's (lack of) knowledge and appears to be especially difficult to measure and quantify.
We show that loss minimisation does not work for second-order predictors.
arXiv Detail & Related papers (2022-03-11T17:26:05Z)
- Causal Inference Under Unmeasured Confounding With Negative Controls: A Minimax Learning Approach [84.29777236590674]
We study the estimation of causal parameters when not all confounders are observed and instead negative controls are available.
Recent work has shown how these can enable identification and efficient estimation via two so-called bridge functions.
arXiv Detail & Related papers (2021-03-25T17:59:19Z)
- DEUP: Direct Epistemic Uncertainty Prediction [56.087230230128185]
Epistemic uncertainty is the part of out-of-sample prediction error that is due to the learner's lack of knowledge.
We propose a principled approach for directly estimating epistemic uncertainty by learning to predict generalization error and subtracting an estimate of aleatoric uncertainty; a minimal sketch of this idea appears after this list.
arXiv Detail & Related papers (2021-02-16T23:50:35Z)
- Mixability of Integral Losses: a Key to Efficient Online Aggregation of Functional and Probabilistic Forecasts [72.32459441619388]
We adapt basic mixable (and exponentially concave) loss functions to compare functional predictions and prove that these adaptations are also mixable (exp-concave).
As an application of our main result, we prove that various loss functions used for probabilistic forecasting are mixable (exp-concave).
arXiv Detail & Related papers (2019-12-15T14:25:33Z)
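As a closing note on the DEUP entry above, the following is a minimal sketch of the idea of estimating epistemic uncertainty as predicted generalization error minus an aleatoric estimate. The choice of regressor and all names are illustrative assumptions, not that paper's implementation.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def epistemic_uncertainty(x_seen, observed_errors, x_query, aleatoric_estimate):
    """Sketch: predict the main model's generalization error, then subtract
    an estimate of the irreducible (aleatoric) part.

    x_seen:             inputs where out-of-sample errors of the main model were measured
    observed_errors:    the corresponding per-point losses
    x_query:            points at which to score epistemic uncertainty
    aleatoric_estimate: per-point estimate of irreducible noise at x_query
    """
    # Fit a secondary model that predicts the main model's error from the input.
    error_predictor = GradientBoostingRegressor().fit(x_seen, observed_errors)
    predicted_total_error = error_predictor.predict(x_query)
    # Epistemic part = predicted total error minus the aleatoric part, floored at zero.
    return np.maximum(predicted_total_error - aleatoric_estimate, 0.0)
```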
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.