Temperature as Uncertainty in Contrastive Learning
- URL: http://arxiv.org/abs/2110.04403v1
- Date: Fri, 8 Oct 2021 23:08:30 GMT
- Title: Temperature as Uncertainty in Contrastive Learning
- Authors: Oliver Zhang, Mike Wu, Jasmine Bayrooti, Noah Goodman
- Abstract summary: We propose a simple way to generate uncertainty scores for contrastive methods by re-purposing temperature.
We call this approach "Temperature as Uncertainty", or TaU.
In summary, TaU is a simple yet versatile method for generating uncertainties for contrastive learning.
- Score: 5.8927489390473164
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contrastive learning has demonstrated great capability to learn
representations without annotations, even outperforming supervised baselines.
However, it still lacks important properties useful for real-world application,
one of which is uncertainty. In this paper, we propose a simple way to generate
uncertainty scores for many contrastive methods by re-purposing temperature, a
mysterious hyperparameter used for scaling. By observing that temperature
controls how sensitive the objective is to specific embedding locations, we aim
to learn temperature as an input-dependent variable, treating it as a measure
of embedding confidence. We call this approach "Temperature as Uncertainty", or
TaU. Through experiments, we demonstrate that TaU is useful for
out-of-distribution detection, while remaining competitive with benchmarks on
linear evaluation. Moreover, we show that TaU can be learned on top of
pretrained models, enabling uncertainty scores to be generated post-hoc with
popular off-the-shelf models. In summary, TaU is a simple yet versatile method
for generating uncertainties for contrastive learning. Open source code can be
found at: https://github.com/mhw32/temperature-as-uncertainty-public.
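The abstract's core idea — learning temperature as an input-dependent variable and reading it as embedding confidence — can be sketched as a per-example temperature inside the InfoNCE loss. This is a hedged illustration, not the authors' implementation: the function name, the per-anchor scaling, and the use of a log-temperature head output are all assumptions made for clarity.

```python
import numpy as np

def tau_infonce(z1, z2, log_tau):
    """Sketch of a TaU-style InfoNCE loss (names and parameterization assumed).

    z1, z2  : (N, D) L2-normalized embeddings of two augmented views.
    log_tau : (N,) per-example log-temperature predicted by a small head on
              the encoder; exp(log_tau) is treated as an uncertainty score.
    """
    tau = np.exp(log_tau)                          # per-example temperature
    logits = (z1 @ z2.T) / tau[:, None]            # each anchor scaled by its own tau
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))            # positive pairs on the diagonal
```

At inference time, exp(log_tau) alone would serve as the uncertainty score (e.g. for out-of-distribution detection), with higher temperature read as lower embedding confidence.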
Related papers
- Temperature-Free Loss Function for Contrastive Learning [7.229820415732795]
We propose a novel method to deploy InfoNCE loss without temperature.
Specifically, we replace temperature scaling with the inverse hyperbolic tangent function, resulting in a modified InfoNCE loss.
The proposed method was validated on five benchmarks on contrastive learning, yielding satisfactory results without temperature tuning.
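One plausible reading of this modification is to pass cosine similarities through the inverse hyperbolic tangent instead of dividing by a fixed temperature; artanh expands values near ±1, playing a role analogous to temperature scaling. The sketch below is an assumption based on the abstract, not the paper's verified formulation.

```python
import numpy as np

def temperature_free_infonce(z1, z2, eps=1e-6):
    """Assumed sketch of a temperature-free InfoNCE variant.

    z1, z2: (N, D) L2-normalized embeddings of two views. Cosine
    similarities are mapped through artanh rather than scaled by 1/tau.
    """
    sims = np.clip(z1 @ z2.T, -1 + eps, 1 - eps)   # keep artanh's domain open
    logits = np.arctanh(sims)                       # sharpens similarities near +/-1
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))             # positive pairs on the diagonal
```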
arXiv Detail & Related papers (2025-01-29T14:43:21Z)
- Logit Standardization in Knowledge Distillation [83.31794439964033]
The assumption of a shared temperature between teacher and student implies a mandatory exact match between their logits in terms of logit range and variance.
We propose setting the temperature as the weighted standard deviation of logit and performing a plug-and-play Z-score pre-process of logit standardization.
Our pre-process enables student to focus on essential logit relations from teacher rather than requiring a magnitude match, and can improve the performance of existing logit-based distillation methods.
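The Z-score pre-process described here can be sketched directly: standardize teacher and student logits per sample before softening them, so the student matches logit relations rather than magnitudes. The base temperature and the KL direction below are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def zscore(logits, eps=1e-8):
    """Per-sample Z-score standardization of logits (plug-and-play pre-process)."""
    mu = logits.mean(axis=1, keepdims=True)
    sigma = logits.std(axis=1, keepdims=True)
    return (logits - mu) / (sigma + eps)

def kd_loss(student_logits, teacher_logits, base_temp=2.0):
    """Hedged sketch: KL divergence between softened distributions of
    standardized logits. base_temp=2.0 is an assumed illustrative value."""
    def log_softmax(x):
        x = x - x.max(axis=1, keepdims=True)
        return x - np.log(np.exp(x).sum(axis=1, keepdims=True))
    s = log_softmax(zscore(student_logits) / base_temp)
    t = log_softmax(zscore(teacher_logits) / base_temp)
    p_t = np.exp(t)
    return np.mean((p_t * (t - s)).sum(axis=1))    # KL(teacher || student)
```

Because both sets of logits are standardized first, a student whose logits are a scaled-and-shifted copy of the teacher's incurs zero loss, which is exactly the "relations, not magnitudes" point the abstract makes.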
arXiv Detail & Related papers (2024-03-03T07:54:03Z)
- Selective Learning: Towards Robust Calibration with Dynamic Regularization [79.92633587914659]
Miscalibration in deep learning refers to a discrepancy between a model's predicted confidence and its actual performance.
We introduce Dynamic Regularization (DReg), which aims to learn what should be learned during training, thereby circumventing the confidence-adjustment trade-off.
arXiv Detail & Related papers (2024-02-13T11:25:20Z)
- Improving Training and Inference of Face Recognition Models via Random Temperature Scaling [45.33976405587231]
Random Temperature Scaling (RTS) is proposed for learning reliable face recognition models.
RTS can achieve top performance on both the face recognition and out-of-distribution detection tasks.
The proposed module is light-weight and only adds negligible cost to the model.
arXiv Detail & Related papers (2022-12-02T08:00:03Z)
- The Implicit Delta Method [61.36121543728134]
In this paper, we propose an alternative, the implicit delta method, which works by infinitesimally regularizing the training loss of uncertainty.
We show that the change in the evaluation due to regularization is consistent for the variance of the evaluation estimator, even when the infinitesimal change is approximated by a finite difference.
arXiv Detail & Related papers (2022-11-11T19:34:17Z)
- Reliability-Aware Prediction via Uncertainty Learning for Person Image Retrieval [51.83967175585896]
UAL aims at providing reliability-aware predictions by considering data uncertainty and model uncertainty simultaneously.
Data uncertainty captures the "noise" inherent in the sample, while model uncertainty depicts the model's confidence in the sample's prediction.
arXiv Detail & Related papers (2022-10-24T17:53:20Z)
- Learning Uncertainty For Safety-Oriented Semantic Segmentation In Autonomous Driving [77.39239190539871]
We show how uncertainty estimation can be leveraged to enable safety critical image segmentation in autonomous driving.
We introduce a new uncertainty measure based on disagreeing predictions as measured by a dissimilarity function.
We show experimentally that our proposed approach is much less computationally intensive at inference time than competing methods.
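An uncertainty measure "based on disagreeing predictions as measured by a dissimilarity function" could take many forms; one cheap, hedged possibility (purely illustrative — the paper's actual dissimilarity function is not specified in this summary) is a symmetric KL divergence between two predicted class distributions, e.g. from two augmentations or two heads:

```python
import numpy as np

def disagreement_uncertainty(pred_a, pred_b, eps=1e-8):
    """Illustrative sketch: uncertainty as the dissimilarity between two
    prediction distributions, here a symmetric KL divergence.

    pred_a, pred_b: arrays of class probabilities with classes on the last
    axis (e.g. per-pixel softmax outputs from two forward passes).
    """
    kl_ab = (pred_a * np.log((pred_a + eps) / (pred_b + eps))).sum(axis=-1)
    kl_ba = (pred_b * np.log((pred_b + eps) / (pred_a + eps))).sum(axis=-1)
    return 0.5 * (kl_ab + kl_ba)   # 0 when predictions agree, grows with disagreement
```

A single forward pass per prediction is all this needs, which is consistent with the abstract's claim of low inference-time cost relative to ensemble- or sampling-based methods.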
arXiv Detail & Related papers (2021-05-28T09:23:05Z)
- Deep Learning based Uncertainty Decomposition for Real-time Control [9.067368638784355]
We propose a novel method for detecting the absence of training data using deep learning.
We show its advantages over existing approaches on synthetic and real-world datasets.
We further demonstrate the practicality of this uncertainty estimate in deploying online data-efficient control on a simulated quadcopter.
arXiv Detail & Related papers (2020-10-06T10:46:27Z)
- Data Uncertainty Learning in Face Recognition [23.74716810099911]
Uncertainty is important for noisy images, but seldom explored for face recognition.
It is unclear how uncertainty affects feature learning.
This work applies data uncertainty learning to face recognition.
arXiv Detail & Related papers (2020-03-25T11:40:38Z)
- Binary Classification from Positive Data with Skewed Confidence [85.18941440826309]
Positive-confidence (Pconf) classification is a promising weakly-supervised learning method.
In practice, the confidence may be skewed by bias arising in an annotation process.
We introduce a parameterized model of the skewed confidence and propose a method for selecting the hyperparameter.
arXiv Detail & Related papers (2020-01-29T00:04:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.