Taming Overconfident Prediction on Unlabeled Data from Hindsight
- URL: http://arxiv.org/abs/2112.08200v1
- Date: Wed, 15 Dec 2021 15:17:02 GMT
- Title: Taming Overconfident Prediction on Unlabeled Data from Hindsight
- Authors: Jing Li, Yuangang Pan, Ivor W. Tsang
- Abstract summary: Minimizing prediction uncertainty on unlabeled data is a key factor in achieving good performance in semi-supervised learning.
This paper proposes a dual mechanism, named ADaptive Sharpening (ADS), which first applies a soft-threshold to adaptively mask out determinate and negligible predictions.
ADS significantly improves state-of-the-art SSL methods when used as a plug-in.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Minimizing prediction uncertainty on unlabeled data is a key factor in
achieving good performance in semi-supervised learning (SSL). The prediction
uncertainty is typically expressed as the *entropy* computed from the
transformed probabilities in the output space. Most existing works distill
low-entropy predictions by either accepting the determining class (the one with
the largest probability) as the true label or suppressing subtle predictions
(those with smaller probabilities). These distillation strategies are
usually heuristic and less informative for model training. Motivated by this
observation, this paper proposes a dual mechanism, named ADaptive Sharpening
(ADS), which first applies a soft-threshold to adaptively mask out determinate
and negligible predictions, and then seamlessly sharpens the informed
predictions, distilling certain predictions with the informed ones only. More
importantly, we theoretically analyze the traits of ADS by comparison with
various distillation strategies. Extensive experiments verify that ADS
significantly improves state-of-the-art SSL methods when used as a plug-in.
Our proposed ADS forges a cornerstone for future distillation-based SSL
research.
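As a rough illustration of the dual mechanism described in the abstract, the sketch below applies a soft threshold to mask out negligible probabilities and then sharpens the surviving ones with a temperature. This is only a plausible reading, not the paper's method: `tau` and `temperature` are hypothetical hyperparameters, and the actual ADS rule (which also handles already-determinate predictions adaptively) is more involved.

```python
import numpy as np

def adaptive_sharpen(probs, tau=0.1, temperature=0.5):
    """Illustrative sketch only: soft-threshold masking followed by
    temperature sharpening. `tau` and `temperature` are assumed
    hyperparameters; the paper's actual ADS rule differs in detail."""
    probs = np.asarray(probs, dtype=float)
    # Mask out negligible predictions (below the threshold) ...
    masked = np.where(probs >= tau, probs, 0.0)
    # ... then sharpen the remaining (informed) predictions.
    sharpened = masked ** (1.0 / temperature)
    # Renormalize so the result is again a probability distribution.
    return sharpened / sharpened.sum()
```

With `temperature < 1`, mass shifts toward the dominant classes, lowering the entropy of the distilled prediction.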
Related papers
- Provably Reliable Conformal Prediction Sets in the Presence of Data Poisoning [53.42244686183879]
Conformal prediction provides model-agnostic and distribution-free uncertainty quantification.
Yet, conformal prediction is not reliable under poisoning attacks where adversaries manipulate both training and calibration data.
We propose reliable prediction sets (RPS): the first efficient method for constructing conformal prediction sets with provable reliability guarantees under poisoning.
arXiv Detail & Related papers (2024-10-13T15:37:11Z)
- Augmented prediction of a true class for Positive Unlabeled data under selection bias [0.8594140167290099]
We introduce a new observational setting for Positive Unlabeled (PU) data in which the observations at prediction time are also labeled.
We argue that this additional information is important for prediction, and call this task "augmented PU prediction".
We introduce several variants of the empirical Bayes rule in this scenario and investigate their performance.
arXiv Detail & Related papers (2024-07-14T19:58:01Z)
- Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z)
- Do not trust what you trust: Miscalibration in Semi-supervised Learning [21.20806568508201]
State-of-the-art semi-supervised learning (SSL) approaches rely on highly confident predictions to serve as pseudo-labels that guide the training on unlabeled samples.
We show that SSL methods based on pseudo-labels are significantly miscalibrated, and formally demonstrate that their training implicitly minimizes the min-entropy.
We integrate a simple penalty term that encourages the logits of the predictions on unlabeled samples to remain low, preventing the network predictions from becoming overconfident.
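A penalty of this flavor can be sketched as a squared-norm term on the unlabeled logits; this is an assumed form for illustration (`weight` is a hypothetical hyperparameter), not the paper's exact regularizer.

```python
import numpy as np

def low_logit_penalty(logits, weight=0.1):
    """Hypothetical penalty discouraging large logit magnitudes on
    unlabeled samples; the published penalty may take a different form."""
    logits = np.asarray(logits, dtype=float)
    # Mean squared L2 norm of the logit vectors, scaled by a weight.
    return weight * float(np.mean(np.sum(logits ** 2, axis=-1)))
```

Because the penalty grows with logit magnitude, minimizing it alongside the SSL loss pushes the network away from extreme (overconfident) outputs.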
arXiv Detail & Related papers (2024-03-22T18:43:46Z)
- Uncertainty-Calibrated Test-Time Model Adaptation without Forgetting [55.17761802332469]
Test-time adaptation (TTA) seeks to tackle potential distribution shifts between training and test data by adapting a given model w.r.t. any test sample.
Prior methods perform backpropagation for each test sample, resulting in unbearable optimization costs for many applications.
We propose an Efficient Anti-Forgetting Test-Time Adaptation (EATA) method which develops an active sample selection criterion to identify reliable and non-redundant samples.
arXiv Detail & Related papers (2024-03-18T05:49:45Z)
- Conformal Prediction for Deep Classifier via Label Ranking [29.784336674173616]
Conformal prediction is a statistical framework that generates prediction sets with a desired coverage guarantee.
We propose a novel algorithm named *Sorted Adaptive Prediction Sets* (SAPS).
SAPS discards all the probability values except for the maximum softmax probability.
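To illustrate the "only the maximum softmax probability matters" idea, the sketch below builds a nonconformity score from the top probability and the label's rank alone. This is a simplified, deterministic stand-in: `lam` is a hypothetical rank-penalty weight, and the published SAPS score also includes a randomization term omitted here.

```python
import numpy as np

def saps_like_score(probs, label, lam=0.5):
    """Illustrative nonconformity score in the spirit of SAPS: it depends
    only on the maximum softmax probability and the label's rank, never on
    the other probability values. Simplified; not the exact published score."""
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(-probs)                   # labels sorted by probability
    rank = int(np.where(order == label)[0][0])   # 0 for the top label
    p_max = float(probs.max())
    # Top label: score is just p_max; others pay a rank penalty.
    return p_max if rank == 0 else p_max + lam * rank
```

Note that perturbing the non-maximum probabilities leaves the score unchanged, which is exactly the "discard all values except the max" behavior the summary describes.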
arXiv Detail & Related papers (2023-10-10T08:54:14Z)
- LMD: Light-weight Prediction Quality Estimation for Object Detection in Lidar Point Clouds [3.927702899922668]
Object detection on Lidar point cloud data is a promising technology for autonomous driving and robotics.
Uncertainty estimation is a crucial component for downstream tasks, as deep neural networks remain error-prone even for predictions with high confidence.
We propose LidarMetaDetect, a light-weight post-processing scheme for prediction quality estimation.
Our experiments show a significant increase in statistical reliability when separating true from false predictions.
arXiv Detail & Related papers (2023-06-13T15:13:29Z)
- ADT-SSL: Adaptive Dual-Threshold for Semi-Supervised Learning [68.53717108812297]
Semi-Supervised Learning (SSL) has advanced classification tasks by inputting both labeled and unlabeled data to train a model jointly.
This paper proposes an Adaptive Dual-Threshold method for Semi-Supervised Learning (ADT-SSL).
Experimental results show that the proposed ADT-SSL achieves state-of-the-art classification accuracy.
arXiv Detail & Related papers (2022-05-21T11:52:08Z)
- Multi-label Chaining with Imprecise Probabilities [0.0]
We present two different strategies to extend the classical multi-label chaining approach to handle imprecise probability estimates.
The main reasons one could have for using such estimations are (1) to make cautious predictions when a high uncertainty is detected in the chaining and (2) to make better precise predictions by avoiding biases caused in early decisions in the chaining.
Our experimental results on missing labels, which investigate how reliable these predictions are in both approaches, indicate that our approaches are appropriately cautious on those hard-to-predict instances where the precise models fail.
arXiv Detail & Related papers (2021-07-15T16:43:31Z)
- Regularizing Class-wise Predictions via Self-knowledge Distillation [80.76254453115766]
We propose a new regularization method that penalizes the predictive distribution between similar samples.
This results in regularizing the dark knowledge (i.e., the knowledge on wrong predictions) of a single network.
Our experimental results on various image classification tasks demonstrate that the simple yet powerful method can significantly improve the generalization ability.
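One way to penalize the predictive distribution between similar samples is a divergence between the softened outputs of two samples presumed to share a class. The sketch below uses a plain KL divergence as an assumed choice; the paper's exact formulation (e.g. temperature scaling, which pair statistics are used) may differ.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a logit vector."""
    e = np.exp(np.asarray(z, dtype=float) - np.max(z))
    return e / e.sum()

def classwise_distill_penalty(logits_a, logits_b):
    """Sketch: KL divergence between the predictive distributions of two
    samples assumed to belong to the same class. An illustrative choice,
    not the paper's exact regularizer."""
    p = softmax(logits_a)
    q = softmax(logits_b)
    # KL(p || q): zero when the two predictions agree, positive otherwise.
    return float(np.sum(p * (np.log(p) - np.log(q))))
```

Matching the full distributions, rather than only the argmax, is what regularizes the "dark knowledge" carried in the wrong-class probabilities.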
arXiv Detail & Related papers (2020-03-31T06:03:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.