Going Beyond One-Hot Encoding in Classification: Can Human Uncertainty
Improve Model Performance?
- URL: http://arxiv.org/abs/2205.15265v1
- Date: Mon, 30 May 2022 17:19:11 GMT
- Title: Going Beyond One-Hot Encoding in Classification: Can Human Uncertainty
Improve Model Performance?
- Authors: Christoph Koller, G\"oran Kauermann, Xiao Xiang Zhu
- Abstract summary: We show that label uncertainty is explicitly embedded into the training process via distributional labels.
The incorporation of label uncertainty helps the model to generalize better to unseen data and increases model performance.
Similar to existing calibration methods, the distributional labels lead to better-calibrated probabilities, which in turn yield more certain and trustworthy predictions.
- Score: 14.610038284393166
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Technological and computational advances continuously drive forward the broad
field of deep learning. In recent years, the derivation of quantities
describing theuncertainty in the prediction - which naturally accompanies the
modeling process - has sparked general interest in the deep learning community.
Often neglected in the machine learning setting is the human uncertainty that
influences numerous labeling processes. As the core of this work, label
uncertainty is explicitly embedded into the training process via distributional
labels. We demonstrate the effectiveness of our approach on image
classification with a remote sensing data set that contains multiple label
votes by domain experts for each image: The incorporation of label uncertainty
helps the model to generalize better to unseen data and increases model
performance. Similar to existing calibration methods, the distributional labels
lead to better-calibrated probabilities, which in turn yield more certain and
trustworthy predictions.
Related papers
- Learning with Confidence: Training Better Classifiers from Soft Labels [0.0]
In supervised machine learning, models are typically trained using data with hard labels, i.e., definite assignments of class membership.
We investigate whether incorporating label uncertainty, represented as discrete probability distributions over the class labels, improves the predictive performance of classification models.
arXiv Detail & Related papers (2024-09-24T13:12:29Z) - Uncertainty-aware self-training with expectation maximization basis transformation [9.7527450662978]
We propose a new self-training framework to combine uncertainty information of both model and dataset.
Specifically, we propose to use Expectation-Maximization (EM) to smooth the labels and comprehensively estimate the uncertainty information.
arXiv Detail & Related papers (2024-05-02T11:01:31Z) - Probabilistic Test-Time Generalization by Variational Neighbor-Labeling [62.158807685159736]
This paper strives for domain generalization, where models are trained exclusively on source domains before being deployed on unseen target domains.
Probability pseudo-labeling of target samples to generalize the source-trained model to the target domain at test time.
Variational neighbor labels that incorporate the information of neighboring target samples to generate more robust pseudo labels.
arXiv Detail & Related papers (2023-07-08T18:58:08Z) - Semi-Supervised Deep Regression with Uncertainty Consistency and
Variational Model Ensembling via Bayesian Neural Networks [31.67508478764597]
We propose a novel approach to semi-supervised regression, namely Uncertainty-Consistent Variational Model Ensembling (UCVME)
Our consistency loss significantly improves uncertainty estimates and allows higher quality pseudo-labels to be assigned greater importance under heteroscedastic regression.
Experiments show that our method outperforms state-of-the-art alternatives on different tasks and can be competitive with supervised methods that use full labels.
arXiv Detail & Related papers (2023-02-15T10:40:51Z) - SoftMatch: Addressing the Quantity-Quality Trade-off in Semi-supervised
Learning [101.86916775218403]
This paper revisits the popular pseudo-labeling methods via a unified sample weighting formulation.
We propose SoftMatch to overcome the trade-off by maintaining both high quantity and high quality of pseudo-labels during training.
In experiments, SoftMatch shows substantial improvements across a wide variety of benchmarks, including image, text, and imbalanced classification.
arXiv Detail & Related papers (2023-01-26T03:53:25Z) - Uncertainty-aware Label Distribution Learning for Facial Expression
Recognition [13.321770808076398]
We propose a new uncertainty-aware label distribution learning method to improve the robustness of deep models against uncertainty and ambiguity.
Our method can be easily integrated into a deep network to obtain more training supervision and improve recognition accuracy.
arXiv Detail & Related papers (2022-09-21T15:48:41Z) - Debiased Pseudo Labeling in Self-Training [77.83549261035277]
Deep neural networks achieve remarkable performances on a wide range of tasks with the aid of large-scale labeled datasets.
To mitigate the requirement for labeled data, self-training is widely used in both academia and industry by pseudo labeling on readily-available unlabeled data.
We propose Debiased, in which the generation and utilization of pseudo labels are decoupled by two independent heads.
arXiv Detail & Related papers (2022-02-15T02:14:33Z) - Learning with Proper Partial Labels [87.65718705642819]
Partial-label learning is a kind of weakly-supervised learning with inexact labels.
We show that this proper partial-label learning framework includes many previous partial-label learning settings.
We then derive a unified unbiased estimator of the classification risk.
arXiv Detail & Related papers (2021-12-23T01:37:03Z) - Credal Self-Supervised Learning [0.0]
We show how to let the learner generate "pseudo-supervision" for unlabeled instances.
In combination with consistency regularization, pseudo-labeling has shown promising performance in various domains.
We compare our methodology to state-of-the-art self-supervision approaches.
arXiv Detail & Related papers (2021-06-22T15:19:04Z) - In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label
Selection Framework for Semi-Supervised Learning [53.1047775185362]
Pseudo-labeling (PL) is a general SSL approach that does not have this constraint but performs relatively poorly in its original formulation.
We argue that PL underperforms due to the erroneous high confidence predictions from poorly calibrated models.
We propose an uncertainty-aware pseudo-label selection (UPS) framework which improves pseudo labeling accuracy by drastically reducing the amount of noise encountered in the training process.
arXiv Detail & Related papers (2021-01-15T23:29:57Z) - Out-distribution aware Self-training in an Open World Setting [62.19882458285749]
We leverage unlabeled data in an open world setting to further improve prediction performance.
We introduce out-distribution aware self-training, which includes a careful sample selection strategy.
Our classifiers are by design out-distribution aware and can thus distinguish task-related inputs from unrelated ones.
arXiv Detail & Related papers (2020-12-21T12:25:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.