Mitigating Class Boundary Label Uncertainty to Reduce Both Model Bias
and Variance
- URL: http://arxiv.org/abs/2002.09963v1
- Date: Sun, 23 Feb 2020 18:24:04 GMT
- Title: Mitigating Class Boundary Label Uncertainty to Reduce Both Model Bias
and Variance
- Authors: Matthew Almeida, Wei Ding, Scott Crouter, Ping Chen
- Abstract summary: We investigate a new approach to handle inaccuracy and uncertainty in the training data labels.
Our method can reduce both bias and variance by estimating the pointwise label uncertainty of the training set.
- Score: 4.563176550691304
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The study of model bias and variance with respect to decision boundaries is
critically important in supervised classification. There is generally a
tradeoff between the two, as fine-tuning of the decision boundary of a
classification model to accommodate more boundary training samples (i.e.,
higher model complexity) may improve training accuracy (i.e., lower bias) but
hurt generalization against unseen data (i.e., higher variance). By focusing on
just classification boundary fine-tuning and model complexity, it is difficult
to reduce both bias and variance. To overcome this dilemma, we take a different
perspective and investigate a new approach to handle inaccuracy and uncertainty
in the training data labels, which are inevitable in many applications where
labels are conceptual and labeling is performed by human annotators. The
process of classification can be undermined by uncertainty in the labels of the
training data; extending a boundary to accommodate an inaccurately labeled
point will increase both bias and variance. Our novel method can reduce both
bias and variance by estimating the pointwise label uncertainty of the training
set and accordingly adjusting the training sample weights such that those
samples with high uncertainty are weighted down and those with low uncertainty
are weighted up. In this way, uncertain samples have a smaller contribution to
the objective function of the model's learning algorithm and exert less pull on
the decision boundary. In a real-world physical activity recognition case
study, the data presents many labeling challenges, and we show that this new
approach improves model performance and reduces model variance.
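The weighting idea described in the abstract can be illustrated compactly. The following is a minimal sketch, not the authors' implementation: it assumes pointwise label uncertainty is approximated by k-nearest-neighbor label disagreement (the paper's actual estimator may differ) and uses scikit-learn's sample_weight mechanism to down-weight uncertain boundary points.
```python
# Minimal sketch of uncertainty-based sample weighting (not the authors' code).
# Assumption: pointwise label uncertainty is approximated by the fraction of a
# sample's k nearest neighbors whose labels disagree with its own label.
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.linear_model import LogisticRegression

def knn_label_uncertainty(X, y, k=10):
    """Per-sample label uncertainty as neighborhood label disagreement, in [0, 1]."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)          # column 0 is the point itself
    neighbor_labels = y[idx[:, 1:]]    # shape (n_samples, k)
    return (neighbor_labels != y[:, None]).mean(axis=1)

def uncertainty_weights(uncertainty, floor=0.1):
    """Map uncertainty to training weights: confident samples pull harder."""
    return np.clip(1.0 - uncertainty, floor, 1.0)

# Usage: noisy labels near the class boundary receive smaller weights.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))
y = (X[:, 0] + 0.5 * rng.normal(size=500) > 0).astype(int)

weights = uncertainty_weights(knn_label_uncertainty(X, y, k=15))
clf = LogisticRegression().fit(X, y, sample_weight=weights)
```
Any monotone mapping from uncertainty to weight serves the same purpose: samples near a noisy class boundary contribute less to the objective and therefore exert less pull on the decision boundary.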
Related papers
- Learning Confidence Bounds for Classification with Imbalanced Data [42.690254618937196]
We propose a novel framework that leverages learning theory and concentration inequalities to overcome the shortcomings of traditional solutions.
Our method can effectively adapt to the varying degrees of imbalance across different classes, resulting in more robust and reliable classification outcomes.
arXiv Detail & Related papers (2024-07-16T16:02:27Z)
- Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach [102.0769560460338]
We develop a simple logits retargeting approach (LORT) that requires no prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
- Uncertainty-aware Sampling for Long-tailed Semi-supervised Learning [89.98353600316285]
We introduce uncertainty into the modeling process for pseudo-label sampling, taking into account that model performance on the tail classes varies over different training stages.
This approach allows the model to perceive the uncertainty of pseudo-labels at different training stages and thereby adaptively adjust the selection thresholds for different classes.
Compared to baselines such as FixMatch, UDTS improves accuracy by at least approximately 5.26%, 1.75%, 9.96%, and 1.28% on the natural scene image datasets.
arXiv Detail & Related papers (2024-01-09T08:59:39Z)
- Shrinking Class Space for Enhanced Certainty in Semi-Supervised Learning [59.44422468242455]
We propose a novel method dubbed ShrinkMatch to learn uncertain samples.
For each uncertain sample, it adaptively seeks a shrunk class space, which merely contains the original top-1 class.
We then impose a consistency regularization between a pair of strongly and weakly augmented samples in the shrunk space to strive for discriminative representations.
arXiv Detail & Related papers (2023-08-13T14:05:24Z)
- Semi-Supervised Deep Regression with Uncertainty Consistency and Variational Model Ensembling via Bayesian Neural Networks [31.67508478764597]
We propose a novel approach to semi-supervised regression, namely Uncertainty-Consistent Variational Model Ensembling (UCVME).
Our consistency loss significantly improves uncertainty estimates and allows higher quality pseudo-labels to be assigned greater importance under heteroscedastic regression.
Experiments show that our method outperforms state-of-the-art alternatives on different tasks and can be competitive with supervised methods that use full labels.
arXiv Detail & Related papers (2023-02-15T10:40:51Z)
- SoftMatch: Addressing the Quantity-Quality Trade-off in Semi-supervised Learning [101.86916775218403]
This paper revisits the popular pseudo-labeling methods via a unified sample weighting formulation.
We propose SoftMatch to overcome the trade-off by maintaining both high quantity and high quality of pseudo-labels during training.
In experiments, SoftMatch shows substantial improvements across a wide variety of benchmarks, including image, text, and imbalanced classification; a weighting sketch in this spirit appears after this list.
arXiv Detail & Related papers (2023-01-26T03:53:25Z)
- Post-hoc Uncertainty Learning using a Dirichlet Meta-Model [28.522673618527417]
We propose a novel Bayesian meta-model to augment pre-trained models with better uncertainty quantification abilities.
Our proposed method requires no additional training data and is flexible enough to quantify different uncertainties.
We demonstrate the flexibility and superior empirical performance of the proposed meta-model approach across a range of uncertainty quantification applications.
arXiv Detail & Related papers (2022-12-14T17:34:11Z)
- X-model: Improving Data Efficiency in Deep Learning with A Minimax Model [78.55482897452417]
We aim at improving data efficiency for both classification and regression setups in deep learning.
To combine the best of both worlds, we propose a novel X-model.
X-model plays a minimax game between the feature extractor and task-specific heads.
arXiv Detail & Related papers (2021-10-09T13:56:48Z)
- Unsupervised Embedding Learning from Uncertainty Momentum Modeling [37.674449317054716]
We propose a novel solution to explicitly model and explore the uncertainty of the given unlabeled learning samples.
We leverage this uncertainty momentum modeling during learning, which helps to handle outliers.
arXiv Detail & Related papers (2021-07-19T14:06:19Z)
- Minimax Active Learning [61.729667575374606]
Active learning aims to develop label-efficient algorithms by querying the most representative samples to be labeled by a human annotator.
Current active learning techniques either rely on model uncertainty to select the most uncertain samples or use clustering or reconstruction to choose the most diverse set of unlabeled examples.
We develop a semi-supervised minimax entropy-based active learning algorithm that leverages both uncertainty and diversity in an adversarial manner.
arXiv Detail & Related papers (2020-12-18T19:03:40Z)
- Identifying Wrongly Predicted Samples: A Method for Active Learning [6.976600214375139]
We propose a simple sample selection criterion that moves beyond uncertainty.
We show state-of-the-art results and better rates at identifying wrongly predicted samples.
arXiv Detail & Related papers (2020-10-14T09:00:42Z)
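The SoftMatch entry above is closest in spirit to this paper's sample weighting; as referenced there, the sketch below illustrates confidence-based soft weighting of pseudo-labels. It is a hedged illustration rather than the SoftMatch implementation: the truncated-Gaussian weight and the fixed mean/variance values are assumptions for the example.
```python
# Illustrative soft weighting of pseudo-labels by prediction confidence
# (in the spirit of SoftMatch; not the paper's implementation). The Gaussian
# form and the hard-coded mean/variance here are assumptions for the sketch.
import torch
import torch.nn.functional as F

def soft_pseudo_label_weights(logits, mu, var):
    """Weight each unlabeled sample by a truncated Gaussian over its confidence."""
    probs = F.softmax(logits.detach(), dim=-1)
    confidence, pseudo_labels = probs.max(dim=-1)
    # Weight is 1 above the mean confidence and decays smoothly below it.
    weights = torch.exp(-((confidence - mu).clamp(max=0.0) ** 2) / (2 * var))
    return weights, pseudo_labels

# Usage on one unlabeled batch: per-sample weighted cross-entropy.
logits_weak = torch.randn(32, 10)    # predictions on weakly augmented views
logits_strong = torch.randn(32, 10)  # predictions on strongly augmented views
mu, var = 0.7, 0.05                  # hypothetical running mean/variance of confidence

w, targets = soft_pseudo_label_weights(logits_weak, mu, var)
loss_u = (w * F.cross_entropy(logits_strong, targets, reduction="none")).mean()
```
Samples whose confidence falls below the (assumed) mean are smoothly down-weighted rather than discarded, which keeps pseudo-label quantity high while limiting the influence of low-quality targets.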