ProBoost: a Boosting Method for Probabilistic Classifiers
- URL: http://arxiv.org/abs/2209.01611v1
- Date: Sun, 4 Sep 2022 12:49:20 GMT
- Title: ProBoost: a Boosting Method for Probabilistic Classifiers
- Authors: Fábio Mendonça, Sheikh Shanawaz Mostafa, Fernando Morgado-Dias,
Antonio G. Ravelo-García, and Mário A. T. Figueiredo
- Abstract summary: ProBoost is a new boosting algorithm for probabilistic classifiers.
It uses the uncertainty of each training sample to determine the most challenging/uncertain ones.
It produces a sequence that progressively focuses on the samples found to have the highest uncertainty.
- Score: 55.970609838687864
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: ProBoost, a new boosting algorithm for probabilistic classifiers, is proposed
in this work. This algorithm uses the epistemic uncertainty of each training
sample to determine the most challenging/uncertain ones; the relevance of these
samples is then increased for the next weak learner, producing a sequence that
progressively focuses on the samples found to have the highest uncertainty. In
the end, the weak learners' outputs are combined into a weighted ensemble of
classifiers. Three methods are proposed to manipulate the training set:
undersampling, oversampling, and weighting the training samples according to
the uncertainty estimated by the weak learners. Furthermore, two approaches are
studied regarding the ensemble combination. The weak learner herein considered
is a standard convolutional neural network, and the probabilistic models
underlying the uncertainty estimation use either variational inference or Monte
Carlo dropout. The experimental evaluation carried out on MNIST benchmark
datasets shows that ProBoost yields a significant performance improvement. The
results are further highlighted by assessing the relative achievable
improvement, a metric proposed in this work, which shows that a model with only
four weak learners leads to an improvement exceeding 12% in this metric (for
either accuracy, sensitivity, or specificity), in comparison to the model
learned without ProBoost.
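The loop described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's code: it assumes predictive entropy over Monte Carlo dropout passes as the epistemic-uncertainty proxy, and an exponential, AdaBoost-style update for the "weighting" variant; the names `predictive_entropy` and `reweight` are hypothetical.

```python
import math

def predictive_entropy(mc_probs):
    """Entropy of the mean of T stochastic forward passes (e.g. MC dropout),
    a common proxy for predictive uncertainty.
    mc_probs[t][i] is the class-probability vector of sample i in pass t."""
    T = len(mc_probs)
    n_samples = len(mc_probs[0])
    n_classes = len(mc_probs[0][0])
    entropies = []
    for i in range(n_samples):
        mean = [sum(mc_probs[t][i][c] for t in range(T)) / T
                for c in range(n_classes)]
        entropies.append(-sum(p * math.log(p + 1e-12) for p in mean))
    return entropies

def reweight(weights, uncertainties, lr=1.0):
    """'Weighting' variant: increase the relevance of the most uncertain
    samples for the next weak learner, then renormalise."""
    u_max = max(uncertainties)
    new_w = [w * math.exp(lr * u / u_max)
             for w, u in zip(weights, uncertainties)]
    total = sum(new_w)
    return [w / total for w in new_w]
```

The undersampling and oversampling variants would instead drop the lowest-uncertainty samples or replicate the highest-uncertainty ones before the next round; the final ensemble then combines the weak learners' outputs with per-learner weights.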
Related papers
- Sample-Efficient Agnostic Boosting [19.15484761265653]
Empirical Risk Minimization (ERM) outstrips the agnostic boosting methodology, being quadratically more sample efficient than all known boosting algorithms.
A key feature of our algorithm is that it leverages the ability to reuse samples across multiple rounds of boosting, while guaranteeing a generalization error strictly better than those obtained by blackbox applications of uniform convergence arguments.
arXiv Detail & Related papers (2024-10-31T04:50:29Z)
- Bayesian Estimate of Mean Proper Scores for Diversity-Enhanced Active Learning [6.704927458661697]
Expected Loss Reduction (ELR) focuses on a Bayesian estimate of the reduction in classification error, and more general costs fit in the same framework.
We propose Bayesian Estimate of Mean Proper Scores (BEMPS) to estimate the increase in strictly proper scores.
We show that BEMPS yields robust acquisition functions and well-calibrated classifiers, and consistently outperforms the others tested.
arXiv Detail & Related papers (2023-12-15T11:02:17Z)
- Shrinking Class Space for Enhanced Certainty in Semi-Supervised Learning [59.44422468242455]
We propose a novel method dubbed ShrinkMatch to learn uncertain samples.
For each uncertain sample, it adaptively seeks a shrunk class space, which merely contains the original top-1 class.
We then impose a consistency regularization between a pair of strongly and weakly augmented samples in the shrunk space to strive for discriminative representations.
arXiv Detail & Related papers (2023-08-13T14:05:24Z)
- Confidence-aware Training of Smoothed Classifiers for Certified Robustness [75.95332266383417]
We use "accuracy under Gaussian noise" as an easy-to-compute proxy of adversarial robustness for an input.
Our experiments show that the proposed method consistently exhibits improved certified robustness upon state-of-the-art training methods.
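The "accuracy under Gaussian noise" proxy above can be estimated by simple Monte Carlo sampling. A minimal sketch, assuming a generic `classify` callable (a hypothetical stand-in for any classifier, not the paper's training procedure):

```python
import random

def accuracy_under_gaussian_noise(classify, x, label, sigma=0.25,
                                  n_samples=500, seed=0):
    """Monte Carlo estimate of the fraction of Gaussian-perturbed copies
    x + N(0, sigma^2 I) that the classifier still assigns to `label`;
    an easy-to-compute proxy for robustness at input x."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        hits += classify(noisy) == label
    return hits / n_samples
```

Inputs far from the decision boundary keep their label under almost all perturbations, while inputs near the boundary flip often, which is what makes this a usable per-input robustness signal.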
arXiv Detail & Related papers (2022-12-18T03:57:12Z)
- Pixel is All You Need: Adversarial Trajectory-Ensemble Active Learning for Salient Object Detection [40.97103355628434]
It is unclear whether a saliency model trained with weakly-supervised data can achieve the equivalent performance of its fully-supervised version.
We propose a novel yet effective adversarial trajectory-ensemble active learning (ATAL) method.
Experimental results show that ATAL can find such a point-labeled dataset: a saliency model trained on it obtains 97%-99% of the performance of its fully-supervised version with only ten annotated points per image.
arXiv Detail & Related papers (2022-12-13T11:18:08Z)
- Uncertainty Estimation for Language Reward Models [5.33024001730262]
Language models can learn a range of capabilities from unsupervised training on text corpora.
It is often easier for humans to choose between options than to provide labeled data, and prior work has achieved state-of-the-art performance by training a reward model from such preference comparisons.
We seek to address these problems via uncertainty estimation, which can improve sample efficiency and robustness using active learning and risk-averse reinforcement learning.
arXiv Detail & Related papers (2022-03-14T20:13:21Z)
- SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness [61.212486108346695]
We propose a training scheme, coined SmoothMix, to control the robustness of smoothed classifiers via self-mixup.
The proposed procedure effectively identifies over-confident, near off-class samples as a cause of limited robustness.
Our experimental results demonstrate that the proposed method can significantly improve the certified $\ell_2$-robustness of smoothed classifiers.
arXiv Detail & Related papers (2021-11-17T18:20:59Z)
- Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle.
In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize.
Our approach is agnostic to class labels from the training set, which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z)
- Efficient Ensemble Model Generation for Uncertainty Estimation with Bayesian Approximation in Segmentation [74.06904875527556]
We propose a generic and efficient segmentation framework to construct ensemble segmentation models.
In the proposed method, ensemble models can be efficiently generated by using the layer selection method.
We also devise a new pixel-wise uncertainty loss, which improves the predictive performance.
arXiv Detail & Related papers (2020-05-21T16:08:38Z)
- Minority Class Oversampling for Tabular Data with Deep Generative Models [4.976007156860967]
We study the ability of deep generative models to provide realistic samples that improve performance on imbalanced classification tasks via oversampling.
Our experiments show that the choice of sampling method does not affect quality, but runtime varies widely.
We also observe that the improvements in terms of performance metric, while shown to be significant, often are minor in absolute terms.
arXiv Detail & Related papers (2020-05-07T21:35:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.