Related papers: Subsampled Ensemble Can Improve Generalization Tail Exponentially

URL: http://arxiv.org/abs/2405.14741v3
Date: Thu, 03 Oct 2024 19:23:09 GMT
Title: Subsampled Ensemble Can Improve Generalization Tail Exponentially
Authors: Huajie Qian, Donghao Ying, Henry Lam, Wotao Yin
Abstract summary: Ensemble learning is a popular technique to improve the accuracy of machine learning models. We provide a new perspective on ensembling by selecting the best model trained on subsamples via majority voting. We demonstrate how our ensemble methods can substantially improve out-of-sample performances in a range of examples involving heavy-tailed data or intrinsically slow rates.
Score: 27.941595142117443
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: Ensemble learning is a popular technique to improve the accuracy of machine learning models. It hinges on the rationale that aggregating multiple weak models can lead to better models with lower variance and hence higher stability, especially for discontinuous base learners. In this paper, we provide a new perspective on ensembling. By selecting the best model trained on subsamples via majority voting, we can attain exponentially decaying tails for the excess risk, even if the base learner suffers from slow (i.e., polynomial) decay rates. This tail enhancement power of ensembling is agnostic to the underlying base learner and is stronger than variance reduction in the sense of exhibiting rate improvement. We demonstrate how our ensemble methods can substantially improve out-of-sample performances in a range of examples involving heavy-tailed data or intrinsically slow rates. Code for the proposed methods is available at https://github.com/mickeyhqian/VoteEnsemble.
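The core idea in the abstract, training a base learner on several subsamples and keeping the model that wins a majority vote, can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation and not the API of the VoteEnsemble repository: it assumes the base learner returns a hashable candidate from a small discrete set, and the names `majority_vote_ensemble` and `pick_lower_mean_cost` are hypothetical.

```python
import random
from collections import Counter


def majority_vote_ensemble(data, base_learner, num_subsamples, subsample_size, seed=0):
    """Run base_learner on random subsamples and return the most frequent output.

    Assumption: base_learner returns a hashable candidate (e.g. a label or an
    index into a finite set of models), so identical outputs can be counted.
    """
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(num_subsamples):
        # Draw a subsample without replacement from the full data set.
        subsample = rng.sample(data, subsample_size)
        votes[base_learner(subsample)] += 1
    # most_common breaks ties in favour of the candidate seen first.
    winner, _ = votes.most_common(1)[0]
    return winner, votes


def pick_lower_mean_cost(sample):
    """Hypothetical base learner: choose the option with the lower average cost."""
    mean_a = sum(a for a, _ in sample) / len(sample)
    mean_b = sum(b for _, b in sample) / len(sample)
    return "a" if mean_a <= mean_b else "b"


if __name__ == "__main__":
    # Heavy-tailed toy data: each row holds the costs of options "a" and "b".
    rng = random.Random(42)
    data = [(rng.paretovariate(2.5), 1.2 * rng.paretovariate(2.5)) for _ in range(500)]
    winner, votes = majority_vote_ensemble(
        data, pick_lower_mean_cost, num_subsamples=51, subsample_size=50
    )
    print("selected option:", winner)
    print("vote counts:", dict(votes))
```

An odd number of subsamples is used here only to make ties between two candidates impossible; the number and size of the subsamples are arbitrary choices for the illustration.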

Related papers

Achieving More with Less: A Tensor-Optimization-Powered Ensemble Method [53.170053108447455]
Ensemble learning is a method that leverages weak learners to produce a strong learner. We design a smooth and convex objective function that leverages the concept of margin, making the strong learner more discriminative. We then compare our algorithm with random forests of ten times the size and other classical methods across numerous datasets.
arXiv Detail & Related papers (2024-08-06T03:42:38Z)
The Group Robustness is in the Details: Revisiting Finetuning under Spurious Correlations [8.844894807922902]
Modern machine learning models are prone to over-reliance on spurious correlations. In this paper, we identify surprising and nuanced behavior of finetuned models on worst-group accuracy. Our results show more nuanced interactions of modern finetuned models with group robustness than was previously known.
arXiv Detail & Related papers (2024-07-19T00:34:03Z)
A replica analysis of under-bagging [3.1274367448459253]
Under-bagging (UB) is a popular ensemble learning method for training classifiers on imbalanced data. Using bagging to reduce the increased variance caused by the reduction in sample size due to under-sampling is a natural approach. It has recently been pointed out that in generalized linear models, naive bagging, which does not consider the class imbalance structure, and ridge regularization can produce the same results.
arXiv Detail & Related papers (2024-04-15T13:31:31Z)
Reviving Undersampling for Long-Tailed Learning [16.054442161144603]
We aim to enhance the accuracy of the worst-performing categories and utilize the harmonic mean and geometric mean to assess the model's performance. We devise a straightforward model ensemble strategy, which does not result in any additional overhead and achieves improved harmonic and geometric mean. We validate the effectiveness of our approach on widely utilized benchmark datasets for long-tailed learning.
arXiv Detail & Related papers (2024-01-30T08:15:13Z)
Toward Understanding Generative Data Augmentation [16.204251285425478]
We show that generative data augmentation can enjoy a faster learning rate when the order of divergence term is $o\left(\max\left(\log(m)\beta_m, 1/\sqrt{m}\right)\right)$. We prove that in both cases, though generative data augmentation does not enjoy a faster learning rate, it can improve the learning guarantees at a constant level when the train set is small.
arXiv Detail & Related papers (2023-05-27T13:46:08Z)
When are ensembles really effective? [49.37269057899679]
We study the question of when ensembling yields significant performance improvements in classification tasks. We show that ensembling improves performance significantly whenever the disagreement rate is large relative to the average error rate. We identify practical scenarios where ensembling does and does not result in large performance improvements.
arXiv Detail & Related papers (2023-05-21T01:36:25Z)
ProBoost: a Boosting Method for Probabilistic Classifiers [55.970609838687864]
ProBoost is a new boosting algorithm for probabilistic classifiers. It uses the uncertainty of each training sample to determine the most challenging/uncertain ones. It produces a sequence that progressively focuses on the samples found to have the highest uncertainty.
arXiv Detail & Related papers (2022-09-04T12:49:20Z)
Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios. We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
Class-Incremental Learning with Strong Pre-trained Models [97.84755144148535]
Class-incremental learning (CIL) has been widely studied under the setting of starting from a small number of classes (base classes). We explore an understudied real-world setting of CIL that starts with a strong model pre-trained on a large number of base classes. Our proposed method is robust and generalizes to all analyzed CIL settings.
arXiv Detail & Related papers (2022-04-07T17:58:07Z)
Revisiting Consistency Regularization for Semi-Supervised Learning [80.28461584135967]
We propose an improved consistency regularization framework by a simple yet effective technique, FeatDistLoss. Experimental results show that our model defines a new state of the art for various datasets and settings.
arXiv Detail & Related papers (2021-12-10T20:46:13Z)
Advanced Dropout: A Model-free Methodology for Bayesian Dropout Optimization [62.8384110757689]
Overfitting ubiquitously exists in real-world applications of deep neural networks (DNNs). The advanced dropout technique applies a model-free and easily implemented distribution with parametric prior, and adaptively adjusts dropout rate. We evaluate the effectiveness of the advanced dropout against nine dropout techniques on seven computer vision datasets.
arXiv Detail & Related papers (2020-10-11T13:19:58Z)

This list is automatically generated from the titles and abstracts of the papers on this site.