Bagging Improves Generalization Exponentially
- URL: http://arxiv.org/abs/2405.14741v2
- Date: Wed, 29 May 2024 05:27:04 GMT
- Title: Bagging Improves Generalization Exponentially
- Authors: Huajie Qian, Donghao Ying, Henry Lam, Wotao Yin
- Abstract summary: Bagging is a popular ensemble technique to improve the accuracy of machine learning models.
We show how bagging can substantially improve generalization performance in a range of examples involving heavy-tailed data that suffer from intrinsically slow rates.
- Score: 27.941595142117443
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Bagging is a popular ensemble technique to improve the accuracy of machine learning models. It hinges on the well-established rationale that, by repeatedly retraining on resampled data, the aggregated model exhibits lower variance and hence higher stability, especially for discontinuous base learners. In this paper, we provide a new perspective on bagging: By suitably aggregating the base learners at the parametrization level instead of the output level, bagging improves generalization performance exponentially, a strength that is significantly more powerful than variance reduction. More precisely, we show that for general stochastic optimization problems that suffer from slowly (i.e., polynomially) decaying generalization errors, bagging can effectively reduce these errors to an exponential decay. Moreover, this power of bagging is agnostic to the solution scheme, including common empirical risk minimization, distributionally robust optimization, and various regularizations. We demonstrate how bagging can substantially improve generalization performance in a range of examples involving heavy-tailed data that suffer from intrinsically slow rates.
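To make the parametrization-level aggregation concrete, here is a minimal sketch on a toy discrete stochastic program (a newsvendor-style order-quantity problem with heavy-tailed demand). It is an illustration in the spirit of the abstract, not the paper's exact procedure: the cost function, decision grid, bag size, and majority-vote aggregation below are all assumptions chosen for demonstration.
```python
# Minimal sketch (not the paper's exact algorithm): bagging by majority vote
# at the solution/parametrization level for a discrete stochastic program.
# Toy problem: newsvendor-style order quantity chosen from a grid,
# heavy-tailed demand. All names and settings below are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def empirical_cost(q, demand, price=2.0, cost=1.0):
    """Average newsvendor cost of ordering q units against observed demand."""
    return np.mean(cost * q - price * np.minimum(q, demand))

def erm_solution(demand, grid):
    """Empirical risk minimization: pick the grid point with lowest average cost."""
    costs = [empirical_cost(q, demand) for q in grid]
    return grid[int(np.argmin(costs))]

def bagged_solution(demand, grid, n_bags=50, bag_frac=0.5):
    """Resample the data, solve ERM on each bag, and aggregate the *solutions*
    (not the outputs) by majority vote across bags."""
    k = max(1, int(bag_frac * len(demand)))
    votes = []
    for _ in range(n_bags):
        bag = rng.choice(demand, size=k, replace=True)
        votes.append(erm_solution(bag, grid))
    values, counts = np.unique(votes, return_counts=True)
    return values[int(np.argmax(counts))]

# Heavy-tailed (Pareto) demand sample and a candidate decision grid.
demand = (rng.pareto(1.5, size=200) + 1.0) * 10.0
grid = np.arange(0, 101, 5)

print("plain ERM decision:        ", erm_solution(demand, grid))
print("bagged (majority) decision:", bagged_solution(demand, grid))
```
The point of the sketch is that each bag returns a full solution (a decision on the grid), and aggregation happens by voting over those solutions rather than by averaging the bags' predicted costs.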
Related papers
- Achieving More with Less: A Tensor-Optimization-Powered Ensemble Method [53.170053108447455]
Ensemble learning is a method that leverages weak learners to produce a strong learner.
We design a smooth and convex objective function that leverages the concept of margin, making the strong learner more discriminative.
We then compare our algorithm with random forests of ten times the size and other classical methods across numerous datasets.
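As a rough illustration of combining weak learners through a smooth, convex margin-based objective, the sketch below fits ensemble weights by gradient descent on a logistic margin loss. This is a generic stand-in, not the paper's tensor-optimization formulation; the loss, regularizer, and synthetic weak learners are assumptions.
```python
# Illustrative sketch: fit convex-combination weights for weak learners by
# minimizing a smooth, convex loss of the ensemble margin y * (H @ w).
# H[i, j] is weak learner j's prediction in {-1, +1} for sample i; y[i] in {-1, +1}.
import numpy as np

def fit_ensemble_weights(H, y, lr=0.1, steps=500, l2=1e-3):
    """Gradient descent on mean(log(1 + exp(-y * (H @ w)))) + (l2/2) * ||w||^2."""
    n, m = H.shape
    w = np.zeros(m)
    for _ in range(steps):
        margin = y * (H @ w)
        grad = -(H * y[:, None]).T @ (1.0 / (1.0 + np.exp(margin))) / n + l2 * w
        w -= lr * grad
    return w

# Tiny usage example with random, weakly informative "weak learners".
rng = np.random.default_rng(1)
y = rng.choice([-1, 1], size=200)
H = np.sign(y[:, None] * rng.normal(0.3, 1.0, size=(200, 5)))
w = fit_ensemble_weights(H, y)
print("ensemble accuracy:", np.mean(np.sign(H @ w) == y))
```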
arXiv Detail & Related papers (2024-08-06T03:42:38Z) - The Group Robustness is in the Details: Revisiting Finetuning under Spurious Correlations [8.844894807922902]
Modern machine learning models are prone to over-reliance on spurious correlations.
In this paper, we identify surprising and nuanced behavior of finetuned models on worst-group accuracy.
Our results show more nuanced interactions of modern finetuned models with group robustness than was previously known.
arXiv Detail & Related papers (2024-07-19T00:34:03Z) - A replica analysis of under-bagging [3.1274367448459253]
Under-bagging (UB) is a popular ensemble learning method for training classifiers on imbalanced data.
Bagging is a natural way to reduce the increased variance caused by the smaller sample size that under-sampling induces.
It has recently been pointed out that in generalized linear models, naive bagging, which does not consider the class imbalance structure, and ridge regularization can produce the same results.
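A minimal under-bagging sketch, assuming binary labels in {0, 1} and a logistic-regression base learner: each bag under-samples the majority class down to the minority-class size, one classifier is fit per bag, and the bags' predicted probabilities are averaged. The bag count and base learner are illustrative choices, not those analyzed in the paper.
```python
# Illustrative under-bagging: balanced bags via majority-class under-sampling,
# one base classifier per bag, probability averaging at prediction time.
import numpy as np
from sklearn.linear_model import LogisticRegression

def under_bagging_fit(X, y, n_bags=25, seed=0):
    """Train one classifier per balanced bag; y is assumed to be in {0, 1}."""
    rng = np.random.default_rng(seed)
    minority, majority = (1, 0) if (y == 1).sum() < (y == 0).sum() else (0, 1)
    min_idx = np.flatnonzero(y == minority)
    maj_idx = np.flatnonzero(y == majority)
    models = []
    for _ in range(n_bags):
        sampled_maj = rng.choice(maj_idx, size=len(min_idx), replace=False)
        idx = np.concatenate([min_idx, sampled_maj])
        models.append(LogisticRegression().fit(X[idx], y[idx]))
    return models

def under_bagging_predict(models, X):
    """Average the positive-class probabilities over bags and threshold at 0.5."""
    probs = np.mean([m.predict_proba(X)[:, 1] for m in models], axis=0)
    return (probs >= 0.5).astype(int)
```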
arXiv Detail & Related papers (2024-04-15T13:31:31Z) - Reviving Undersampling for Long-Tailed Learning [16.054442161144603]
We aim to enhance the accuracy of the worst-performing categories and utilize the harmonic mean and geometric mean to assess the model's performance.
We devise a straightforward model ensemble strategy, which does not incur any additional overhead and improves both the harmonic and geometric means.
We validate the effectiveness of our approach on widely utilized benchmark datasets for long-tailed learning.
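The harmonic and geometric means of per-class accuracy referred to above can be computed as below. These helpers are a plain illustration (the class handling and smoothing constants are assumptions), not the paper's evaluation code; both means collapse toward zero when any category is predicted poorly, which is why they emphasize the worst-performing classes.
```python
# Illustrative metrics: per-class accuracies summarized by harmonic and
# geometric means, which heavily penalize categories with near-zero accuracy.
import numpy as np

def per_class_accuracy(y_true, y_pred, n_classes):
    return np.array([
        np.mean(y_pred[y_true == c] == c) if np.any(y_true == c) else np.nan
        for c in range(n_classes)
    ])

def harmonic_mean(acc, eps=1e-12):
    acc = acc[~np.isnan(acc)]
    return float(len(acc) / np.sum(1.0 / (acc + eps)))

def geometric_mean(acc, eps=1e-12):
    acc = acc[~np.isnan(acc)]
    return float(np.exp(np.mean(np.log(acc + eps))))
```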
arXiv Detail & Related papers (2024-01-30T08:15:13Z) - Toward Understanding Generative Data Augmentation [16.204251285425478]
We show that generative data augmentation can enjoy a faster learning rate when the order of the divergence term is $o(\max(\log(m)\beta_m, 1/\sqrt{m}))$.
We prove that in both cases, though generative data augmentation does not enjoy a faster learning rate, it can improve the learning guarantees at a constant level when the train set is small.
arXiv Detail & Related papers (2023-05-27T13:46:08Z) - When are ensembles really effective? [49.37269057899679]
We study the question of when ensembling yields significant performance improvements in classification tasks.
We show that ensembling improves performance significantly whenever the disagreement rate is large relative to the average error rate.
We identify practical scenarios where ensembling does and does not result in large performance improvements.
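The two quantities in that criterion, the models' average error rate and their average pairwise disagreement rate, can be estimated directly from held-out predictions, as in the sketch below; the precise theoretical condition relating them is stated in the paper.
```python
# Empirical estimates of the quantities in the ensembling criterion above.
import itertools
import numpy as np

def average_error_rate(preds, y):
    """preds: array-like of shape (n_models, n_samples) of hard labels."""
    return float(np.mean([np.mean(p != y) for p in preds]))

def disagreement_rate(preds):
    """Average fraction of samples on which a pair of models disagrees
    (requires at least two models)."""
    pairs = list(itertools.combinations(range(len(preds)), 2))
    return float(np.mean([np.mean(preds[i] != preds[j]) for i, j in pairs]))
```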
arXiv Detail & Related papers (2023-05-21T01:36:25Z) - ProBoost: a Boosting Method for Probabilistic Classifiers [55.970609838687864]
ProBoost is a new boosting algorithm for probabilistic classifiers.
It uses the uncertainty of each training sample to determine the most challenging/uncertain ones.
It produces a sequence that progressively focuses on the samples found to have the highest uncertainty.
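A rough sketch of the idea described above, not the actual ProBoost algorithm: each round fits a probabilistic classifier, scores training samples by predictive entropy, and resamples so the next round focuses on the most uncertain samples. The base learner, number of rounds, and resampling scheme are assumptions.
```python
# Uncertainty-focused boosting sketch (illustrative stand-in, not ProBoost).
import numpy as np
from sklearn.linear_model import LogisticRegression

def uncertainty_boosting(X, y, n_rounds=5, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    weights = np.full(n, 1.0 / n)
    models = []
    for _ in range(n_rounds):
        # Resample training data according to current weights
        # (more weight on the currently uncertain points).
        idx = rng.choice(n, size=n, replace=True, p=weights)
        model = LogisticRegression().fit(X[idx], y[idx])
        models.append(model)
        proba = model.predict_proba(X)
        # Predictive entropy as the per-sample uncertainty score.
        ent = -np.sum(proba * np.log(np.clip(proba, 1e-12, 1.0)), axis=1)
        weights = (ent + 1e-12) / np.sum(ent + 1e-12)
    return models

def predict_proba(models, X):
    """Average the class probabilities over the boosted sequence."""
    return np.mean([m.predict_proba(X) for m in models], axis=0)
```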
arXiv Detail & Related papers (2022-09-04T12:49:20Z) - Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
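A hedged sketch of the general recipe, with details that differ from the paper: the per-example weights come from a parametric likelihood-ratio network rather than being chosen freely, and training alternates between the adversary (maximizing the reweighted loss under a KL-style penalty) and the task model (minimizing it). The architectures, penalty, and optimizer settings below are illustrative assumptions.
```python
# Sketch of DRO with a parametric likelihood-ratio adversary (illustrative).
import torch
import torch.nn as nn

model = nn.Linear(10, 2)                                     # task model (toy)
ratio_net = nn.Sequential(nn.Linear(10, 1), nn.Softplus())   # parametric likelihood ratio
opt_model = torch.optim.SGD(model.parameters(), lr=0.1)
opt_ratio = torch.optim.SGD(ratio_net.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss(reduction="none")

def dro_step(x, y, kl_penalty=1.0):
    # Normalize ratios so they average to 1 over the batch (a proper reweighting).
    r = ratio_net(x).squeeze(-1)
    w = r / r.mean()
    losses = loss_fn(model(x), y)

    # Adversary: maximize the weighted loss minus a KL-style penalty toward uniform weights.
    adv_obj = (w * losses.detach()).mean() - kl_penalty * (w * torch.log(w + 1e-12)).mean()
    opt_ratio.zero_grad()
    (-adv_obj).backward()
    opt_ratio.step()

    # Task model: minimize the loss reweighted by the (frozen, updated) adversary.
    with torch.no_grad():
        w_frozen = ratio_net(x).squeeze(-1)
        w_frozen = w_frozen / w_frozen.mean()
    model_obj = (w_frozen * losses).mean()
    opt_model.zero_grad()
    model_obj.backward()
    opt_model.step()

# Example call with a random batch (illustrative):
# x, y = torch.randn(32, 10), torch.randint(0, 2, (32,))
# dro_step(x, y)
```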
arXiv Detail & Related papers (2022-04-13T12:43:12Z) - Class-Incremental Learning with Strong Pre-trained Models [97.84755144148535]
Class-incremental learning (CIL) has been widely studied under the setting of starting from a small number of classes (base classes).
We explore an understudied real-world setting of CIL that starts with a strong model pre-trained on a large number of base classes.
Our proposed method is robust and generalizes to all analyzed CIL settings.
arXiv Detail & Related papers (2022-04-07T17:58:07Z) - Revisiting Consistency Regularization for Semi-Supervised Learning [80.28461584135967]
We propose an improved consistency regularization framework by a simple yet effective technique, FeatDistLoss.
Experimental results show that our model defines a new state of the art for various datasets and settings.
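A generic skeleton of consistency regularization with an added feature-level distance term, to make the summary concrete. The `encoder` and `classifier` modules, the pseudo-labeling step, and how (and with what sign and weight) the feature term enters the total objective are assumptions here; the actual FeatDistLoss is defined in the paper.
```python
# Generic consistency-regularization skeleton with a feature-level term.
import torch
import torch.nn.functional as F

def consistency_losses(encoder, classifier, x_weak, x_strong):
    feat_w, feat_s = encoder(x_weak), encoder(x_strong)
    logits_w, logits_s = classifier(feat_w), classifier(feat_s)

    # Standard consistency: strong-augmentation predictions should match the
    # pseudo-label derived from the weak-augmentation predictions.
    pseudo = logits_w.detach().softmax(dim=-1)
    cons = F.cross_entropy(logits_s, pseudo.argmax(dim=-1))

    # Feature-level distance between the two views; how it is combined with the
    # consistency term (weight and sign) is a design choice made by the paper.
    feat_dist = F.mse_loss(feat_w, feat_s)
    return cons, feat_dist
```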
arXiv Detail & Related papers (2021-12-10T20:46:13Z) - Advanced Dropout: A Model-free Methodology for Bayesian Dropout Optimization [62.8384110757689]
Overfitting ubiquitously exists in real-world applications of deep neural networks (DNNs).
The advanced dropout technique applies a model-free and easily implemented distribution with a parametric prior, and adaptively adjusts the dropout rate.
We evaluate the effectiveness of the advanced dropout against nine dropout techniques on seven computer vision datasets.
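As a loose illustration of adaptively adjusting the dropout rate during training (not the paper's advanced dropout, which places a parametric prior on the dropout distribution), the sketch below learns the rate through a continuous relaxation of the Bernoulli mask.
```python
# Illustrative dropout layer with a learnable, adaptively trained rate.
import torch
import torch.nn as nn

class LearnableDropout(nn.Module):
    def __init__(self, init_rate=0.5, temperature=0.1):
        super().__init__()
        # Parameterize the dropout rate through a logit so it stays in (0, 1).
        self.rate_logit = nn.Parameter(torch.logit(torch.tensor(init_rate)))
        self.temperature = temperature

    @property
    def rate(self):
        return torch.sigmoid(self.rate_logit)

    def forward(self, x):
        if not self.training:
            return x
        # Relaxed Bernoulli ("concrete") keep-mask so gradients flow to the rate.
        u = torch.rand_like(x).clamp(1e-6, 1 - 1e-6)
        logits = (torch.log(u) - torch.log(1 - u) + torch.logit(1 - self.rate)) / self.temperature
        keep = torch.sigmoid(logits)
        return x * keep / (1 - self.rate + 1e-12)

# Usage (illustrative): layer = LearnableDropout(); out = layer(torch.randn(8, 16))
```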
arXiv Detail & Related papers (2020-10-11T13:19:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.