Ensembling over Classifiers: a Bias-Variance Perspective
- URL: http://arxiv.org/abs/2206.10566v1
- Date: Tue, 21 Jun 2022 17:46:35 GMT
- Title: Ensembling over Classifiers: a Bias-Variance Perspective
- Authors: Neha Gupta, Jamie Smith, Ben Adlam, Zelda Mariet
- Abstract summary: We build upon the extension to the bias-variance decomposition by Pfau (2013) in order to gain crucial insights into the behavior of ensembles of classifiers.
We show that conditional estimates necessarily incur an irreducible error.
Empirically, standard ensembling reduces the bias, leading us to hypothesize that ensembles of classifiers may perform well in part because of this unexpected reduction.
- Score: 13.006468721874372
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Ensembles are a straightforward, remarkably effective method for improving
the accuracy, calibration, and robustness of models on classification tasks;
yet, the reasons that underlie their success remain an active area of research.
We build upon the extension to the bias-variance decomposition by Pfau (2013)
in order to gain crucial insights into the behavior of ensembles of
classifiers. Introducing a dual reparameterization of the bias-variance
tradeoff, we first derive generalized laws of total expectation and variance
for nonsymmetric losses typical of classification tasks. Comparing conditional
and bootstrap bias/variance estimates, we then show that conditional estimates
necessarily incur an irreducible error. Next, we show that ensembling in dual
space reduces the variance and leaves the bias unchanged, whereas standard
ensembling can arbitrarily affect the bias. Empirically, standard ensembling
reduces the bias, leading us to hypothesize that ensembles of classifiers may
perform well in part because of this unexpected reduction. We conclude with an
empirical analysis of recent deep learning methods that ensemble over
hyperparameters, revealing that these techniques indeed favor bias reduction.
This suggests that, contrary to classical wisdom, targeting bias reduction may
be a promising direction for classifier ensembles.
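As a concrete illustration of the two combination rules the abstract contrasts, here is a minimal sketch assuming that, for the log loss, dual-space ensembling amounts to averaging log-probabilities (a renormalized geometric mean), while standard ensembling averages probabilities; the function names and toy predictions are illustrative, not from the paper.
```python
import numpy as np

def standard_ensemble(probs):
    """Standard ensembling: arithmetic mean in probability space."""
    return probs.mean(axis=0)

def dual_ensemble(probs):
    """Assumed dual-space rule for the log loss: average log-probabilities,
    i.e. a renormalized geometric mean of the member predictions."""
    p = np.exp(np.log(probs).mean(axis=0))
    return p / p.sum()

# Predictive distributions of three hypothetical ensemble members (3 classes).
probs = np.array([[0.7, 0.2, 0.1],
                  [0.6, 0.3, 0.1],
                  [0.2, 0.5, 0.3]])
print(standard_ensemble(probs))  # [0.5, 0.333..., 0.166...]
print(dual_ensemble(probs))      # a different distribution in general
```
Per the abstract, the dual rule provably reduces variance while leaving the (dual) bias unchanged, whereas the probability-space rule can move the bias in either direction.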
Related papers
- Fine-Grained Dynamic Framework for Bias-Variance Joint Optimization on Data Missing Not at Random [2.8165314121189247]
In most practical applications such as recommendation systems, display advertising, and so forth, the collected data often contains missing values.
We develop a systematic fine-grained dynamic learning framework to jointly optimize bias and variance; a generic inverse-propensity sketch for such missing-not-at-random data follows below.
arXiv Detail & Related papers (2024-05-24T10:07:09Z)
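For context, a minimal sketch of the inverse-propensity-scoring (IPS) estimator that bias-variance analyses on missing-not-at-random data typically start from; this is a generic baseline with made-up data, not the paper's framework.
```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.integers(1, 6, size=100_000).astype(float)  # true ratings 1..5
p_obs = np.clip(0.1 + 0.15 * (y - 1), 0.05, 0.95)   # MNAR: high ratings
o = rng.random(y.shape) < p_obs                     # are observed more often

naive = y[o].mean()              # biased: overweights high ratings
ips = (o * y / p_obs).mean()     # unbiased given the propensities
print(f"truth={y.mean():.3f}  naive={naive:.3f}  ips={ips:.3f}")
```
IPS trades the naive estimator's bias for extra variance, which are exactly the two quantities such a framework optimizes jointly.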
- Distributionally Robust Optimization and Invariant Representation Learning for Addressing Subgroup Underrepresentation: Mechanisms and Limitations [10.4894578909708]
Spurious correlation caused by subgroup underrepresentation has received increasing attention as a source of bias that can be perpetuated by DNNs.
We take the first step to better understand and improve the mechanisms for debiasing spurious correlation due to subgroup underrepresentation in medical image classification; a minimal reweighting sketch follows below.
arXiv Detail & Related papers (2023-08-12T01:55:58Z)
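A common baseline underlying both DRO-style objectives and reweighting for subgroup underrepresentation is inverse-frequency sample weighting; the sketch below shows only that generic baseline, not the paper's mechanism.
```python
import numpy as np

def group_weights(group_ids):
    """Inverse-frequency weights: each subgroup contributes equally to a
    weighted loss regardless of how many samples it has."""
    groups, counts = np.unique(group_ids, return_counts=True)
    per_group = {g: len(group_ids) / (len(groups) * c)
                 for g, c in zip(groups, counts)}
    return np.array([per_group[g] for g in group_ids])

# 90/10 imbalance: minority samples get 9x the weight of majority samples.
gids = np.array([0] * 90 + [1] * 10)
w = group_weights(gids)
print(w[0], w[-1])  # ~0.556 for the majority, 5.0 for the minority
```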
We show that "benign overfitting" in which models generalize well despite interpolating might not favorably extend to settings in which robustness or fairness are desirable.
We propose and analyze an algorithm that successfully learns a non-interpolating classifier that is provably invariant.
arXiv Detail & Related papers (2022-11-28T19:17:31Z)
- Self-supervised debiasing using low rank regularization [59.84695042540525]
Spurious correlations can cause strong biases in deep neural networks, impairing generalization ability.
We propose a self-supervised debiasing framework potentially compatible with unlabeled samples.
Remarkably, the proposed debiasing framework significantly improves the generalization performance of self-supervised learning baselines; a generic low-rank penalty is sketched below.
arXiv Detail & Related papers (2022-10-11T08:26:19Z)
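The standard differentiable handle on rank is the nuclear norm (the sum of singular values); the snippet below computes only that generic quantity, and how the paper wires rank regularization into its self-supervised pipeline is described in the full text.
```python
import numpy as np

def nuclear_norm(feats):
    """Sum of singular values of a (batch x dim) feature matrix; the usual
    convex surrogate for rank in low-rank regularization."""
    return np.linalg.svd(feats, compute_uv=False).sum()

rng = np.random.default_rng(0)
low_rank = rng.normal(size=(64, 1)) @ rng.normal(size=(1, 32))  # rank 1
full_rank = rng.normal(size=(64, 32))
# Similar overall (Frobenius) scale, very different nuclear norms.
print(nuclear_norm(low_rank), nuclear_norm(full_rank))
```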
"Benign overfitting", where classifiers memorize noisy training data yet still achieve a good generalization performance, has drawn great attention in the machine learning community.
We show that benign overfitting indeed occurs in adversarial training, a principled approach to defend against adversarial examples.
arXiv Detail & Related papers (2021-12-31T00:27:31Z)
- Selective Regression Under Fairness Criteria [30.672082160544996]
In some cases, the performance on the minority group can decrease as we reduce the coverage.
We show that such unwanted behavior can be avoided if we can construct features satisfying the sufficiency criterion; a minimal selective-prediction evaluation is sketched below.
arXiv Detail & Related papers (2021-10-28T19:05:12Z)
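To make the coverage effect concrete, here is a generic selective-prediction evaluation (not the paper's method): abstain on the least-confident fraction of points and report per-group MSE on those accepted.
```python
import numpy as np

def selective_mse_by_group(y_true, y_pred, confidence, coverage, groups):
    """Keep only the most-confident `coverage` fraction, then report MSE per
    group on the accepted samples, exposing coverage-fairness interactions."""
    keep = np.argsort(-confidence)[: int(np.ceil(coverage * len(y_true)))]
    yt, yp, g = y_true[keep], y_pred[keep], groups[keep]
    return {gr: float(np.mean((yt[g == gr] - yp[g == gr]) ** 2))
            for gr in np.unique(g)}

# Toy usage with synthetic predictions, confidences, and a 20% minority group.
rng = np.random.default_rng(0)
y = rng.normal(size=200); yhat = y + rng.normal(scale=0.5, size=200)
conf = rng.random(200); grp = (rng.random(200) < 0.2).astype(int)
print(selective_mse_by_group(y, yhat, conf, coverage=0.5, groups=grp))
```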
- Unsupervised Learning of Debiased Representations with Pseudo-Attributes [85.5691102676175]
We propose a simple but effective debiasing technique in an unsupervised manner.
We perform clustering on the feature embedding space and identify pseudo-attributes by taking advantage of the clustering results.
We then employ a novel cluster-based reweighting scheme for learning debiased representations; a bare-bones version of this pipeline is sketched below.
arXiv Detail & Related papers (2021-08-06T05:20:46Z)
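A bare-bones rendering of the cluster-then-reweight recipe: cluster the embeddings, treat cluster labels as pseudo-attributes, and upweight small clusters. The inverse-cluster-size weighting here is my assumption; the paper's actual reweighting scheme may differ.
```python
import numpy as np
from sklearn.cluster import KMeans

def pseudo_attribute_weights(embeddings, k=8, seed=0):
    """Cluster feature embeddings into k pseudo-attributes and weight each
    sample inversely to its cluster's size (assumed scheme, see lead-in)."""
    km = KMeans(n_clusters=k, random_state=seed, n_init=10)
    labels = km.fit_predict(embeddings)
    counts = np.bincount(labels, minlength=k)
    return labels, len(embeddings) / (k * counts[labels])

# Two well-separated clusters with a 95/5 imbalance.
rng = np.random.default_rng(0)
Z = np.vstack([rng.normal(0, 1, (95, 2)), rng.normal(6, 1, (5, 2))])
labels, w = pseudo_attribute_weights(Z, k=2)
print(sorted(set(np.round(w, 2))))  # the small cluster gets the large weight
```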
- Interpolation can hurt robust generalization even when there is no noise [76.3492338989419]
We show that avoiding interpolation through ridge regularization can significantly improve robust generalization even in the absence of noise.
We prove this phenomenon for the robust risk of both linear regression and classification and hence provide the first theoretical result on robust overfitting; a small numerical sketch follows below.
arXiv Detail & Related papers (2021-08-05T23:04:15Z)
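For intuition, a small numerical sketch: under l_inf-bounded input perturbations of radius eps, the worst case inflates each linear-model residual by eps times the l1 norm of the weights, so we can compare the empirical robust loss of a min-norm interpolator against a ridge solution. The dimensions and constants are illustrative, not the paper's.
```python
import numpy as np

def robust_sq_loss(w, X, y, eps):
    """Empirical robust squared loss for a linear predictor: the worst-case
    l_inf perturbation of radius eps adds eps * ||w||_1 to each residual."""
    return np.mean((np.abs(X @ w - y) + eps * np.abs(w).sum()) ** 2)

rng = np.random.default_rng(0)
n, d = 20, 40                              # overparameterized, noiseless
X = rng.normal(size=(n, d))
w_star = np.zeros(d); w_star[0] = 1.0
y = X @ w_star

w_interp = np.linalg.pinv(X) @ y           # min-norm interpolating solution
w_ridge = np.linalg.solve(X.T @ X + np.eye(d), X.T @ y)   # ridge, lambda=1

X_te = rng.normal(size=(2000, d)); y_te = X_te @ w_star
for name, w in [("interpolate", w_interp), ("ridge", w_ridge)]:
    print(name, robust_sq_loss(w, X_te, y_te, eps=0.5))
```
The paper's theory predicts that the non-interpolating ridge solution fares better in robust risk even though the labels here are noiseless.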
- Deconfounding Scores: Feature Representations for Causal Effect Estimation with Weak Overlap [140.98628848491146]
We introduce deconfounding scores, which induce better overlap without biasing the target of estimation.
We show that deconfounding scores satisfy a zero-covariance condition that is identifiable in observed data.
In particular, we show that this technique could be an attractive alternative to standard regularizations.
arXiv Detail & Related papers (2021-04-12T18:50:11Z)
- Debiasing classifiers: is reality at variance with expectation? [9.730485257882433]
We show that debiasers often fail in practice to generalize to out-of-sample data and can in fact make fairness worse rather than better.
Considering fairness-performance trade-offs justifies the counterintuitive notion that partial debiasing can actually yield better results in practice on out-of-sample data.
arXiv Detail & Related papers (2020-11-04T17:00:54Z)
- Consistency Regularization for Certified Robustness of Smoothed Classifiers [89.72878906950208]
The recent technique of randomized smoothing has shown that worst-case $\ell_2$-robustness can be transformed into average-case robustness.
We found that the trade-off between accuracy and certified robustness of smoothed classifiers can be greatly controlled by simply regularizing the prediction consistency over noise; a generic version of such a consistency penalty is sketched below.
arXiv Detail & Related papers (2020-06-07T06:57:43Z)
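A generic instantiation of such a consistency penalty (my sketch, not necessarily the paper's exact objective): draw several Gaussian-noised copies of an input and penalize the KL divergence between each noisy prediction and their average.
```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def consistency_loss(logits_fn, x, sigma=0.25, m=4, rng=None):
    """Average KL(mean prediction || per-noise prediction) over m Gaussian-
    noised copies of x; small when the classifier is stable under noise."""
    rng = rng or np.random.default_rng(0)
    noisy = x + sigma * rng.normal(size=(m,) + x.shape)
    p = softmax(np.stack([logits_fn(xi) for xi in noisy]))
    p_bar = p.mean(axis=0)
    return float(np.mean(np.sum(p_bar * (np.log(p_bar) - np.log(p)), axis=-1)))

W = np.random.default_rng(1).normal(size=(5, 10))  # toy linear 5-class model
print(consistency_loss(lambda v: W @ v, np.ones(10)))
```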
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.