Are Ensembles Getting Better all the Time?
- URL: http://arxiv.org/abs/2311.17885v2
- Date: Wed, 20 Mar 2024 13:36:23 GMT
- Title: Are Ensembles Getting Better all the Time?
- Authors: Pierre-Alexandre Mattei, Damien Garreau,
- Abstract summary: We show that ensembles are getting better all the time if, and only if, the considered loss function is convex.
We illustrate our results on a medical prediction (diagnosing melanomas using neural nets) and a "wisdom crowds" experiment (guessing the ratings of upcoming movies)
- Score: 24.442955229765957
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ensemble methods combine the predictions of several base models. We study whether or not including more models always improves their average performance. This question depends on the kind of ensemble considered, as well as the predictive metric chosen. We focus on situations where all members of the ensemble are a priori expected to perform as well, which is the case of several popular methods such as random forests or deep ensembles. In this setting, we show that ensembles are getting better all the time if, and only if, the considered loss function is convex. More precisely, in that case, the average loss of the ensemble is a decreasing function of the number of models. When the loss function is nonconvex, we show a series of results that can be summarised as: ensembles of good models keep getting better, and ensembles of bad models keep getting worse. To this end, we prove a new result on the monotonicity of tail probabilities that may be of independent interest. We illustrate our results on a medical prediction problem (diagnosing melanomas using neural nets) and a "wisdom of crowds" experiment (guessing the ratings of upcoming movies).
Related papers
- Prediction Instability in Machine Learning Ensembles [0.0]
We prove a theorem that shows that any ensemble will exhibit at least one of the following forms of prediction instability.
It will either ignore agreement among all underlying models, change its mind when none of the underlying models have done so, or be manipulable through inclusion or exclusion of options it would never actually predict.
This analysis also sheds light on what specific forms of prediction instability to expect from particular ensemble algorithms.
arXiv Detail & Related papers (2024-07-03T15:26:02Z) - Confidence-Based Model Selection: When to Take Shortcuts for
Subpopulation Shifts [119.22672589020394]
We propose COnfidence-baSed MOdel Selection (CosMoS), where model confidence can effectively guide model selection.
We evaluate CosMoS on four datasets with spurious correlations, each with multiple test sets with varying levels of data distribution shift.
arXiv Detail & Related papers (2023-06-19T18:48:15Z) - When are ensembles really effective? [49.37269057899679]
We study the question of when ensembling yields significant performance improvements in classification tasks.
We show that ensembling improves performance significantly whenever the disagreement rate is large relative to the average error rate.
We identify practical scenarios where ensembling does and does not result in large performance improvements.
arXiv Detail & Related papers (2023-05-21T01:36:25Z) - Sharing pattern submodels for prediction with missing values [12.981974894538668]
Missing values are unavoidable in many applications of machine learning and present challenges both during training and at test time.
We propose an alternative approach, called sharing pattern submodels, which i) makes predictions robust to missing values at test time, ii) maintains or improves the predictive power of pattern submodels andiii) has a short description, enabling improved interpretability.
arXiv Detail & Related papers (2022-06-22T15:09:40Z) - Functional Ensemble Distillation [18.34081591772928]
We investigate how to best distill an ensemble's predictions using an efficient model.
We find that learning the distilled model via a simple augmentation scheme in the form of mixup augmentation significantly boosts the performance.
arXiv Detail & Related papers (2022-06-05T14:07:17Z) - Uncertainty Estimation for Language Reward Models [5.33024001730262]
Language models can learn a range of capabilities from unsupervised training on text corpora.
It is often easier for humans to choose between options than to provide labeled data, and prior work has achieved state-of-the-art performance by training a reward model from such preference comparisons.
We seek to address these problems via uncertainty estimation, which can improve sample efficiency and robustness using active learning and risk-averse reinforcement learning.
arXiv Detail & Related papers (2022-03-14T20:13:21Z) - Causal Inference Under Unmeasured Confounding With Negative Controls: A
Minimax Learning Approach [84.29777236590674]
We study the estimation of causal parameters when not all confounders are observed and instead negative controls are available.
Recent work has shown how these can enable identification and efficient estimation via two so-called bridge functions.
arXiv Detail & Related papers (2021-03-25T17:59:19Z) - Flexible Model Aggregation for Quantile Regression [92.63075261170302]
Quantile regression is a fundamental problem in statistical learning motivated by a need to quantify uncertainty in predictions.
We investigate methods for aggregating any number of conditional quantile models.
All of the models we consider in this paper can be fit using modern deep learning toolkits.
arXiv Detail & Related papers (2021-02-26T23:21:16Z) - Characterizing Fairness Over the Set of Good Models Under Selective
Labels [69.64662540443162]
We develop a framework for characterizing predictive fairness properties over the set of models that deliver similar overall performance.
We provide tractable algorithms to compute the range of attainable group-level predictive disparities.
We extend our framework to address the empirically relevant challenge of selectively labelled data.
arXiv Detail & Related papers (2021-01-02T02:11:37Z) - Ambiguity in Sequential Data: Predicting Uncertain Futures with
Recurrent Models [110.82452096672182]
We propose an extension of the Multiple Hypothesis Prediction (MHP) model to handle ambiguous predictions with sequential data.
We also introduce a novel metric for ambiguous problems, which is better suited to account for uncertainties.
arXiv Detail & Related papers (2020-03-10T09:15:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.