Joint Training of Deep Ensembles Fails Due to Learner Collusion
- URL: http://arxiv.org/abs/2301.11323v2
- Date: Tue, 31 Oct 2023 12:01:36 GMT
- Title: Joint Training of Deep Ensembles Fails Due to Learner Collusion
- Authors: Alan Jeffares, Tennison Liu, Jonathan Crabbé, Mihaela van der Schaar
- Abstract summary: Ensembles of machine learning models have been well established as a powerful method of improving performance over a single model.
Traditionally, ensembling algorithms train their base learners independently or sequentially with the goal of optimizing their joint performance.
Surprisingly, directly minimizing the loss of the ensemble appears to rarely be applied in practice; we show this is for good reason, as joint optimization leads base learners to collude and artificially inflate their apparent diversity.
- Score: 61.557412796012535
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ensembles of machine learning models have been well established as a powerful
method of improving performance over a single model. Traditionally, ensembling
algorithms train their base learners independently or sequentially with the
goal of optimizing their joint performance. In the case of deep ensembles of
neural networks, we are provided with the opportunity to directly optimize the
true objective: the joint performance of the ensemble as a whole. Surprisingly,
however, directly minimizing the loss of the ensemble appears to rarely be
applied in practice. Instead, most previous research trains individual models
independently with ensembling performed post hoc. In this work, we show that
this is for good reason - joint optimization of ensemble loss results in
degenerate behavior. We approach this problem by decomposing the ensemble
objective into the strength of the base learners and the diversity between
them. We discover that joint optimization results in a phenomenon in which base
learners collude to artificially inflate their apparent diversity. This
pseudo-diversity fails to generalize beyond the training data, causing a larger
generalization gap. We proceed to comprehensively demonstrate the practical
implications of this effect on a range of standard machine learning tasks and
architectures by smoothly interpolating between independent training and joint
optimization.
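The decomposition and the interpolation described in the abstract can be made concrete with a small sketch. The code below is an illustration rather than the paper's implementation: the toy regression data, the small MLP members, and the convex combination with coefficient beta (beta = 0 recovers independent training, beta = 1 the joint ensemble objective) are all assumptions. For squared error and an averaging ensemble, the gap between the average member loss and the ensemble loss is the average ambiguity of Krogh & Vedelsby, which plays the role of the diversity term.
```python
# Minimal sketch (not the paper's exact protocol): a convex combination with
# coefficient beta interpolates between independent training of the base
# learners (beta = 0) and joint optimization of the averaged ensemble (beta = 1).
# The synthetic regression data and small MLP members are assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)
X, y = torch.randn(256, 10), torch.randn(256, 1)  # toy regression data

M = 5  # number of base learners
members = nn.ModuleList(
    [nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1)) for _ in range(M)]
)
opt = torch.optim.SGD(members.parameters(), lr=1e-2)
mse = nn.MSELoss()
beta = 0.5  # 0.0 = fully independent training, 1.0 = fully joint optimization

for step in range(200):
    preds = [m(X) for m in members]
    ensemble_pred = torch.stack(preds).mean(dim=0)  # simple averaging ensemble

    member_loss = torch.stack([mse(p, y) for p in preds]).mean()  # "strength" term
    ensemble_loss = mse(ensemble_pred, y)  # the true joint objective

    # For squared error and an averaging ensemble, this gap equals the average
    # ambiguity (diversity); joint optimization can inflate it on the training
    # data without any corresponding gain in generalization.
    diversity = member_loss - ensemble_loss

    loss = beta * ensemble_loss + (1.0 - beta) * member_loss
    opt.zero_grad()
    loss.backward()
    opt.step()

    if step % 50 == 0:
        print(f"step {step}: ensemble={ensemble_loss.item():.4f} "
              f"avg member={member_loss.item():.4f} diversity={diversity.item():.4f}")
```
Sweeping beta from 0 to 1 is one way to probe the effect described above; measuring the diversity term on held-out data rather than only on the training set is what separates genuine diversity from the pseudo-diversity the paper warns about.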
Related papers
- Dynamic Post-Hoc Neural Ensemblers [55.15643209328513]
In this study, we explore employing neural networks as ensemble methods.
Motivated by the risk of learning low-diversity ensembles, we propose regularizing the model by randomly dropping base model predictions.
We demonstrate that this approach lower-bounds the diversity within the ensemble, reducing overfitting and improving generalization capabilities (a minimal sketch of this prediction-dropout idea follows this entry).
arXiv Detail & Related papers (2024-10-06T15:25:39Z)
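As a companion to the entry above, here is an illustrative sketch of randomly dropping base model predictions before aggregation. The drop probability, the renormalized averaging, and the function name are assumptions for illustration, not the paper's actual ensembler.
```python
# Illustrative sketch (assumed details): drop each base model's prediction with
# probability p_drop during training, then average the surviving predictions.
import torch

def ensemble_with_prediction_dropout(preds: torch.Tensor, p_drop: float = 0.3,
                                     training: bool = True) -> torch.Tensor:
    """preds: (M, N, C) stacked base-model predictions; returns (N, C)."""
    if not training or p_drop == 0.0:
        return preds.mean(dim=0)
    M = preds.shape[0]
    keep = torch.rand(M) > p_drop          # which members survive this forward pass
    if not keep.any():                     # always keep at least one member
        keep[torch.randint(M, (1,))] = True
    return preds[keep].mean(dim=0)         # average over the surviving members

# Example: 5 base models, 8 examples, 3 classes
out = ensemble_with_prediction_dropout(torch.randn(5, 8, 3))
```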
- The Curse of Diversity in Ensemble-Based Exploration [7.209197316045156]
Training a diverse ensemble of data-sharing agents can significantly impair the performance of the individual ensemble members.
We name this phenomenon the curse of diversity.
We demonstrate the potential of representation learning to counteract the curse of diversity.
arXiv Detail & Related papers (2024-05-07T14:14:50Z)
- Diversified Ensemble of Independent Sub-Networks for Robust Self-Supervised Representation Learning [10.784911682565879]
Ensembling a neural network is a widely recognized approach to enhance model performance, estimate uncertainty, and improve robustness in deep supervised learning.
We present a novel self-supervised training regime that leverages an ensemble of independent sub-networks.
Our method efficiently builds a sub-model ensemble with high diversity, leading to well-calibrated estimates of model uncertainty.
arXiv Detail & Related papers (2023-08-28T16:58:44Z)
- Proof of Swarm Based Ensemble Learning for Federated Learning Applications [3.2536767864585663]
In federated learning it is not feasible to apply centralised ensemble learning directly due to privacy concerns.
Most distributed consensus algorithms, such as Byzantine fault tolerance (BFT), do not normally perform well in such applications.
We propose PoSw, a novel distributed consensus algorithm for ensemble learning in a federated setting.
arXiv Detail & Related papers (2022-12-28T13:53:34Z)
- Deep Negative Correlation Classification [82.45045814842595]
Existing deep ensemble methods naively train many different models and then aggregate their predictions.
We propose deep negative correlation classification (DNCC).
DNCC yields a deep classification ensemble in which each individual estimator is both accurate and negatively correlated with the others; a sketch of the classical negative-correlation penalty is given below for reference.
arXiv Detail & Related papers (2022-12-14T07:35:20Z)
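For reference alongside the DNCC entry above, the classical negative correlation learning penalty of Liu & Yao (1999) is sketched below for a regression ensemble. This is not DNCC's actual classification objective; the penalty weight lam and the tensor shapes are assumptions.
```python
# Classical negative correlation learning penalty (Liu & Yao, 1999), shown as a
# reference point for DNCC; NOT that paper's exact objective. lam is assumed.
import torch

def ncl_loss(preds: torch.Tensor, y: torch.Tensor, lam: float = 0.5) -> torch.Tensor:
    """preds: (M, N, 1) member predictions; y: (N, 1) regression targets."""
    f_bar = preds.mean(dim=0, keepdim=True)      # ensemble mean, shape (1, N, 1)
    sq_err = ((preds - y) ** 2).mean()           # average member squared error
    # p_i = (f_i - f_bar) * sum_{j != i}(f_j - f_bar) = -(f_i - f_bar)^2, so a
    # positive lam rewards members for deviating from the ensemble mean.
    penalty = -((preds - f_bar) ** 2).mean()
    return sq_err + lam * penalty

# Example usage with random member predictions
loss = ncl_loss(torch.randn(5, 32, 1), torch.randn(32, 1))
```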
- Revisiting Consistency Regularization for Semi-Supervised Learning [80.28461584135967]
We propose an improved consistency regularization framework built on a simple yet effective technique, FeatDistLoss.
Experimental results show that our model defines a new state of the art for various datasets and settings.
arXiv Detail & Related papers (2021-12-10T20:46:13Z)
- Task-Feature Collaborative Learning with Application to Personalized Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL).
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
As a practical extension, we extend the base model by allowing overlapping features and differentiating the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z)
- Federated Residual Learning [53.77128418049985]
We study a new form of federated learning where the clients train personalized local models and make predictions jointly with the server-side shared model.
Using this new federated learning framework, the complexity of the central shared model can be minimized while still gaining all the performance benefits that joint training provides.
arXiv Detail & Related papers (2020-03-28T19:55:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.