Sharpness-diversity tradeoff: improving flat ensembles with SharpBalance
- URL: http://arxiv.org/abs/2407.12996v1
- Date: Wed, 17 Jul 2024 20:31:26 GMT
- Title: Sharpness-diversity tradeoff: improving flat ensembles with SharpBalance
- Authors: Haiquan Lu, Xiaotian Liu, Yefan Zhou, Qunli Li, Kurt Keutzer, Michael W. Mahoney, Yujun Yan, Huanrui Yang, Yaoqing Yang
- Abstract summary: We show the interplay between sharpness and diversity within deep ensembles.
We introduce SharpBalance, a training approach that balances sharpness and diversity within ensembles.
Empirically, we show that SharpBalance not only effectively improves the sharpness-diversity trade-off, but also significantly improves ensemble performance.
- Score: 60.68771286221115
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies on deep ensembles have identified the sharpness of the local minima of individual learners and the diversity of the ensemble members as key factors in improving test-time performance. Building on this, our study investigates the interplay between sharpness and diversity within deep ensembles, illustrating their crucial role in robust generalization to both in-distribution (ID) and out-of-distribution (OOD) data. We discover a trade-off between sharpness and diversity: minimizing the sharpness in the loss landscape tends to diminish the diversity of individual members within the ensemble, adversely affecting the ensemble's improvement. The trade-off is justified through our theoretical analysis and verified empirically through extensive experiments. To address the issue of reduced diversity, we introduce SharpBalance, a novel training approach that balances sharpness and diversity within ensembles. Theoretically, we show that our training strategy achieves a better sharpness-diversity trade-off. Empirically, we conducted comprehensive evaluations in various data sets (CIFAR-10, CIFAR-100, TinyImageNet) and showed that SharpBalance not only effectively improves the sharpness-diversity trade-off, but also significantly improves ensemble performance in ID and OOD scenarios.
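The two quantities the abstract trades off can be illustrated with a small sketch. This is not the paper's definition of either quantity: it assumes diversity is measured as mean pairwise disagreement between members' predicted labels, and sharpness as the largest loss increase found among random weight perturbations of a fixed norm. The function names, `rho`, and `n_samples` are hypothetical and chosen only for illustration.

```python
import numpy as np

def disagreement(preds_a, preds_b):
    """Fraction of samples on which two members predict different labels."""
    return float(np.mean(np.asarray(preds_a) != np.asarray(preds_b)))

def ensemble_diversity(member_preds):
    """Mean pairwise disagreement over all pairs of ensemble members (assumed metric)."""
    m = len(member_preds)
    scores = [
        disagreement(member_preds[i], member_preds[j])
        for i in range(m)
        for j in range(i + 1, m)
    ]
    return float(np.mean(scores))

def sharpness_proxy(loss_fn, w, rho=0.05, n_samples=64, seed=0):
    """Largest loss increase among random perturbations of norm rho (assumed proxy)."""
    rng = np.random.default_rng(seed)
    base = loss_fn(w)
    worst = 0.0
    for _ in range(n_samples):
        d = rng.normal(size=w.shape)
        d *= rho / np.linalg.norm(d)  # rescale the direction to norm rho
        worst = max(worst, loss_fn(w + d) - base)
    return worst

# Toy usage: three members' label predictions on four samples.
preds = [[0, 1, 1, 0], [0, 1, 0, 0], [1, 1, 1, 0]]
diversity = ensemble_diversity(preds)

# Two quadratic losses with the same minimum but different curvature.
w_star = np.zeros(3)
sharp = sharpness_proxy(lambda w: 10.0 * np.sum(w ** 2), w_star)
flat = sharpness_proxy(lambda w: 0.1 * np.sum(w ** 2), w_star)
```

Under these assumed metrics, a flatter minimum yields a smaller `sharpness_proxy` value, and members that agree on more samples yield a lower `ensemble_diversity`. The trade-off described in the abstract is that pushing the first quantity down tends to push the second down as well.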
Related papers
- Out-Of-Distribution Detection with Diversification (Provably) [75.44158116183483]
Out-of-distribution (OOD) detection is crucial for ensuring reliable deployment of machine learning models.
Recent advancements focus on utilizing easily accessible auxiliary outliers (e.g., data from the web or other datasets) in training.
We propose a theoretical guarantee, termed Diversity-induced Mixup for OOD detection (diverseMix), which enhances the diversity of auxiliary outlier set for training.
arXiv Detail & Related papers (2024-11-21T11:56:32Z)
- The Curse of Diversity in Ensemble-Based Exploration [7.209197316045156]
Training a diverse ensemble of data-sharing agents can significantly impair the performance of the individual ensemble members.
We name this phenomenon the curse of diversity.
We demonstrate the potential of representation learning to counteract the curse of diversity.
arXiv Detail & Related papers (2024-05-07T14:14:50Z)
- Diversity-Aware Agnostic Ensemble of Sharpness Minimizers [24.160975100349376]
We propose DASH - a learning algorithm that promotes diversity and flatness within deep ensembles.
We provide a theoretical backbone for our method along with extensive empirical evidence demonstrating an improvement in ensemble generalizability.
arXiv Detail & Related papers (2024-03-19T23:50:11Z)
- Relaxed Contrastive Learning for Federated Learning [48.96253206661268]
We propose a novel contrastive learning framework to address the challenges of data heterogeneity in federated learning.
Our framework outperforms all existing federated learning approaches by huge margins on the standard benchmarks.
arXiv Detail & Related papers (2024-01-10T04:55:24Z)
- A Unifying Perspective on Multi-Calibration: Game Dynamics for Multi-Objective Learning [63.20009081099896]
We provide a unifying framework for the design and analysis of multicalibrated predictors.
We exploit connections to game dynamics to achieve state-of-the-art guarantees for a diverse set of multicalibration learning problems.
arXiv Detail & Related papers (2023-02-21T18:24:17Z)
- Enhancing Adversarial Training with Feature Separability [52.39305978984573]
We introduce the concept of an adversarial training graph (ATG), with which the proposed adversarial training with feature separability (ATFS) boosts intra-class feature similarity and increases inter-class feature variance.
Through comprehensive experiments, we demonstrate that the proposed ATFS framework significantly improves both clean and robust performance.
arXiv Detail & Related papers (2022-05-02T04:04:23Z)
- Improving robustness and calibration in ensembles with diversity regularization [1.069533806668766]
We introduce a new diversity regularizer for classification tasks that uses out-of-distribution samples.
We show that regularizing diversity can have a significant impact on calibration and robustness, as well as out-of-distribution detection.
arXiv Detail & Related papers (2022-01-26T12:51:11Z)
- Neural Network Ensembles: Theory, Training, and the Importance of Explicit Diversity [6.495473856599276]
Ensemble learning is a process by which multiple base learners are strategically generated and combined into one composite learner.
The right balance of learner accuracy and ensemble diversity can improve the performance of machine learning tasks on benchmark and real-world data sets.
Recent theoretical and practical work has demonstrated the subtle trade-off between accuracy and diversity in an ensemble.
arXiv Detail & Related papers (2021-09-29T00:43:57Z)
- DICE: Diversity in Deep Ensembles via Conditional Redundancy Adversarial Estimation [109.11580756757611]
Deep ensembles perform better than a single network thanks to the diversity among their members.
Recent approaches regularize predictions to increase diversity; however, they also drastically decrease individual members' performances.
We introduce a novel training criterion called DICE: it increases diversity by reducing spurious correlations among features.
arXiv Detail & Related papers (2021-01-14T10:53:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.