Divergent Ensemble Networks: Enhancing Uncertainty Estimation with Shared Representations and Independent Branching
- URL: http://arxiv.org/abs/2412.01193v3
- Date: Fri, 20 Dec 2024 09:46:11 GMT
- Title: Divergent Ensemble Networks: Enhancing Uncertainty Estimation with Shared Representations and Independent Branching
- Authors: Arnav Kharbanda, Advait Chandorkar
- Abstract summary: Divergent Ensemble Network (DEN) is a novel architecture that combines shared representation learning with independent branching.
DEN employs a shared input layer to capture common features across all branches, followed by divergent, independently trainable layers that form an ensemble.
This shared-to-branching structure reduces parameter redundancy while maintaining ensemble diversity, enabling efficient and scalable learning.
- Score: 0.9963916732353794
- Abstract: Ensemble learning has proven effective in improving predictive performance and estimating uncertainty in neural networks. However, conventional ensemble methods often suffer from redundant parameter usage and computational inefficiencies due to entirely independent network training. To address these challenges, we propose the Divergent Ensemble Network (DEN), a novel architecture that combines shared representation learning with independent branching. DEN employs a shared input layer to capture common features across all branches, followed by divergent, independently trainable layers that form an ensemble. This shared-to-branching structure reduces parameter redundancy while maintaining ensemble diversity, enabling efficient and scalable learning.
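The shared-to-branching structure maps naturally onto a few lines of PyTorch. Below is a minimal sketch of the idea; the layer sizes, branch count, and uncertainty readout are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn


class DivergentEnsembleNet(nn.Module):
    """A shared input layer followed by independently trainable branches."""

    def __init__(self, in_dim=784, hidden_dim=256, out_dim=10, n_branches=5):
        super().__init__()
        # Shared layer: captures features common to all ensemble members.
        self.shared = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
        # Divergent branches: each one is an independent ensemble member.
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
                nn.Linear(hidden_dim, out_dim),
            )
            for _ in range(n_branches)
        )

    def forward(self, x):
        h = self.shared(x)
        # Member logits stacked as (n_branches, batch, out_dim).
        return torch.stack([branch(h) for branch in self.branches])


model = DivergentEnsembleNet()
logits = model(torch.randn(32, 784))
probs = logits.softmax(dim=-1)
ensemble_pred = probs.mean(dim=0)   # averaged ensemble prediction
disagreement = probs.var(dim=0)     # cross-branch spread as an uncertainty signal
```

Because the trunk is shared, the per-member cost is only the branch parameters, which is where the savings over fully independent networks come from.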
Related papers
- Designing Semi-Structured Pruning of Graph Convolutional Networks for Skeleton-based Recognition [5.656581242851759]
Pruning is one of the lightweight network design techniques that operate by removing unnecessary network parts.
In this paper, we devise a novel semi-structured method that discards the downsides of structured and unstructured pruning.
The proposed solution is based on a differentiable cascaded parametrization which combines (i) a band-stop mechanism that prunes weights depending on their magnitudes, (ii) a weight-sharing parametrization that prunes connections either individually or group-wise, and (iii) a gating mechanism which arbitrates between different group-wise and entry-wise pruning.
arXiv Detail & Related papers (2024-12-16T14:29:31Z)
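The band-stop mechanism in (i) above can be pictured as a differentiable magnitude gate that smoothly zeroes out weights below a learnable threshold. A toy sketch follows; the sigmoid gate and its sharpness are illustrative assumptions, not the paper's exact parametrization.

```python
import torch
import torch.nn as nn


class BandStopGate(nn.Module):
    """Soft gate: ~0 for weights below a learnable magnitude threshold, ~1 above."""

    def __init__(self, init_threshold=0.1, sharpness=50.0):
        super().__init__()
        self.threshold = nn.Parameter(torch.tensor(init_threshold))
        self.sharpness = sharpness  # controls how hard the gate cuts off

    def forward(self, weight):
        # Differentiable, so the threshold is learned jointly with the weights.
        gate = torch.sigmoid(self.sharpness * (weight.abs() - self.threshold))
        return weight * gate


layer = nn.Linear(128, 64)
gate = BandStopGate()
pruned_weight = gate(layer.weight)  # substitute for layer.weight in the forward pass
```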
- LoRA-Ensemble: Efficient Uncertainty Modelling for Self-attention Networks [52.46420522934253]
We introduce LoRA-Ensemble, a parameter-efficient deep ensemble method for self-attention networks.
By employing a single pre-trained self-attention network with weights shared across all members, we train member-specific low-rank matrices for the attention projections.
Our method exhibits superior calibration compared to explicit ensembles and achieves similar or better accuracy across various prediction tasks and datasets.
arXiv Detail & Related papers (2024-05-23T11:10:32Z)
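A minimal sketch of the member-specific low-rank idea above, applied to a single shared linear projection; the dimensions, rank, member count, and initialization are assumptions for illustration.

```python
import torch
import torch.nn as nn


class LoRAEnsembleLinear(nn.Module):
    """One frozen shared projection plus a low-rank update per ensemble member."""

    def __init__(self, dim=512, rank=4, n_members=8):
        super().__init__()
        self.shared = nn.Linear(dim, dim, bias=False)
        self.shared.weight.requires_grad_(False)      # shared weights stay frozen
        # Member-specific factors: effective weight W_i = W + B_i @ A_i.
        self.A = nn.Parameter(torch.randn(n_members, rank, dim) * 0.01)
        self.B = nn.Parameter(torch.zeros(n_members, dim, rank))

    def forward(self, x, member):
        delta = self.B[member] @ self.A[member]       # (dim, dim), rank-limited
        return self.shared(x) + x @ delta.T


layer = LoRAEnsembleLinear()
x = torch.randn(16, 512)
preds = torch.stack([layer(x, m) for m in range(8)])  # one pass per member
```

Because only the low-rank factors differ across members, the ensemble's parameter overhead grows with the rank, not with the full projection size.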
- Diversified Ensemble of Independent Sub-Networks for Robust Self-Supervised Representation Learning [10.784911682565879]
Ensembling a neural network is a widely recognized approach to enhance model performance, estimate uncertainty, and improve robustness in deep supervised learning.
We present a novel self-supervised training regime that leverages an ensemble of independent sub-networks.
Our method efficiently builds a sub-model ensemble with high diversity, leading to well-calibrated estimates of model uncertainty.
arXiv Detail & Related papers (2023-08-28T16:58:44Z)
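Whatever the training regime, an ensemble of independent sub-networks yields a predictive distribution whose spread can be decomposed into uncertainty estimates. A generic sketch of that readout (not this paper's method), using predictive entropy and member disagreement:

```python
import torch

# member_logits: (n_members, batch, n_classes), e.g. from independent sub-networks
member_logits = torch.randn(5, 32, 10)
probs = member_logits.softmax(dim=-1)

mean_probs = probs.mean(dim=0)
# Total uncertainty: entropy of the averaged predictive distribution.
total = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=-1)
# Expected per-member entropy: the aleatoric (data) component.
aleatoric = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1).mean(dim=0)
# Mutual information: member disagreement, the epistemic (model) component.
epistemic = total - aleatoric
```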
- Joint Training of Deep Ensembles Fails Due to Learner Collusion [61.557412796012535]
Ensembles of machine learning models have been well established as a powerful method of improving performance over a single model.
Traditionally, ensembling algorithms train their base learners independently or sequentially with the goal of optimizing their joint performance.
We show that directly minimizing the loss of the ensemble is rarely applied in practice, and that doing so fails due to learner collusion.
arXiv Detail & Related papers (2023-01-26T18:58:07Z)
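The distinction at stake above fits in a few lines: averaging per-member losses (the common recipe) versus minimizing the loss of the averaged prediction (true joint training). A minimal illustration, with shapes chosen for the example:

```python
import torch
import torch.nn.functional as F

member_logits = torch.randn(5, 32, 10, requires_grad=True)  # (members, batch, classes)
targets = torch.randint(0, 10, (32,))

# Common practice: each member minimizes its own loss independently.
independent_loss = torch.stack(
    [F.cross_entropy(m, targets) for m in member_logits]
).mean()

# Joint training: minimize the loss of the ensemble's averaged prediction,
# using log(mean_m p_m) = logsumexp_m(log p_m) - log(M).
ensemble_log_probs = (
    member_logits.log_softmax(dim=-1).logsumexp(dim=0) - torch.log(torch.tensor(5.0))
)
joint_loss = F.nll_loss(ensemble_log_probs, targets)
```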
- Deep Negative Correlation Classification [82.45045814842595]
Existing deep ensemble methods naively train many different models and then aggregate their predictions.
We propose deep negative correlation classification (DNCC).
DNCC yields a deep classification ensemble in which each individual estimator is both accurate and negatively correlated with the others.
arXiv Detail & Related papers (2022-12-14T07:35:20Z)
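Negative correlation learning penalizes members for erring in the same direction. The classical penalty of Liu and Yao is sketched below as a generic stand-in; it is not DNCC's exact objective, and the weighting `lam` is an illustrative choice.

```python
import torch
import torch.nn.functional as F

member_preds = torch.randn(5, 32, 1, requires_grad=True)  # regression outputs
targets = torch.randn(32, 1)
lam = 0.5

mean_pred = member_preds.mean(dim=0, keepdim=True)
mse = F.mse_loss(member_preds, targets.expand_as(member_preds))
# NC term: (f_i - f_bar) * sum_{j!=i}(f_j - f_bar) simplifies to -(f_i - f_bar)^2,
# so minimizing it spreads members apart around the ensemble mean.
nc_penalty = -((member_preds - mean_pred) ** 2).mean()
loss = mse + lam * nc_penalty
```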
- On Multi-head Ensemble of Smoothed Classifiers for Certified Robustness [25.021346715099863]
Randomized Smoothing (RS) is a promising technique for certified robustness.
In this work, we augment the network with multiple heads, each of which serves as a classifier in the ensemble.
A novel training strategy, namely Self-PAced Circular-TEaching (SPACTE), is proposed.
arXiv Detail & Related papers (2022-11-20T06:31:53Z)
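Randomized smoothing classifies by majority vote over Gaussian perturbations of the input; with multiple heads, every noisy sample is voted on by every head. A bare-bones sketch (the architecture is a placeholder, and SPACTE's circular-teaching training schedule is omitted entirely):

```python
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU())
heads = nn.ModuleList(nn.Linear(256, 10) for _ in range(3))  # ensemble of heads

def smoothed_predict(x, sigma=0.25, n_samples=100):
    votes = torch.zeros(10)
    with torch.no_grad():
        for _ in range(n_samples):
            noisy = x + sigma * torch.randn_like(x)  # Gaussian smoothing noise
            feats = backbone(noisy)
            for head in heads:                       # every head casts a vote
                votes[head(feats).argmax(dim=-1)] += 1
    return votes.argmax()  # majority class of the smoothed ensemble

pred = smoothed_predict(torch.randn(1, 784))
```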
- Multi-task Over-the-Air Federated Learning: A Non-Orthogonal Transmission Approach [52.85647632037537]
We propose a multi-task over-the-air federated learning (MOAFL) framework, where multiple learning tasks share edge devices for data collection and model learning under the coordination of an edge server (ES).
Both the convergence analysis and numerical results demonstrate that the MOAFL framework can significantly reduce the uplink bandwidth consumption of multiple tasks without causing substantial learning performance degradation.
arXiv Detail & Related papers (2021-06-27T13:09:32Z)
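Over-the-air computation exploits the fact that simultaneously transmitted analog signals superpose on the wireless channel, so the server receives the sum of all local updates in a single channel use. A toy NumPy sketch; the perfect-channel-inversion and noise assumptions are simplifications, not the paper's transmission design.

```python
import numpy as np

rng = np.random.default_rng(0)
n_devices, dim = 8, 1000
local_updates = rng.normal(size=(n_devices, dim))  # one model update per device

# All devices transmit at once; their signals add up on the channel.
noise = rng.normal(scale=0.01, size=dim)           # additive receiver noise
received = local_updates.sum(axis=0) + noise       # one channel use, not n_devices

global_update = received / n_devices               # noisy average at the server
error = np.linalg.norm(global_update - local_updates.mean(axis=0))
```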
- An Ensemble with Shared Representations Based on Convolutional Networks for Continually Learning Facial Expressions [19.72032908764253]
Semi-supervised learning through ensemble predictions is an efficient strategy for leveraging the abundance of unlabelled facial expressions encountered during human-robot interactions.
Traditional ensemble-based systems are composed of several independent classifiers, leading to a high degree of redundancy.
We show that our approach is able to continually learn facial expressions through ensemble predictions using unlabelled samples from different data distributions.
arXiv Detail & Related papers (2021-03-05T20:40:52Z)
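Learning from ensemble predictions on unlabelled data usually means pseudo-labelling the samples the ensemble is confident about. A generic sketch of that step; the averaging rule and confidence threshold are assumptions, not the paper's recipe.

```python
import torch

def pseudo_label(member_logits, threshold=0.9):
    """Keep only unlabelled samples the ensemble is confident about."""
    probs = member_logits.softmax(dim=-1).mean(dim=0)  # averaged prediction
    conf, labels = probs.max(dim=-1)
    keep = conf > threshold        # confident samples become training targets
    return labels[keep], keep

member_logits = torch.randn(4, 64, 7)  # 4 members, 64 unlabelled faces, 7 classes
labels, keep = pseudo_label(member_logits)
```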
- DICE: Diversity in Deep Ensembles via Conditional Redundancy Adversarial Estimation [109.11580756757611]
Deep ensembles perform better than a single network thanks to the diversity among their members.
Recent approaches regularize predictions to increase diversity; however, they also drastically decrease individual members' performances.
We introduce a novel training criterion called DICE: it increases diversity by reducing spurious correlations among features.
arXiv Detail & Related papers (2021-01-14T10:53:26Z)
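DICE estimates conditional redundancy with an adversarial estimator; as a much simpler stand-in for the underlying intuition, reducing correlation between members' features, one can penalize cross-member feature correlation directly. This sketch is not DICE itself.

```python
import torch

feats_a = torch.randn(32, 128, requires_grad=True)  # features from member A
feats_b = torch.randn(32, 128, requires_grad=True)  # features from member B

# Standardize per feature dimension, then measure cross-member correlation.
a = (feats_a - feats_a.mean(0)) / (feats_a.std(0) + 1e-6)
b = (feats_b - feats_b.mean(0)) / (feats_b.std(0) + 1e-6)
cross_corr = (a.T @ b) / a.shape[0]            # (128, 128) correlation matrix
redundancy_penalty = (cross_corr ** 2).mean()  # add to the training loss
```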
- Neural Ensemble Search for Uncertainty Estimation and Dataset Shift [67.57720300323928]
Ensembles of neural networks achieve superior performance compared to stand-alone networks in terms of accuracy, uncertainty calibration and robustness to dataset shift.
We propose two methods for automatically constructing ensembles with varying architectures.
We show that the resulting ensembles outperform deep ensembles not only in terms of accuracy but also uncertainty calibration and robustness to dataset shift.
arXiv Detail & Related papers (2020-06-15T17:38:15Z)
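Neural ensemble search scores candidate architectures by the ensemble's validation performance rather than each member's own. A toy greedy-selection sketch; the candidate pool, untrained models, and selection rule are placeholders for the paper's two methods.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_candidate(width, depth):
    layers = [nn.Linear(10, width), nn.ReLU()]
    for _ in range(depth - 1):
        layers += [nn.Linear(width, width), nn.ReLU()]
    return nn.Sequential(*layers, nn.Linear(width, 2))

# Pool of varying architectures (train each on real data in practice).
pool = [make_candidate(w, d) for w in (16, 32, 64) for d in (1, 2, 3)]
x_val, y_val = torch.randn(100, 10), torch.randint(0, 2, (100,))

def ensemble_val_loss(members):
    logits = torch.stack([m(x_val) for m in members]).mean(dim=0)
    return F.cross_entropy(logits, y_val).item()

# Greedy forward selection on the *ensemble's* validation loss.
ensemble = []
for _ in range(3):
    best = min(pool, key=lambda m: ensemble_val_loss(ensemble + [m]))
    ensemble.append(best)
```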