Diversified Ensemble of Independent Sub-Networks for Robust
Self-Supervised Representation Learning
- URL: http://arxiv.org/abs/2308.14705v2
- Date: Fri, 1 Sep 2023 11:38:56 GMT
- Title: Diversified Ensemble of Independent Sub-Networks for Robust
Self-Supervised Representation Learning
- Authors: Amirhossein Vahidi, Lisa Wimmer, Hüseyin Anil Gündüz, Bernd
Bischl, Eyke Hüllermeier, Mina Rezaei
- Abstract summary: Ensembling a neural network is a widely recognized approach to enhance model performance, estimate uncertainty, and improve robustness in deep supervised learning.
We present a novel self-supervised training regime that leverages an ensemble of independent sub-networks.
Our method efficiently builds a sub-model ensemble with high diversity, leading to well-calibrated estimates of model uncertainty.
- Score: 10.784911682565879
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ensembling a neural network is a widely recognized approach to enhance model
performance, estimate uncertainty, and improve robustness in deep supervised
learning. However, deep ensembles often come with high computational costs and
memory demands. In addition, the efficiency of a deep ensemble is tied to the
diversity among its members, which is difficult to achieve for large,
over-parameterized deep neural networks. Moreover, ensemble learning has not
yet seen comparably widespread adoption in self-supervised or unsupervised
representation learning, where it remains a challenging endeavor. Motivated by these
challenges, we present a novel self-supervised training regime that leverages
an ensemble of independent sub-networks, complemented by a new loss function
designed to encourage diversity. Our method efficiently builds a sub-model
ensemble with high diversity, leading to well-calibrated estimates of model
uncertainty, all achieved with minimal computational overhead compared to
traditional deep self-supervised ensembles. To evaluate the effectiveness of
our approach, we conducted extensive experiments across various tasks,
including in-distribution generalization, out-of-distribution detection,
dataset corruption, and semi-supervised settings. The results demonstrate that
our method significantly improves prediction reliability. Our approach not only
achieves excellent accuracy but also enhances calibration, surpassing baseline
performance across a wide range of self-supervised architectures in computer
vision, natural language processing, and genomics data.
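As a rough illustration of the idea, the following PyTorch sketch builds a shared encoder with several lightweight, independent sub-network heads, trains each head with a SimCLR-style contrastive loss, and adds a simple pairwise-decorrelation penalty as a stand-in for the paper's diversity-encouraging loss. The architecture, loss choices, and hyperparameters are assumptions for illustration, not the authors' implementation.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SubNetworkEnsemble(nn.Module):
    """Shared encoder with k independent projection heads (illustrative)."""
    def __init__(self, backbone, feat_dim, proj_dim=128, n_members=4):
        super().__init__()
        self.backbone = backbone                      # shared feature extractor
        self.heads = nn.ModuleList([                  # independent sub-networks
            nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.ReLU(),
                          nn.Linear(feat_dim, proj_dim))
            for _ in range(n_members)
        ])

    def forward(self, x):
        h = self.backbone(x)
        return [F.normalize(head(h), dim=-1) for head in self.heads]

def nt_xent(z1, z2, tau=0.5):
    """SimCLR-style contrastive loss between two augmented views."""
    z = torch.cat([z1, z2], dim=0)
    sim = (z @ z.t()) / tau
    sim.fill_diagonal_(float("-inf"))                 # mask self-similarity
    n = z1.size(0)
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)

def diversity_penalty(zs):
    """Toy diversity term: penalize pairwise similarity of member embeddings."""
    pairs = [(zs[i] * zs[j]).sum(dim=-1).mean()
             for i in range(len(zs)) for j in range(i + 1, len(zs))]
    return sum(pairs) / max(len(pairs), 1)

def training_step(model, view1, view2, lam=0.1):
    zs1, zs2 = model(view1), model(view2)             # per-member embeddings
    ssl = sum(nt_xent(a, b) for a, b in zip(zs1, zs2)) / len(zs1)
    return ssl + lam * diversity_penalty(zs1)
```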
Related papers
- Dynamic Post-Hoc Neural Ensemblers [55.15643209328513]
In this study, we explore employing neural networks as ensemble methods.
Motivated by the risk of learning low-diversity ensembles, we propose regularizing the model by randomly dropping base model predictions.
We demonstrate that this approach lower-bounds the diversity within the ensemble, reducing overfitting and improving generalization capabilities.
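A minimal, hypothetical sketch of the prediction-dropping idea (the combiner architecture and drop probability are assumptions, not the paper's code):
```python
import torch
import torch.nn as nn

class NeuralEnsembler(nn.Module):
    """Learns to combine frozen base-model predictions (illustrative)."""
    def __init__(self, n_models, n_classes, drop_prob=0.3):
        super().__init__()
        self.drop_prob = drop_prob
        self.combiner = nn.Linear(n_models * n_classes, n_classes)

    def forward(self, base_preds):                    # (batch, n_models, n_classes)
        if self.training:
            # randomly drop whole base models as a regularizer
            keep = (torch.rand(base_preds.size(1), device=base_preds.device)
                    > self.drop_prob).float()
            base_preds = base_preds * keep.view(1, -1, 1)
        return self.combiner(base_preds.flatten(1))
```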
arXiv Detail & Related papers (2024-10-06T15:25:39Z)
- LoRA-Ensemble: Efficient Uncertainty Modelling for Self-attention Networks [52.46420522934253]
We introduce LoRA-Ensemble, a parameter-efficient deep ensemble method for self-attention networks.
By employing a single pre-trained self-attention network with weights shared across all members, we train member-specific low-rank matrices for the attention projections.
Our method exhibits superior calibration compared to explicit ensembles and achieves similar or better accuracy across various prediction tasks and datasets.
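A hypothetical sketch of member-specific low-rank updates on a shared, frozen projection (dimensions and initialization are illustrative assumptions, not the LoRA-Ensemble implementation):
```python
import torch
import torch.nn as nn

class LoRAEnsembleLinear(nn.Module):
    """One frozen shared projection plus per-member low-rank updates (illustrative)."""
    def __init__(self, in_dim, out_dim, n_members=4, rank=8):
        super().__init__()
        self.shared = nn.Linear(in_dim, out_dim)      # stands in for a pre-trained weight
        for p in self.shared.parameters():
            p.requires_grad_(False)                   # shared weights stay frozen
        self.A = nn.Parameter(torch.randn(n_members, in_dim, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(n_members, rank, out_dim))

    def forward(self, x, member):                     # x: (batch, in_dim)
        delta = self.A[member] @ self.B[member]       # low-rank (in_dim, out_dim) update
        return self.shared(x) + x @ delta
```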
arXiv Detail & Related papers (2024-05-23T11:10:32Z)
- Towards Improving Robustness Against Common Corruptions using Mixture of Class Specific Experts [10.27974860479791]
This paper introduces a novel paradigm known as the Mixture of Class-Specific Expert Architecture.
The proposed architecture aims to mitigate vulnerabilities associated with common neural network structures.
arXiv Detail & Related papers (2023-11-16T20:09:47Z)
- Regularization Through Simultaneous Learning: A Case Study on Plant Classification [0.0]
This paper introduces Simultaneous Learning, a regularization approach drawing on principles of Transfer Learning and Multi-task Learning.
We leverage auxiliary datasets alongside the target dataset, UFOP-HVD, to facilitate simultaneous classification guided by a customized loss function.
Remarkably, our approach demonstrates superior performance over models without regularization.
arXiv Detail & Related papers (2023-05-22T19:44:57Z)
- Joint Training of Deep Ensembles Fails Due to Learner Collusion [61.557412796012535]
Ensembles of machine learning models have been well established as a powerful method of improving performance over a single model.
Traditionally, ensembling algorithms train their base learners independently or sequentially with the goal of optimizing their joint performance.
We observe that directly minimizing the loss of the ensemble appears to rarely be applied in practice.
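For clarity, the two objectives contrasted here can be sketched as follows (illustrative only, not the paper's code):
```python
import torch
import torch.nn.functional as F

def independent_loss(member_logits, y):
    """Average of per-member losses, as in standard deep ensembles."""
    return sum(F.cross_entropy(l, y) for l in member_logits) / len(member_logits)

def joint_loss(member_logits, y):
    """Loss of the aggregated ensemble prediction (rarely used in practice)."""
    return F.cross_entropy(torch.stack(member_logits).mean(dim=0), y)
```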
arXiv Detail & Related papers (2023-01-26T18:58:07Z)
- HCE: Improving Performance and Efficiency with Heterogeneously Compressed Neural Network Ensemble [22.065904428696353]
Recent ensemble training methods explore different training algorithms or settings on multiple sub-models with the same model architecture.
We propose Heterogeneously Compressed Ensemble (HCE), where we build an efficient ensemble with the pruned and quantized variants from a pretrained DNN model.
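An illustrative sketch of building such a heterogeneous ensemble from one pretrained model using off-the-shelf pruning and dynamic quantization (the member mix and sparsity level are assumptions, not the HCE recipe):
```python
import copy
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

def build_members(pretrained: nn.Module):
    """Derive pruned and dynamically quantized variants from one pretrained model."""
    pruned = copy.deepcopy(pretrained)
    for m in pruned.modules():
        if isinstance(m, nn.Linear):
            prune.l1_unstructured(m, name="weight", amount=0.5)   # sparsify weights
    quantized = torch.quantization.quantize_dynamic(
        copy.deepcopy(pretrained), {nn.Linear}, dtype=torch.qint8)
    return [pretrained, pruned, quantized]

def ensemble_predict(members, x):
    with torch.no_grad():
        return torch.stack([m(x).softmax(dim=-1) for m in members]).mean(dim=0)
```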
arXiv Detail & Related papers (2023-01-18T21:47:05Z)
- Deep Negative Correlation Classification [82.45045814842595]
Existing deep ensemble methods naively train many different models and then aggregate their predictions.
We propose deep negative correlation classification (DNCC).
DNCC yields a deep classification ensemble where the individual estimator is both accurate and negatively correlated.
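A hypothetical sketch of a negative-correlation-style training loss for a classification ensemble (the exact penalty used by DNCC may differ):
```python
import torch
import torch.nn.functional as F

def negative_correlation_loss(member_logits, targets, lam=0.5):
    """Per-member accuracy plus a penalty pushing members away from the ensemble mean."""
    probs = [F.softmax(l, dim=-1) for l in member_logits]
    mean = torch.stack(probs).mean(dim=0)
    ce = sum(F.cross_entropy(l, targets) for l in member_logits) / len(member_logits)
    # classic negative-correlation penalty: -(p_i - p_bar)^2, averaged over members
    nc = -sum(((p - mean) ** 2).sum(dim=-1).mean() for p in probs) / len(probs)
    return ce + lam * nc
```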
arXiv Detail & Related papers (2022-12-14T07:35:20Z)
- FiLM-Ensemble: Probabilistic Deep Learning via Feature-wise Linear Modulation [69.34011200590817]
We introduce FiLM-Ensemble, a deep, implicit ensemble method based on the concept of Feature-wise Linear Modulation.
By modulating the network activations of a single deep network with FiLM, one obtains a model ensemble with high diversity.
We show that FiLM-Ensemble outperforms other implicit ensemble methods, and it comes very close to the upper bound of an explicit ensemble of networks.
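A minimal sketch of a FiLM-style implicit ensemble layer, where one set of shared weights is modulated by member-specific scale and shift parameters (illustrative, not the paper's code):
```python
import torch
import torch.nn as nn

class FiLMEnsembleLayer(nn.Module):
    """Shared weights modulated by member-specific FiLM parameters (illustrative)."""
    def __init__(self, in_dim, out_dim, n_members=4):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)      # shared across all members
        self.gamma = nn.Parameter(torch.ones(n_members, out_dim))   # per-member scale
        self.beta = nn.Parameter(torch.zeros(n_members, out_dim))   # per-member shift

    def forward(self, x, member):
        return self.gamma[member] * self.linear(x) + self.beta[member]

# Averaging outputs over the members yields the implicit ensemble prediction, e.g.:
# logits = torch.stack([net(x, m) for m in range(n_members)]).mean(dim=0)
```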
arXiv Detail & Related papers (2022-05-31T18:33:15Z)
- Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
- Efficient Facial Feature Learning with Wide Ensemble-based Convolutional Neural Networks [20.09586211332088]
We present experiments on Ensembles with Shared Representations (ESRs) based on convolutional networks.
We show that redundancy and computational load can be dramatically reduced by varying the branching level of the ESR.
Experiments on large-scale datasets suggest that ESRs reduce the residual generalization error.
arXiv Detail & Related papers (2020-01-17T14:32:27Z)