FiLM-Ensemble: Probabilistic Deep Learning via Feature-wise Linear
Modulation
- URL: http://arxiv.org/abs/2206.00050v1
- Date: Tue, 31 May 2022 18:33:15 GMT
- Title: FiLM-Ensemble: Probabilistic Deep Learning via Feature-wise Linear
Modulation
- Authors: Mehmet Ozgur Turkoglu, Alexander Becker, Hüseyin Anil Gündüz,
Mina Rezaei, Bernd Bischl, Rodrigo Caye Daudt, Stefano D'Aronco, Jan Dirk
Wegner, Konrad Schindler
- Abstract summary: We introduce FiLM-Ensemble, a deep, implicit ensemble method based on the concept of Feature-wise Linear Modulation.
By modulating the network activations of a single deep network with FiLM, one obtains a model ensemble with high diversity.
We show that FiLM-Ensemble outperforms other implicit ensemble methods, and it comes very close to the upper bound of an explicit ensemble of networks.
- Score: 69.34011200590817
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The ability to estimate epistemic uncertainty is often crucial when deploying
machine learning in the real world, but modern methods often produce
overconfident, uncalibrated uncertainty predictions. A common approach to
quantify epistemic uncertainty, usable across a wide class of prediction
models, is to train a model ensemble. In a naive implementation, the ensemble
approach has high computational cost and high memory demand. This is
especially challenging for modern deep learning, where even a single deep
network is already demanding in terms of compute and memory; this has given rise to a number of
attempts to emulate the model ensemble without actually instantiating separate
ensemble members. We introduce FiLM-Ensemble, a deep, implicit ensemble method
based on the concept of Feature-wise Linear Modulation (FiLM). That technique
was originally developed for multi-task learning, with the aim of decoupling
different tasks. We show that the idea can be extended to uncertainty
quantification: by modulating the network activations of a single deep network
with FiLM, one obtains a model ensemble with high diversity, and consequently
well-calibrated estimates of epistemic uncertainty, with low computational
overhead in comparison. Empirically, FiLM-Ensemble outperforms other implicit
ensemble methods, and it comes very close to the upper bound of an explicit
ensemble of networks (sometimes even beating it), at a fraction of the memory
cost.
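For intuition, here is a minimal PyTorch sketch of the mechanism (illustrative, not the authors' code; layer placement, initialization, and shapes are assumptions): a single shared backbone whose activations are scaled and shifted by member-specific FiLM parameters gamma_m and beta_m.

```python
import torch
import torch.nn as nn

class FiLMEnsembleLayer(nn.Module):
    """Feature-wise linear modulation with one (gamma, beta) pair per member."""
    def __init__(self, num_members: int, num_channels: int):
        super().__init__()
        # Member-specific affine parameters: the only extra weights per member.
        self.gamma = nn.Parameter(torch.ones(num_members, num_channels))
        self.beta = nn.Parameter(torch.zeros(num_members, num_channels))
        self.num_members = num_members

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_members * batch, C, H, W), replicas grouped by member.
        m = self.num_members
        _, c, h, w = x.shape
        x = x.view(m, -1, c, h, w)
        x = self.gamma.view(m, 1, c, 1, 1) * x + self.beta.view(m, 1, c, 1, 1)
        return x.view(-1, c, h, w)

# Usage: replicate the batch once per member and run the shared network once;
# the spread of the per-member predictions estimates epistemic uncertainty.
M, B = 4, 8
net = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1),
                    FiLMEnsembleLayer(M, 16), nn.ReLU())
x = torch.randn(B, 3, 32, 32)
out = net(x.repeat(M, 1, 1, 1)).view(M, B, 16, 32, 32)
```

Since only the gamma/beta vectors are duplicated, the memory overhead per member is a few parameters per channel rather than a full network copy.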
Related papers
- Amortized Bayesian Multilevel Models [9.831471158899644]
Multilevel models (MLMs) are a central building block of the Bayesian workflow.
MLMs pose significant computational challenges, often rendering their estimation and evaluation intractable within reasonable time constraints.
Recent advances in simulation-based inference offer promising solutions for addressing complex probabilistic models using deep generative networks.
We explore a family of neural network architectures that leverage the probabilistic factorization of multilevel models to facilitate efficient neural network training and subsequent near-instant posterior inference on unseen data sets.
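As a rough illustration of how such a factorization can be mirrored in a network (all names, sizes, and heads below are invented for the sketch, not the paper's architecture): a permutation-invariant summary network compresses each group's observations, a local head amortizes the per-group posterior, and a global head amortizes the shared-parameter posterior from pooled summaries.

```python
import torch
import torch.nn as nn

class TwoLevelAmortizedPosterior(nn.Module):
    def __init__(self, obs_dim=1, summary_dim=32, local_dim=2, global_dim=2):
        super().__init__()
        self.summary = nn.Sequential(nn.Linear(obs_dim, summary_dim), nn.ReLU(),
                                     nn.Linear(summary_dim, summary_dim))
        # Heads emit mean and log-std of Gaussian posterior approximations.
        self.local_head = nn.Linear(summary_dim, 2 * local_dim)
        self.global_head = nn.Linear(summary_dim, 2 * global_dim)

    def forward(self, x):
        # x: (groups, obs_per_group, obs_dim)
        s = self.summary(x).mean(dim=1)         # per-group summary (mean-pooled)
        local_q = self.local_head(s)            # q(theta_j | data_j), one row per group
        global_q = self.global_head(s.mean(0))  # q(phi | all data), pooled over groups
        return local_q, global_q

# After simulation-based training, posteriors for a new data set are a single
# forward pass, which is what makes inference "near-instant".
post = TwoLevelAmortizedPosterior()
local_q, global_q = post(torch.randn(5, 20, 1))  # 5 groups, 20 observations each
```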
arXiv Detail & Related papers (2024-08-23T17:11:04Z)
- Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion [53.33473557562837]
Solving multi-objective optimization problems for large deep neural networks is a challenging task due to the complexity of the loss landscape and the expensive computational cost.
We propose a practical and scalable approach to solve this problem via mixture of experts (MoE) based model fusion.
By ensembling the weights of specialized single-task models, the MoE module can effectively capture the trade-offs between multiple objectives.
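A hedged sketch of the weight-fusion idea (the gating network and all names are assumptions, not the paper's implementation): a small gate maps a preference vector over objectives to mixing coefficients, and the fused model is a convex combination of the single-task experts' weights.

```python
import torch
import torch.nn as nn

def fuse_experts(expert_params, preference, gate):
    """Return one state dict interpolating the experts' state dicts."""
    mix = torch.softmax(gate(preference), dim=-1)  # (num_experts,)
    return {name: sum(w * p[name] for w, p in zip(mix, expert_params))
            for name in expert_params[0]}

# Toy usage: two "experts" (one per objective) sharing one architecture.
base = nn.Linear(4, 2)
experts = [{k: torch.randn_like(v) for k, v in base.state_dict().items()}
           for _ in range(2)]
gate = nn.Linear(2, 2)                      # preference dim -> expert logits
fused = fuse_experts(experts, torch.tensor([0.7, 0.3]), gate)
base.load_state_dict(fused)                 # model for this trade-off point
```

Sweeping the preference vector then traces out an approximation of the Pareto set without retraining a model per trade-off point.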
arXiv Detail & Related papers (2024-06-14T07:16:18Z)
- LoRA-Ensemble: Efficient Uncertainty Modelling for Self-attention Networks [52.46420522934253]
We introduce LoRA-Ensemble, a parameter-efficient deep ensemble method for self-attention networks.
By employing a single pre-trained self-attention network with weights shared across all members, we train member-specific low-rank matrices for the attention projections.
Our method exhibits superior calibration compared to explicit ensembles and achieves similar or better accuracy across various prediction tasks and datasets.
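A minimal sketch of the underlying mechanism under the usual LoRA conventions (illustrative, not the released code): a frozen shared projection plus a member-specific low-rank update A_m B_m, so each member stores only the rank-r factors.

```python
import torch
import torch.nn as nn

class LoRAEnsembleLinear(nn.Module):
    def __init__(self, base: nn.Linear, num_members: int, rank: int = 4):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False      # shared pre-trained weights stay frozen
        d_in, d_out = base.in_features, base.out_features
        self.A = nn.Parameter(torch.randn(num_members, d_in, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(num_members, rank, d_out))  # zero init: delta starts at 0
        self.num_members = num_members

    def forward(self, x):
        # x: (num_members * batch, seq, d_in), replicas grouped by member.
        m = self.num_members
        y = self.base(x)
        xs = x.view(m, -1, x.shape[-2], x.shape[-1])
        delta = torch.einsum('mbsd,mdr,mro->mbso', xs, self.A, self.B)
        return y + delta.reshape_as(y)

layer = LoRAEnsembleLinear(nn.Linear(64, 64), num_members=4)
out = layer(torch.randn(4 * 2, 10, 64))  # 4 members, batch 2, sequence length 10
```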
arXiv Detail & Related papers (2024-05-23T11:10:32Z)
- Diversified Ensemble of Independent Sub-Networks for Robust Self-Supervised Representation Learning [10.784911682565879]
Ensembling a neural network is a widely recognized approach to enhance model performance, estimate uncertainty, and improve robustness in deep supervised learning.
We present a novel self-supervised training regime that leverages an ensemble of independent sub-networks.
Our method efficiently builds a sub-model ensemble with high diversity, leading to well-calibrated estimates of model uncertainty.
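A hedged sketch of the sub-network idea (the heads, losses, and sizes are assumptions; the paper's training regime is more elaborate): a shared backbone feeds several independent, lightweight heads, and the disagreement among heads serves as an uncertainty signal.

```python
import torch
import torch.nn as nn

class SubNetworkEnsemble(nn.Module):
    def __init__(self, in_dim=32, feat_dim=128, proj_dim=64, num_heads=5):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())
        self.heads = nn.ModuleList(
            nn.Sequential(nn.Linear(feat_dim, proj_dim), nn.ReLU(),
                          nn.Linear(proj_dim, proj_dim))
            for _ in range(num_heads))

    def forward(self, x):
        z = self.backbone(x)
        # Each head would be trained with its own self-supervised objective.
        return torch.stack([h(z) for h in self.heads])  # (heads, batch, proj)

model = SubNetworkEnsemble()
outs = model(torch.randn(16, 32))
uncertainty = outs.var(dim=0).mean(dim=-1)  # per-sample head disagreement
```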
arXiv Detail & Related papers (2023-08-28T16:58:44Z)
- Dynamic Mixed Membership Stochastic Block Model for Weighted Labeled Networks [3.5450828190071655]
A new family of Mixed Membership Stochastic Block Models (MMSBM) makes it possible to model static labeled networks under the assumption of mixed-membership clustering.
We show that our method differs significantly from existing approaches and makes it possible to model more complex systems: dynamic labeled networks.
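For reference, a hedged generative sketch of a static labeled MMSBM (the paper's dynamic variant additionally lets these quantities evolve over time; all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, n_blocks, n_labels = 10, 3, 4
theta = rng.dirichlet(np.ones(n_blocks), size=n_nodes)            # node memberships
B = rng.dirichlet(np.ones(n_labels), size=(n_blocks, n_blocks))   # label distributions

def sample_edge_label(i: int, j: int) -> int:
    k = rng.choice(n_blocks, p=theta[i])   # sender's block for this interaction
    l = rng.choice(n_blocks, p=theta[j])   # receiver's block for this interaction
    return rng.choice(n_labels, p=B[k, l])

label = sample_edge_label(0, 1)
```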
arXiv Detail & Related papers (2023-04-12T15:01:03Z)
- Uncertainty Estimation by Fisher Information-based Evidential Deep Learning [61.94125052118442]
Uncertainty estimation is a key factor that makes deep learning reliable in practical applications.
We propose a novel method, Fisher Information-based Evidential Deep Learning ($\mathcal{I}$-EDL).
In particular, we introduce the Fisher Information Matrix (FIM) to measure the informativeness of the evidence carried by each sample, according to which we can dynamically reweight the objective loss terms to make the network focus more on the representation learning of uncertain classes.
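A simplified, hedged sketch of that reweighting (the paper's exact objective differs; this only illustrates using the Dirichlet's diagonal Fisher information, trigamma(alpha_k) - trigamma(alpha_0), as per-class weights):

```python
import torch
import torch.nn.functional as F

def fisher_weighted_mse(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    # targets: one-hot, shape (batch, classes)
    alpha = F.softplus(logits) + 1.0                 # Dirichlet concentration
    alpha0 = alpha.sum(dim=-1, keepdim=True)
    prob = alpha / alpha0                            # expected class probability
    # Diagonal of the Dirichlet Fisher information matrix (trigamma terms).
    fisher_diag = torch.polygamma(1, alpha) - torch.polygamma(1, alpha0)
    per_class = (targets - prob) ** 2                # evidential MSE-style term
    return (fisher_diag.detach() * per_class).sum(dim=-1).mean()

loss = fisher_weighted_mse(torch.randn(8, 10),
                           F.one_hot(torch.arange(8), 10).float())
```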
arXiv Detail & Related papers (2023-03-03T16:12:59Z)
- Sequential Bayesian Neural Subnetwork Ensembles [4.6354120722975125]
We propose an approach for sequential ensembling of dynamic Bayesian neural subnetworks that consistently maintains reduced model complexity throughout the training process.
Our proposed approach outperforms traditional dense and sparse deterministic and Bayesian ensemble models in terms of prediction accuracy, uncertainty estimation, out-of-distribution detection, and adversarial robustness.
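A generic, hedged sketch of sequential subnetwork ensembling in this spirit (the paper's Bayesian and dynamic-sparsity machinery is more involved; the magnitude-pruning rule below is an assumption): one training run periodically snapshots a sparse subnetwork, and the snapshots are averaged at test time.

```python
import copy
import torch
import torch.nn as nn

def sparse_snapshot(model: nn.Module, keep: float = 0.2) -> nn.Module:
    """Copy the model and zero all but the largest-magnitude weights."""
    snap = copy.deepcopy(model)
    for p in snap.parameters():
        if p.dim() > 1:
            k = int(p.numel() * keep)
            thresh = p.abs().flatten().kthvalue(p.numel() - k).values
            p.data.mul_((p.abs() > thresh).float())
    return snap

model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 3))
snapshots = []
# ... inside the training loop, every few epochs:
snapshots.append(sparse_snapshot(model))
# At test time, average the member predictions.
x = torch.randn(4, 8)
pred = torch.stack([s(x).softmax(-1) for s in snapshots]).mean(0)
```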
arXiv Detail & Related papers (2022-06-01T22:57:52Z)
- Provable Multi-Objective Reinforcement Learning with Generative Models [98.19879408649848]
We study the problem of single-policy MORL, which learns an optimal policy given a preference over objectives.
Existing methods require strong assumptions such as exact knowledge of the multi-objective decision process.
We propose a new algorithm called model-based envelope value iteration (EVI), which generalizes the enveloped multi-objective $Q$-learning algorithm.
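For intuition, a hedged tabular sketch of an envelope-style backup with a known model, for a single preference vector (EVI itself is more general; the toy problem and scalarization are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
S, A, D = 5, 3, 2                            # states, actions, objectives
P = rng.dirichlet(np.ones(S), size=(S, A))   # known transition model
R = rng.random((S, A, D))                    # vector-valued rewards
gamma, w = 0.9, np.array([0.6, 0.4])         # discount, preference weights

Q = np.zeros((S, A, D))                      # vector-valued Q function
for _ in range(200):                         # value-iteration sweeps
    best_a = (Q @ w).argmax(axis=1)          # greedy under scalarization w . Q
    V = Q[np.arange(S), best_a]              # (S, D) vector state value
    Q = R + gamma * np.einsum('sat,td->sad', P, V)

policy = (Q @ w).argmax(axis=1)              # policy for this preference
```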
arXiv Detail & Related papers (2020-11-19T22:35:31Z)
- Theoretical Convergence of Multi-Step Model-Agnostic Meta-Learning [63.64636047748605]
We develop a new theoretical framework that provides convergence guarantees for the general multi-step MAML algorithm.
In particular, our results suggest that the inner-stage step size needs to be chosen inversely proportional to the number $N$ of inner-stage steps in order for $N$-step MAML to have guaranteed convergence.
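As a concrete reading of that scaling, a minimal sketch of an inner loop whose step size follows the 1/N rule (the constant c, the loss, and the use of torch.func are illustrative):

```python
import torch
import torch.nn as nn

def inner_adapt(model: nn.Module, x, y, N: int, c: float = 0.5):
    alpha = c / N                          # inner step size ~ 1/N per the theory
    loss_fn = nn.MSELoss()
    names = [n for n, _ in model.named_parameters()]
    params = [p.clone() for p in model.parameters()]
    for _ in range(N):
        out = torch.func.functional_call(model, dict(zip(names, params)), (x,))
        grads = torch.autograd.grad(loss_fn(out, y), params, create_graph=True)
        params = [p - alpha * g for p, g in zip(params, grads)]
    return params                          # adapted parameters, graph kept for outer step

model = nn.Linear(3, 1)
adapted = inner_adapt(model, torch.randn(8, 3), torch.randn(8, 1), N=5)
```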
arXiv Detail & Related papers (2020-02-18T19:17:54Z)