Amortized Bayesian Multilevel Models
- URL: http://arxiv.org/abs/2408.13230v1
- Date: Fri, 23 Aug 2024 17:11:04 GMT
- Title: Amortized Bayesian Multilevel Models
- Authors: Daniel Habermann, Marvin Schmitt, Lars Kühmichel, Andreas Bulling, Stefan T. Radev, Paul-Christian Bürkner,
- Abstract summary: Multilevel models (MLMs) are a central building block of the Bayesian workflow.
MLMs pose significant computational challenges, often rendering their estimation and evaluation intractable within reasonable time constraints.
Recent advances in simulation-based inference offer promising solutions for addressing complex probabilistic models using deep generative networks.
We explore a family of neural network architectures that leverage the probabilistic factorization of multilevel models to facilitate efficient neural network training and subsequent near-instant posterior inference on unseen data sets.
- Score: 9.831471158899644
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Multilevel models (MLMs) are a central building block of the Bayesian workflow. They enable joint, interpretable modeling of data across hierarchical levels and provide a fully probabilistic quantification of uncertainty. Despite their well-recognized advantages, MLMs pose significant computational challenges, often rendering their estimation and evaluation intractable within reasonable time constraints. Recent advances in simulation-based inference offer promising solutions for addressing complex probabilistic models using deep generative networks. However, the utility and reliability of deep learning methods for estimating Bayesian MLMs remains largely unexplored, especially when compared with gold-standard samplers. To this end, we explore a family of neural network architectures that leverage the probabilistic factorization of multilevel models to facilitate efficient neural network training and subsequent near-instant posterior inference on unseen data sets. We test our method on several real-world case studies and provide comprehensive comparisons to Stan as a gold-standard method where possible. Finally, we provide an open-source implementation of our methods to stimulate further research in the nascent field of amortized Bayesian inference.
Related papers
- The Misclassification Likelihood Matrix: Some Classes Are More Likely To Be Misclassified Than Others [1.654278807602897]
This study introduces Misclassification Likelihood Matrix (MLM) as a novel tool for quantifying the reliability of neural network predictions under distribution shifts.
The implications of this work extend beyond image classification, with ongoing applications in autonomous systems, such as self-driving cars.
arXiv Detail & Related papers (2024-07-10T16:43:14Z) - On Least Square Estimation in Softmax Gating Mixture of Experts [78.3687645289918]
We investigate the performance of the least squares estimators (LSE) under a deterministic MoE model.
We establish a condition called strong identifiability to characterize the convergence behavior of various types of expert functions.
Our findings have important practical implications for expert selection.
arXiv Detail & Related papers (2024-02-05T12:31:18Z) - Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions.
We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training.
As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z) - Amortised Inference in Bayesian Neural Networks [0.0]
We introduce the Amortised Pseudo-Observation Variational Inference Bayesian Neural Network (APOVI-BNN)
We show that the amortised inference is of similar or better quality to those obtained through traditional variational inference.
We then discuss how the APOVI-BNN may be viewed as a new member of the neural process family.
arXiv Detail & Related papers (2023-09-06T14:02:33Z) - Beyond Deep Ensembles: A Large-Scale Evaluation of Bayesian Deep
Learning under Distribution Shift [19.945634052291542]
We evaluate modern BDL algorithms on real-world datasets from the WILDS collection containing challenging classification and regression tasks.
We compare the algorithms on a wide range of large, convolutional and transformer-based neural network architectures.
We provide the first systematic evaluation of BDL for fine-tuning large pre-trained models.
arXiv Detail & Related papers (2023-06-21T14:36:03Z) - Validation Diagnostics for SBI algorithms based on Normalizing Flows [55.41644538483948]
This work proposes easy to interpret validation diagnostics for multi-dimensional conditional (posterior) density estimators based on NF.
It also offers theoretical guarantees based on results of local consistency.
This work should help the design of better specified models or drive the development of novel SBI-algorithms.
arXiv Detail & Related papers (2022-11-17T15:48:06Z) - BayesCap: Bayesian Identity Cap for Calibrated Uncertainty in Frozen
Neural Networks [50.15201777970128]
We propose BayesCap that learns a Bayesian identity mapping for the frozen model, allowing uncertainty estimation.
BayesCap is a memory-efficient method that can be trained on a small fraction of the original dataset.
We show the efficacy of our method on a wide variety of tasks with a diverse set of architectures.
arXiv Detail & Related papers (2022-07-14T12:50:09Z) - FiLM-Ensemble: Probabilistic Deep Learning via Feature-wise Linear
Modulation [69.34011200590817]
We introduce FiLM-Ensemble, a deep, implicit ensemble method based on the concept of Feature-wise Linear Modulation.
By modulating the network activations of a single deep network with FiLM, one obtains a model ensemble with high diversity.
We show that FiLM-Ensemble outperforms other implicit ensemble methods, and it comes very close to the upper bound of an explicit ensemble of networks.
arXiv Detail & Related papers (2022-05-31T18:33:15Z) - Impact of Parameter Sparsity on Stochastic Gradient MCMC Methods for
Bayesian Deep Learning [15.521736934292354]
We investigate the potential of sparse network structures to flexibly trade-off storage costs and inference run time.
We show that certain classes of randomly selected substructures can perform as well as substructures derived from state-of-the-art iterative pruning methods.
arXiv Detail & Related papers (2022-02-08T10:34:05Z) - Amortized Bayesian Inference for Models of Cognition [0.1529342790344802]
Recent advances in simulation-based inference using specialized neural network architectures circumvent many previous problems of approximate Bayesian computation.
We provide a general introduction to amortized Bayesian parameter estimation and model comparison.
arXiv Detail & Related papers (2020-05-08T08:12:15Z) - Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.