Machine Learning and the Future of Bayesian Computation
- URL: http://arxiv.org/abs/2304.11251v1
- Date: Fri, 21 Apr 2023 21:03:01 GMT
- Title: Machine Learning and the Future of Bayesian Computation
- Authors: Steven Winter, Trevor Campbell, Lizhen Lin, Sanvesh Srivastava, David
B. Dunson
- Abstract summary: We discuss the potential to improve posterior computation using ideas from machine learning.
Concrete future directions are explored in vignettes on normalizing flows, Bayesian coresets, distributed Bayesian inference, and variational inference.
- Score: 15.863162558281614
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bayesian models are a powerful tool for studying complex data, allowing the
analyst to encode rich hierarchical dependencies and leverage prior
information. Most importantly, they facilitate a complete characterization of
uncertainty through the posterior distribution. Practical posterior computation
is commonly performed via MCMC, which can be computationally infeasible for
high dimensional models with many observations. In this article we discuss the
potential to improve posterior computation using ideas from machine learning.
Concrete future directions are explored in vignettes on normalizing flows,
Bayesian coresets, distributed Bayesian inference, and variational inference.
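As a concrete illustration of the variational-inference direction, here is a minimal sketch of reparameterized mean-field VI against a toy 2-D Gaussian target; the target, the variational family, and all names are illustrative, not taken from the paper.

```python
# Minimal sketch: mean-field Gaussian VI with the reparameterization trick.
# The target is a toy correlated 2-D Gaussian, standing in for a posterior.
import torch

torch.manual_seed(0)

# Unnormalized log-posterior of a correlated 2-D Gaussian (stand-in target).
prec = torch.tensor([[2.0, -1.0], [-1.0, 2.0]])
def log_post(z):
    return -0.5 * torch.einsum("bi,ij,bj->b", z, prec, z)

# Variational family q(z) = N(mu, diag(exp(log_sigma)^2)).
mu = torch.zeros(2, requires_grad=True)
log_sigma = torch.zeros(2, requires_grad=True)
opt = torch.optim.Adam([mu, log_sigma], lr=0.05)

for step in range(2000):
    eps = torch.randn(64, 2)                 # reparameterization trick
    z = mu + eps * log_sigma.exp()
    # ELBO = E_q[log p(z)] + entropy of q (up to additive constants).
    elbo = log_post(z).mean() + log_sigma.sum()
    opt.zero_grad()
    (-elbo).backward()
    opt.step()

print("variational mean:", mu.detach(), "stddev:", log_sigma.exp().detach())
```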
Related papers
- Computation-Aware Gaussian Processes: Model Selection And Linear-Time Inference [55.150117654242706]
We show that model selection for computation-aware GPs trained on 1.8 million data points can be done within a few hours on a single GPU.
As a result of this work, Gaussian processes can be trained on large-scale datasets without significantly compromising their ability to quantify uncertainty.
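For context, exact GP model selection maximizes the log marginal likelihood below, whose Cholesky factorization costs O(n^3) in the number of data points; this textbook computation (not the paper's method) is the bottleneck that computation-aware GPs are designed to avoid.

```python
# Textbook GP log marginal likelihood; the Cholesky step is the O(n^3) cost.
import numpy as np

def gp_log_marginal_likelihood(X, y, lengthscale, signal_var, noise_var):
    """log p(y | X) for a zero-mean GP with an RBF kernel."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = signal_var * np.exp(-0.5 * sq / lengthscale**2)
    K[np.diag_indices_from(K)] += noise_var
    L = np.linalg.cholesky(K)                       # O(n^3) bottleneck
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K^{-1} y via triangular solves
    return (-0.5 * y @ alpha
            - np.log(np.diag(L)).sum()              # -0.5 * log|K|
            - 0.5 * len(y) * np.log(2 * np.pi))

X = np.random.randn(200, 3); y = np.random.randn(200)
print(gp_log_marginal_likelihood(X, y, 1.0, 1.0, 0.1))
```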
arXiv Detail & Related papers (2024-11-01T21:11:48Z)
- Scalable Bayesian Learning with posteriors [0.856335408411906]
We introduce posteriors, an easily extensible PyTorch library hosting general-purpose implementations of Bayesian learning.
We demonstrate and compare the utility of Bayesian approximations through experiments including an investigation into the cold posterior effect and applications with large language models.
arXiv Detail & Related papers (2024-05-31T18:00:12Z)
- Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
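The tall-data setting rests on conditional independence across observations, which lets the posterior score decompose over single-observation posteriors. A sketch of that identity, with `single_obs_score` and `prior_score` as hypothetical user-supplied callables; the paper's diffusion-specific machinery is not reproduced here.

```python
# grad log p(theta | x_1..n) = sum_i grad log p(theta | x_i) - (n-1) grad log p(theta)
def tall_posterior_score(theta, observations, single_obs_score, prior_score):
    n = len(observations)
    score = sum(single_obs_score(theta, x) for x in observations)
    return score - (n - 1) * prior_score(theta)

# Toy check: prior N(0,1), unit-variance Gaussian likelihood, so the joint
# posterior score is sum(x) - (n+1)*theta.
import numpy as np
obs = np.array([0.5, 1.0, 1.5])
per_obs = lambda th, x: (x - th) - th     # grad log p(th | x_i)
prior = lambda th: -th                    # grad log p(th)
print(tall_posterior_score(0.2, obs, per_obs, prior))  # 3.0 - 4*0.2 = 2.2
```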
arXiv Detail & Related papers (2024-04-11T09:23:36Z)
- Prospector Heads: Generalized Feature Attribution for Large Models & Data [82.02696069543454]
We introduce prospector heads, an efficient and interpretable alternative to explanation-based attribution methods.
We demonstrate how prospector heads enable improved interpretation and discovery of class-specific patterns in input data.
arXiv Detail & Related papers (2024-02-18T23:01:28Z)
- Surprisal Driven $k$-NN for Robust and Interpretable Nonparametric Learning [1.4293924404819704]
We shed new light on the traditional nearest neighbors algorithm from the perspective of information theory.
We propose a robust and interpretable framework for tasks such as classification, regression, density estimation, and anomaly detection using a single model.
Our work showcases the architecture's versatility by achieving state-of-the-art results in classification and anomaly detection.
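One way to make the information-theoretic reading concrete: a k-NN density estimate assigns each query a surprisal -log p(x), which can serve directly as an anomaly score. The following is a generic illustration of that viewpoint, not the paper's exact model.

```python
# k-NN density estimate p_hat(x) = k / (n * V_d * r_k^d), surprisal = -log p_hat.
import numpy as np
from math import gamma, pi

def knn_surprisal(train, queries, k=5):
    n, d = train.shape
    dists = np.sqrt(((queries[:, None, :] - train[None, :, :]) ** 2).sum(-1))
    r_k = np.sort(dists, axis=1)[:, k - 1]         # distance to k-th neighbor
    unit_ball = pi ** (d / 2) / gamma(d / 2 + 1)   # volume of the unit d-ball
    p_hat = k / (n * unit_ball * r_k ** d)         # k-NN density estimate
    return -np.log(p_hat)                          # surprisal: high = anomalous

rng = np.random.default_rng(0)
train = rng.normal(size=(500, 2))
queries = np.array([[0.0, 0.0], [6.0, 6.0]])
print(knn_surprisal(train, queries, k=10))         # the outlier scores higher
```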
arXiv Detail & Related papers (2023-11-17T00:35:38Z)
- Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood Estimation for Latent Gaussian Models [69.22568644711113]
We introduce probabilistic unrolling, a method that combines Monte Carlo sampling with iterative linear solvers to circumvent matrix inversions.
Our theoretical analyses reveal that unrolling and backpropagation through the iterations of the solver can accelerate gradient estimation for maximum likelihood estimation.
In experiments on simulated and real data, we demonstrate that probabilistic unrolling learns latent Gaussian models up to an order of magnitude faster than gradient EM, with minimal losses in model performance.
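The core trick can be sketched independently of the full estimator: replace an explicit matrix inverse with a few unrolled conjugate-gradient iterations and let autograd backpropagate through them. A minimal, generic version, not the paper's complete method:

```python
# Differentiable conjugate gradients: every op is a tensor op, so autograd
# unrolls through the solver iterations instead of differentiating an inverse.
import torch

def cg_solve(A, b, iters=20):
    """Approximately solve A x = b for symmetric positive definite A."""
    x = torch.zeros_like(b)
    r = b - A @ x
    p = r.clone()
    rs = r @ r
    for _ in range(iters):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Toy use: gradient of b^T A^{-1} b w.r.t. a parameter of A, no inverse formed.
theta = torch.tensor(1.5, requires_grad=True)
A = theta * torch.eye(50) + 0.1 * torch.ones(50, 50)
b = torch.randn(50)
loss = b @ cg_solve(A, b)
loss.backward()
print(theta.grad)
```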
arXiv Detail & Related papers (2023-06-05T21:08:34Z)
- STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning [111.75423966239092]
We propose an exploration incentive in terms of the integral probability metric (IPM) between a current estimate of the transition model and the unknown optimal one.
Based on KSD, we develop a novel algorithm: STEin information dirEcted exploration for model-based Reinforcement LearnING (STEERING).
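For reference, the kernel Stein discrepancy (KSD) underlying the exploration bonus measures how far a set of samples sits from a target distribution known only through its score function. Below is a textbook V-statistic form with an RBF kernel, not the paper's RL pipeline.

```python
# KSD^2 V-statistic with RBF kernel k(x,y) = exp(-||x-y||^2 / (2 h^2)).
import numpy as np

def ksd(samples, score_fn, h=1.0):
    """Estimate KSD^2 between `samples` and the target with score `score_fn`."""
    X = samples
    S = np.stack([score_fn(x) for x in X])          # (n, d) target scores
    diff = X[:, None, :] - X[None, :, :]            # x_i - x_j
    sq = (diff ** 2).sum(-1)
    K = np.exp(-0.5 * sq / h**2)
    d = X.shape[1]
    term1 = (S @ S.T) * K                                             # s(x)^T s(y) k
    term2 = np.einsum("id,ijd->ij", S, diff / h**2 * K[..., None])    # s(x)^T grad_y k
    term3 = np.einsum("jd,ijd->ij", S, -diff / h**2 * K[..., None])   # s(y)^T grad_x k
    term4 = (d / h**2 - sq / h**4) * K                                # tr(grad_x grad_y k)
    return (term1 + term2 + term3 + term4).mean()

# Example: standard normal target, score(x) = -x.
rng = np.random.default_rng(0)
print(ksd(rng.normal(size=(200, 2)), lambda x: -x))            # near zero
print(ksd(rng.normal(2.0, 1.0, size=(200, 2)), lambda x: -x))  # clearly larger
```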
arXiv Detail & Related papers (2023-01-28T00:49:28Z)
- Non-Volatile Memory Accelerated Posterior Estimation [3.4256231429537936]
Most current machine learning models make predictions with a single learned parameter setting rather than a full posterior over parameters.
We show that high-capacity persistent storage makes it feasible to approximate posterior distributions that were previously too large to store.
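A minimal sketch of the storage idea, using a NumPy memmap as a stand-in for the paper's non-volatile-memory machinery: persist a large bank of posterior parameter samples to disk and stream them back in batches at prediction time, so the sample bank never has to fit in RAM. Shapes and the toy model are illustrative.

```python
# Disk-backed posterior sample bank with streaming posterior-averaged prediction.
import numpy as np

n_samples, dim = 100_000, 128
path = "posterior_samples.npy"

# One-time: write posterior samples (random placeholders here) to disk.
bank = np.lib.format.open_memmap(path, mode="w+", dtype=np.float32,
                                 shape=(n_samples, dim))
bank[:] = np.random.randn(n_samples, dim).astype(np.float32)
bank.flush()

def predict(x, batch=10_000):
    """Average a toy linear predictor over all stored posterior samples."""
    samples = np.load(path, mmap_mode="r")         # memory-mapped, not loaded
    total = 0.0
    for i in range(0, len(samples), batch):
        w = samples[i:i + batch]
        total += (w @ x).sum()                     # toy per-sample prediction
    return total / len(samples)

print(predict(np.random.randn(dim).astype(np.float32)))
```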
arXiv Detail & Related papers (2022-02-21T20:25:57Z)
- $\beta$-Cores: Robust Large-Scale Bayesian Data Summarization in the Presence of Outliers [14.918826474979587]
The quality of classic Bayesian inference depends critically on whether observations conform with the assumed data generating model.
We propose a variational inference method that, in a principled way, can simultaneously scale to large datasets and remain robust to observations that conflict with the assumed model.
We illustrate the applicability of our approach in diverse simulated and real datasets, and various statistical models.
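The coreset idea can be sketched as a weighted surrogate for the full-data log-likelihood, so inference only ever touches m << n points; the paper's robust (beta-divergence-based) weighting and variational construction are not reproduced here.

```python
# Coreset surrogate: sum_i w_i * loglik(theta, x_i) over a small weighted subset.
import numpy as np

def full_loglik(theta, data, loglik):
    return sum(loglik(theta, x) for x in data)

def coreset_loglik(theta, subset, weights, loglik):
    return sum(w * loglik(theta, x) for w, x in zip(weights, subset))

# Toy check: uniform weights n/m on a random subset already track the full sum;
# coreset constructions choose the subset and weights far more carefully.
rng = np.random.default_rng(0)
data = rng.normal(1.0, 1.0, size=1000)
loglik = lambda th, x: -0.5 * (x - th) ** 2
idx = rng.choice(1000, size=50, replace=False)
weights = np.full(50, 1000 / 50)
print(full_loglik(0.5, data, loglik),
      coreset_loglik(0.5, data[idx], weights, loglik))
```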
arXiv Detail & Related papers (2020-08-31T13:47:12Z)
- Scaling Bayesian inference of mixed multinomial logit models to very large datasets [9.442139459221785]
We propose an Amortized Variational Inference approach that leverages backpropagation, automatic differentiation and GPU-accelerated computation.
We show how normalizing flows can be used to increase the flexibility of the variational posterior approximations.
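A minimal planar-flow sketch of that idea (in the style of Rezende and Mohamed), showing how a stack of invertible transforms enriches a variational posterior beyond mean-field while keeping the density tractable; this is a generic construction, not the paper's architecture.

```python
# Planar flow f(z) = z + u * tanh(w^T z + b) with tractable log|det Jacobian|.
import torch
import torch.nn as nn

class PlanarFlow(nn.Module):
    def __init__(self, dim):
        super().__init__()
        # Note: the usual constraint on u guaranteeing invertibility (u^T w >= -1)
        # is omitted here for brevity.
        self.u = nn.Parameter(torch.randn(dim) * 0.1)
        self.w = nn.Parameter(torch.randn(dim) * 0.1)
        self.b = nn.Parameter(torch.zeros(1))

    def forward(self, z):
        lin = z @ self.w + self.b                            # (batch,)
        f = z + self.u * torch.tanh(lin)[:, None]
        psi = (1 - torch.tanh(lin) ** 2)[:, None] * self.w   # (batch, dim)
        log_det = torch.log(torch.abs(1 + psi @ self.u) + 1e-8)
        return f, log_det

# Push base samples through the flows, tracking the change-of-variables term.
flows = nn.ModuleList([PlanarFlow(2) for _ in range(4)])
z = torch.randn(128, 2)
log_q = -0.5 * (z ** 2).sum(1)          # base log-density (up to a constant)
for flow in flows:
    z, log_det = flow(z)
    log_q = log_q - log_det
print(z.shape, log_q.shape)             # transformed samples and log-densities
```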
arXiv Detail & Related papers (2020-04-11T15:30:47Z)
- Bayesian Deep Learning and a Probabilistic Perspective of Generalization [56.69671152009899]
We show that deep ensembles provide an effective mechanism for approximate Bayesian marginalization.
We also propose a related approach that further improves the predictive distribution by marginalizing within basins of attraction.
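The marginalization mechanism is easy to state in code: the ensemble's predictive distribution averages each member's softmax output, a crude Monte Carlo estimate of the posterior predictive. A minimal sketch with untrained placeholder networks:

```python
# Deep-ensemble predictive: p(y|x) ~= (1/M) * sum_m softmax(f_m(x)).
import torch
import torch.nn as nn

def make_net():
    return nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 3))

# In practice each member is trained independently from a different random
# initialization; training is omitted here.
ensemble = [make_net() for _ in range(5)]

def predictive(x):
    """Mixture of member predictions -- average probabilities, not logits."""
    with torch.no_grad():
        probs = torch.stack([net(x).softmax(-1) for net in ensemble])
    return probs.mean(0)

x = torch.randn(4, 10)
print(predictive(x))
```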
arXiv Detail & Related papers (2020-02-20T15:13:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.