Beyond Bayesian Model Averaging over Paths in Probabilistic Programs with Stochastic Support
- URL: http://arxiv.org/abs/2310.14888v2
- Date: Fri, 12 Apr 2024 14:36:18 GMT
- Title: Beyond Bayesian Model Averaging over Paths in Probabilistic Programs with Stochastic Support
- Authors: Tim Reichelt, Luke Ong, Tom Rainforth
- Abstract summary: We show that making predictions with this full posterior implicitly performs a Bayesian model averaging (BMA) over paths.
We propose alternative mechanisms for path weighting: one based on stacking and one based on ideas from PAC-Bayes.
- Score: 20.53123189114551
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The posterior in probabilistic programs with stochastic support decomposes as a weighted sum of the local posterior distributions associated with each possible program path. We show that making predictions with this full posterior implicitly performs a Bayesian model averaging (BMA) over paths. This is potentially problematic, as BMA weights can be unstable due to model misspecification or inference approximations, leading to sub-optimal predictions in turn. To remedy this issue, we propose alternative mechanisms for path weighting: one based on stacking and one based on ideas from PAC-Bayes. We show how both can be implemented as a cheap post-processing step on top of existing inference engines. In our experiments, we find them to be more robust and lead to better predictions compared to the default BMA weights.
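To make the path-weighting idea concrete, below is a minimal, hypothetical sketch (not the authors' released code) of the stacking route described in the abstract, implemented as a cheap post-processing step: given the log predictive density of each held-out observation under each program path's local posterior, it fits simplex weights that maximise the held-out log score instead of using the default BMA weights. All function and variable names are illustrative assumptions.

```python
# Hypothetical sketch of stacking-based path weighting as post-processing.
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

def stacking_weights(lpd):
    """lpd[i, k] = log p_k(y_i): log predictive density of held-out point i
    under the local posterior of program path k. Returns simplex weights."""
    n, K = lpd.shape

    def softmax(z):
        z = z - z.max()
        e = np.exp(z)
        return e / e.sum()

    def neg_log_score(z):
        w = np.clip(softmax(z), 1e-12, 1.0)
        # Negative held-out log score of the mixture: -sum_i log sum_k w_k p_k(y_i)
        return -logsumexp(lpd + np.log(w), axis=1).sum()

    res = minimize(neg_log_score, np.zeros(K), method="L-BFGS-B")
    return softmax(res.x)

# Toy example: 3 program paths, 50 held-out points (synthetic log densities).
rng = np.random.default_rng(0)
lpd = rng.normal(loc=[-1.0, -2.0, -3.0], scale=0.3, size=(50, 3))
print(stacking_weights(lpd))  # up-weights the path that predicts best
```

In practice the per-path predictive densities would come from whatever samples the underlying inference engine already returns; the PAC-Bayes weighting mentioned in the abstract would swap in a different objective, which this sketch does not attempt to reproduce.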
Related papers
- Divide-and-Conquer Posterior Sampling for Denoising Diffusion Priors [21.0128625037708]
We present an innovative framework, divide-and-conquer posterior sampling.
It reduces the approximation error associated with current techniques without the need for retraining.
We demonstrate the versatility and effectiveness of our approach for a wide range of Bayesian inverse problems.
arXiv Detail & Related papers (2024-03-18T01:47:24Z) - Calibrating Neural Simulation-Based Inference with Differentiable Coverage Probability [50.44439018155837]
We propose to include a calibration term directly into the training objective of the neural model.
By introducing a relaxation of the classical formulation of calibration error we enable end-to-end backpropagation.
It is directly applicable to existing computational pipelines allowing reliable black-box posterior inference.
arXiv Detail & Related papers (2023-10-20T10:20:45Z) - Variational Prediction [95.00085314353436]
We present a technique for learning a variational approximation to the posterior predictive distribution using a variational bound.
This approach can provide good predictive distributions without test time marginalization costs.
arXiv Detail & Related papers (2023-07-14T18:19:31Z) - Latent Optimal Paths by Gumbel Propagation for Variational Bayesian Dynamic Programming [12.249274845167415]
We show the equivalence of the Gibbs distribution to a message-passing algorithm by the properties of the Gumbel distribution.
We propose the BDP-VAE which captures structured sparse optimal paths as latent variables.
arXiv Detail & Related papers (2023-06-05T03:47:59Z) - Fast post-process Bayesian inference with Variational Sparse Bayesian Quadrature [13.36200518068162]
We propose the framework of post-process Bayesian inference as a means to obtain a quick posterior approximation from existing target density evaluations.
Within this framework, we introduce Variational Sparse Bayesian Quadrature (VSBQ), a method for post-process approximate inference for models with black-box and potentially noisy likelihoods.
We validate our method on challenging synthetic scenarios and real-world applications from computational neuroscience.
arXiv Detail & Related papers (2023-03-09T13:58:35Z) - Sample-Efficient Optimisation with Probabilistic Transformer Surrogates [66.98962321504085]
This paper investigates the feasibility of employing state-of-the-art probabilistic transformers in Bayesian optimisation.
We observe two drawbacks stemming from their training procedure and loss definition, hindering their direct deployment as proxies in black-box optimisation.
We introduce two components: 1) a BO-tailored training prior supporting non-uniformly distributed points, and 2) a novel approximate posterior regulariser trading-off accuracy and input sensitivity to filter favourable stationary points for improved predictive performance.
arXiv Detail & Related papers (2022-05-27T11:13:17Z) - Non-Probability Sampling Network for Stochastic Human Trajectory Prediction [16.676008193894223]
Capturing the multimodal nature of pedestrian motion is essential for trajectory prediction.
We introduce the Quasi-Monte Carlo method, which ensures uniform coverage of the sampling space, as an alternative to conventional random sampling.
We take a further step by incorporating a learnable sampling network into existing trajectory prediction networks.
arXiv Detail & Related papers (2022-03-25T06:41:47Z) - Probabilistic Gradient Boosting Machines for Large-Scale Probabilistic Regression [51.770998056563094]
Probabilistic Gradient Boosting Machines (PGBM) is a method to create probabilistic predictions with a single ensemble of decision trees.
We empirically demonstrate the advantages of PGBM compared to existing state-of-the-art methods.
arXiv Detail & Related papers (2021-06-03T08:32:13Z) - Stacking for Non-mixing Bayesian Computations: The Curse and Blessing of Multimodal Posteriors [8.11978827493967]
We propose using parallel runs of MCMC, variational, or mode-based inference to hit as many modes as possible, and then combining them with Bayesian stacking.
We present theoretical consistency, with an example where the stacked inference process approximates the true data-generating process.
We demonstrate practical implementation in several model families.
arXiv Detail & Related papers (2020-06-22T15:26:59Z) - Likelihood-Free Inference with Deep Gaussian Processes [70.74203794847344]
Surrogate models have been successfully used in likelihood-free inference to decrease the number of simulator evaluations.
We propose a Deep Gaussian Process (DGP) surrogate model that can handle more irregularly behaved target distributions.
Our experiments show how DGPs can outperform GPs on objective functions with multimodal distributions and maintain a comparable performance in unimodal cases.
arXiv Detail & Related papers (2020-06-18T14:24:05Z) - Distributionally Robust Bayesian Quadrature Optimization [60.383252534861136]
We study BQO under distributional uncertainty in which the underlying probability distribution is unknown except for a limited set of its i.i.d. samples.
A standard BQO approach maximizes the Monte Carlo estimate of the true expected objective given the fixed sample set.
We propose a novel posterior sampling based algorithm, namely distributionally robust BQO (DRBQO) for this purpose.
arXiv Detail & Related papers (2020-01-19T12:00:33Z)
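For context on the last entry, here is a minimal, hypothetical sketch of the standard (non-robust) Monte Carlo objective that DRBQO contrasts with: maximising the sample average of a black-box function over a fixed i.i.d. sample set. The objective f, the Gaussian samples, and the candidate grid are placeholders, not anything taken from the paper.

```python
# Sketch of the standard BQO surrogate target: maximise the Monte Carlo
# estimate (1/N) * sum_i f(x, xi_i) of E_xi[f(x, xi)] over a fixed sample set.
import numpy as np

def mc_objective(f, x, xi_samples):
    """Monte Carlo estimate of E_xi[f(x, xi)] over the fixed sample set."""
    return np.mean([f(x, xi) for xi in xi_samples])

rng = np.random.default_rng(1)
xi_samples = rng.normal(size=20)            # fixed i.i.d. samples of xi
f = lambda x, xi: -(x - xi) ** 2            # toy black-box objective (placeholder)
candidates = np.linspace(-2.0, 2.0, 41)
best_x = max(candidates, key=lambda x: mc_objective(f, x, xi_samples))
print(best_x)                               # maximiser of the MC estimate
```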