Exact Bayesian Inference on Discrete Models via Probability Generating Functions: A Probabilistic Programming Approach
- URL: http://arxiv.org/abs/2305.17058v3
- Date: Tue, 7 Nov 2023 01:13:10 GMT
- Title: Exact Bayesian Inference on Discrete Models via Probability Generating Functions: A Probabilistic Programming Approach
- Authors: Fabian Zaiser, Andrzej S. Murawski, Luke Ong
- Abstract summary: We present an exact Bayesian inference method for discrete statistical models.
We use a probabilistic programming language that supports discrete and continuous sampling, discrete observations, affine functions, (stochastic) branching, and conditioning on discrete events.
Our inference method is provably correct and fully automated.
- Score: 7.059472280274009
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present an exact Bayesian inference method for discrete statistical
models, which can find exact solutions to a large class of discrete inference
problems, even with infinite support and continuous priors. To express such
models, we introduce a probabilistic programming language that supports
discrete and continuous sampling, discrete observations, affine functions,
(stochastic) branching, and conditioning on discrete events. Our key tool is
probability generating functions: they provide a compact closed-form
representation of distributions that are definable by programs, thus enabling
the exact computation of posterior probabilities, expectation, variance, and
higher moments. Our inference method is provably correct and fully automated in
a tool called Genfer, which uses automatic differentiation (specifically,
Taylor polynomials), but does not require computer algebra. Our experiments
show that Genfer is often faster than the existing exact inference tools PSI,
Dice, and Prodigy. On a range of real-world inference problems that none of
these exact tools can solve, Genfer's performance is competitive with
approximate Monte Carlo methods, while avoiding approximation errors.
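To make the generating-function approach concrete, here is a minimal sketch (my own construction, not Genfer's implementation or API) for a textbook discrete model: a Poisson prior $X \sim \mathrm{Poisson}(\lambda)$ with a binomial observation $Y \mid X \sim \mathrm{Binomial}(X, p)$. Observing $Y = k$ transforms the prior PGF $G(x) = E[x^X]$ into the unnormalized posterior PGF $H(x) = \frac{(px)^k}{k!} G^{(k)}((1-p)x)$, whose normalized Taylor coefficients are exactly the posterior probabilities. The truncation order N and the direct power-series arithmetic (in place of the Taylor-mode automatic differentiation the paper uses) are simplifications for illustration.

```python
import numpy as np
from math import exp, factorial

# Model: X ~ Poisson(lam), observe Y = k where Y | X ~ Binomial(X, p).
lam, p, k, N = 4.0, 0.3, 2, 60  # N = truncation order of the series

# PGF of the prior, G(x) = exp(lam*(x - 1)): coefficient of x^n is P(X = n).
g = np.array([exp(-lam) * lam**n / factorial(n) for n in range(N)])

# Observing Binomial(X, p) = k maps G to the unnormalized posterior PGF
#   H(x) = (p*x)^k / k! * G^(k)((1 - p) * x).
dk = np.array([g[n + k] * factorial(n + k) / factorial(n) for n in range(N - k)])
dk *= (1 - p) ** np.arange(N - k)                            # substitute u = (1-p)*x
h = np.concatenate([np.zeros(k), dk]) * p**k / factorial(k)  # times (p*x)^k / k!

h /= h.sum()  # H(1) = sum of coefficients = P(Y = k); normalizing gives the posterior

# Coefficient n of the normalized series is P(X = n | Y = k).
post_mean = np.dot(np.arange(N), h)
print("P(X = 3 | Y = 2) =", h[3], " E[X | Y = 2] =", post_mean)
```

For this model the posterior is known in closed form ($X \mid Y = k$ is $k$ plus a $\mathrm{Poisson}(\lambda(1-p))$ variable, so $E[X \mid Y = 2] = 4.8$ here), which makes the sketch easy to check against.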
Related papers
- Compactly-supported nonstationary kernels for computing exact Gaussian processes on big data [2.8377382540923004]
We derive an alternative kernel that can discover and encode both sparsity and nonstationarity.
We demonstrate the favorable performance of our novel kernel relative to existing exact and approximate GP methods.
We also conduct space-time prediction based on more than one million measurements of daily maximum temperature. (A generic compactly supported kernel is sketched after this entry.)
arXiv Detail & Related papers (2024-11-07T20:07:21Z)
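The kernel derived in this paper is nonstationary and specific to it, so it is not reproduced here; the sketch below (an assumption-laden stand-in) only illustrates the compact-support mechanism with a standard Wendland kernel, whose exact zeros make the covariance matrix sparse and therefore amenable to sparse exact GP algebra. The domain, support radius, and grid size are arbitrary choices.

```python
import numpy as np

def wendland_c2(r, radius):
    """Wendland C2 kernel k(r) = (1 - r/radius)^4_+ (4 r/radius + 1); exactly zero beyond radius."""
    t = np.clip(1.0 - r / radius, 0.0, None)
    return t**4 * (4.0 * r / radius + 1.0)

x = np.linspace(0.0, 10.0, 2000)[:, None]
r = np.abs(x - x.T)                 # pairwise distances
K = wendland_c2(r, radius=1.5)      # sparse covariance: most entries are exactly 0

# Sparse Cholesky / linear solves on K scale far better than with a dense
# squared-exponential covariance of the same size.
print("fraction of nonzero covariance entries:", (K != 0).mean())
```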
- Computation-Aware Gaussian Processes: Model Selection And Linear-Time Inference [55.150117654242706]
We show that model selection for computation-aware GPs trained on 1.8 million data points can be done within a few hours on a single GPU.
As a result of this work, Gaussian processes can be trained on large-scale datasets without significantly compromising their ability to quantify uncertainty.
arXiv Detail & Related papers (2024-11-01T21:11:48Z)
- User-defined Event Sampling and Uncertainty Quantification in Diffusion Models for Physical Dynamical Systems [49.75149094527068]
We show that diffusion models can be adapted to make predictions and provide uncertainty quantification for chaotic dynamical systems.
We develop a probabilistic approximation scheme for the conditional score function which converges to the true distribution as the noise level decreases.
We are able to sample conditionally on nonlinear user-defined events at inference time, and the resulting samples match data statistics even when sampling from the tails of the distribution. (A toy event-conditioned sampler is sketched after this entry.)
arXiv Detail & Related papers (2023-06-13T03:42:03Z)
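The paper's approximation scheme for the conditional score is not reproduced here; the toy analogue below (my own construction) conditions a standard normal on the nonlinear event $\{x > a\}$ by running Langevin dynamics on an approximate conditional score, where the hard event indicator is smoothed with a sigmoid of temperature tau. As tau shrinks the approximation sharpens, loosely mirroring the paper's convergence as the noise level decreases.

```python
import numpy as np

rng = np.random.default_rng(0)
a, tau, step = 2.0, 0.05, 1e-3

def cond_score(x):
    """Approximate score of p(x | x > a) for x ~ N(0, 1)."""
    prior_score = -x                                   # d/dx log N(x; 0, 1)
    u = np.clip((x - a) / tau, -500.0, 500.0)
    s = 1.0 / (1.0 + np.exp(-u))                       # smoothed event indicator
    event_score = (1.0 - s) / tau                      # d/dx log sigmoid((x - a)/tau)
    return prior_score + event_score

x = rng.standard_normal(10_000) + a                    # initialize near the event
for _ in range(5_000):                                 # unadjusted Langevin steps
    x += step * cond_score(x) + np.sqrt(2 * step) * rng.standard_normal(x.size)

# The exact truncated-normal mean E[x | x > 2] is about 2.373.
print("sample mean:", x.mean(), " fraction in event:", (x > a).mean())
```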
- $\omega$PAP Spaces: Reasoning Denotationally About Higher-Order, Recursive Probabilistic and Differentiable Programs [64.25762042361839]
$\omega$PAP spaces support denotational reasoning about expressive differentiable and probabilistic programming languages.
Our semantics is general enough to assign meanings to most practical probabilistic and differentiable programs.
We establish the almost-everywhere differentiability of probabilistic programs' trace density functions.
arXiv Detail & Related papers (2023-02-21T12:50:05Z)
- Testing distributional assumptions of learning algorithms [5.204779946147061]
We study the design of tester-learner pairs $(\mathcal{A}, \mathcal{T})$.
We show that if the distribution on examples in the data passes the tester $\mathcal{T}$, then one can safely trust the output of the agnostic learner $\mathcal{A}$ on the data.
arXiv Detail & Related papers (2022-04-14T19:10:53Z)
- Probabilistic Gradient Boosting Machines for Large-Scale Probabilistic Regression [51.770998056563094]
Probabilistic Gradient Boosting Machines (PGBM) is a method to create probabilistic predictions with a single ensemble of decision trees.
We empirically demonstrate the advantages of PGBM compared to existing state-of-the-art methods. (A hedged illustration of distributional predictions from a tree ensemble follows this entry.)
arXiv Detail & Related papers (2021-06-03T08:32:13Z)
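PGBM itself learns per-leaf means and variances inside a single ensemble; the sketch below is only a hedged stand-in for the general idea, turning a standard scikit-learn gradient-boosted point predictor into a predictive distribution with a crude homoscedastic noise estimate. All dataset choices and sizes are illustrative.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(500, 1))
y = np.sin(X[:, 0]) + 0.3 * rng.standard_normal(500)

model = GradientBoostingRegressor(n_estimators=200).fit(X, y)
sigma = np.std(y - model.predict(X))   # crude homoscedastic noise estimate

# Probabilistic prediction: sample from N(mu(x), sigma^2) at a new input.
x_new = np.array([[1.0]])
samples = model.predict(x_new) + sigma * rng.standard_normal(1000)
print("predictive mean:", samples.mean(),
      " 95% interval:", np.percentile(samples, [2.5, 97.5]))
```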
- Efficient semidefinite-programming-based inference for binary and multi-class MRFs [83.09715052229782]
We propose an efficient method for computing the partition function or MAP estimate in a pairwise MRF.
We extend semidefinite relaxations from the typical binary MRF to the full multi-class setting, and develop a compact semidefinite relaxation that can again be solved efficiently with the same solver. (A generic binary relaxation is sketched after this entry.)
arXiv Detail & Related papers (2020-12-04T15:36:29Z)
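The paper's compact multi-class relaxation and its solver are not reproduced here; the sketch below shows the classic Goemans-Williamson-style semidefinite relaxation for MAP inference in a binary pairwise MRF ($x_i \in \{-1, +1\}$, objective $x^T W x$), written with cvxpy and followed by random-hyperplane rounding. The toy coupling matrix is an arbitrary choice.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n = 8
W = rng.standard_normal((n, n))
W = (W + W.T) / 2.0                      # symmetric pairwise couplings

# Relax max_x x^T W x over x in {-1,+1}^n by replacing x x^T with a
# PSD matrix X constrained to have unit diagonal.
X = cp.Variable((n, n), symmetric=True)
prob = cp.Problem(cp.Maximize(cp.trace(W @ X)), [X >> 0, cp.diag(X) == 1])
prob.solve()

# Random-hyperplane rounding of the relaxed solution back to a binary labeling.
vals, vecs = np.linalg.eigh(X.value)
V = vecs * np.sqrt(np.clip(vals, 0.0, None))   # factor X = V V^T
x_hat = np.sign(V @ rng.standard_normal(n))
print("relaxation value:", prob.value, " rounded assignment:", x_hat)
```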
- Tractable Inference in Credal Sentential Decision Diagrams [116.6516175350871]
Probabilistic sentential decision diagrams are logic circuits where the inputs of disjunctive gates are annotated by probability values.
We develop credal sentential decision diagrams, a generalisation of their probabilistic counterpart that allows for replacing the local probabilities with credal sets of mass functions.
For a first empirical validation, we consider a simple application based on noisy seven-segment display images.
arXiv Detail & Related papers (2020-08-19T16:04:34Z)
- Quadruply Stochastic Gaussian Processes [10.152838128195466]
We introduce a variational inference procedure for training scalable Gaussian process (GP) models whose per-iteration complexity is independent of both the number of training points, $n$, and the number of basis functions used in the kernel approximation, $m$.
We demonstrate accurate inference on large classification and regression datasets using GPs and relevance vector machines with up to $m = 10^7$ basis functions. (A toy doubly-subsampled sketch follows this entry.)
arXiv Detail & Related papers (2020-06-04T17:06:25Z)
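The paper's variational derivation is not reproduced here; the toy sketch below (my own construction, in the spirit of doubly stochastic gradient methods) shows the core trick that makes per-iteration cost independent of both $n$ and $m$: each update touches only a minibatch of the data and two independent random subsets of the basis functions. The product of two independent unbiased estimates of the prediction error is unbiased for the squared error, which keeps the subsampled gradient unbiased.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, d = 5000, 2000, 1
X = rng.uniform(-3.0, 3.0, (n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)

W = rng.standard_normal((m, d))          # random Fourier features for an RBF kernel
b = rng.uniform(0.0, 2.0 * np.pi, m)

def phi(x, idx):
    """Evaluate only the selected basis functions."""
    return np.sqrt(2.0 / m) * np.cos(x @ W[idx].T + b[idx])

w = np.zeros(m)
lr, batch, s = 0.02, 64, 200
for _ in range(3000):
    i = rng.choice(n, batch, replace=False)     # subsample the data
    S1 = rng.choice(m, s, replace=False)        # two independent subsets
    S2 = rng.choice(m, s, replace=False)        #   of the basis functions
    a1 = (m / s) * phi(X[i], S1) @ w[S1]        # unbiased estimates of f(x)
    a2 = (m / s) * phi(X[i], S2) @ w[S2]
    # Gradient of (a1 - y)(a2 - y), an unbiased squared-error estimate:
    w[S1] -= lr * (m / s) * phi(X[i], S1).T @ (a2 - y[i]) / batch
    w[S2] -= lr * (m / s) * phi(X[i], S2).T @ (a1 - y[i]) / batch

pred = phi(X[:512], np.arange(m)) @ w
print("probe RMSE:", np.sqrt(np.mean((pred - y[:512]) ** 2)))
```

Sampling without replacement keeps the fancy-indexed updates well defined (no duplicate indices), while the $m/s$ scaling preserves unbiasedness of the subset estimates.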
- Stochastically Differentiable Probabilistic Programs [18.971852464650144]
The existence of discrete random variables prohibits the use of many basic gradient-based inference engines.
We present a novel approach to running inference efficiently and robustly in such programs using the Markov chain Monte Carlo family of algorithms.
arXiv Detail & Related papers (2020-03-02T08:04:41Z)
- $\pi$VAE: a stochastic process prior for Bayesian deep learning with MCMC [2.4792948967354236]
We propose a novel variational autoencoder called the prior encoding variational autoencoder ($\pi$VAE).
We show that our framework can accurately learn not only expressive function classes such as Gaussian processes, but also properties of functions that enable statistical inference.
Perhaps most usefully, we demonstrate that the low dimensional distributed latent space representation learnt provides an elegant and scalable means of performing inference for processes within programming languages such as Stan.
arXiv Detail & Related papers (2020-02-17T10:23:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.