Foundation Posteriors for Approximate Probabilistic Inference
- URL: http://arxiv.org/abs/2205.09735v1
- Date: Thu, 19 May 2022 17:42:37 GMT
- Title: Foundation Posteriors for Approximate Probabilistic Inference
- Authors: Mike Wu, Noah Goodman
- Abstract summary: We formulate inference as masked language modeling in a probabilistic program.
We train a neural network to unmask the random values, defining an approximate posterior distribution.
We show the efficacy of the approach, zero-shot and fine-tuned, on a benchmark of STAN programs.
- Score: 11.64841553345271
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Probabilistic programs provide an expressive representation language for
generative models. Given a probabilistic program, we are interested in the task
of posterior inference: estimating a latent variable given a set of observed
variables. Existing techniques for inference in probabilistic programs often
require choosing many hyper-parameters, are computationally expensive, and/or
only work for restricted classes of programs. Here we formulate inference as
masked language modeling: given a program, we generate a supervised dataset of
variables and assignments, and randomly mask a subset of the assignments. We
then train a neural network to unmask the random values, defining an
approximate posterior distribution. By optimizing a single neural network
across a range of programs we amortize the cost of training, yielding a
"foundation" posterior able to do zero-shot inference for new programs. The
foundation posterior can also be fine-tuned for a particular program and
dataset by optimizing a variational inference objective. We show the efficacy
of the approach, zero-shot and fine-tuned, on a benchmark of STAN programs.
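To make the recipe in the abstract concrete, below is a minimal sketch of masked-value training on a single toy generative program (a Gaussian mean with five noisy observations). This is not the authors' code or the Stan benchmark: the toy program, the MLP architecture, and the names `run_program` and `MaskedPosterior` are illustrative assumptions; the point is simulating traces, randomly masking a subset of the assignments, and training a network to reconstruct them, which then acts as an amortized approximate posterior.

```python
# Minimal sketch, not the authors' code: "inference as masked value modeling"
# on a single toy program. The program, the MLP, and every name below
# (run_program, MaskedPosterior) are illustrative assumptions.
import torch
import torch.nn as nn

def run_program(batch):
    """Toy generative program: mu ~ N(0, 1); x_i ~ N(mu, 0.5^2) for i = 1..5.
    Returns a (batch, 6) trace of assignments [mu, x_1, ..., x_5]."""
    mu = torch.randn(batch, 1)
    x = mu + 0.5 * torch.randn(batch, 5)
    return torch.cat([mu, x], dim=1)

class MaskedPosterior(nn.Module):
    """Predicts a Gaussian for every trace position given a masked trace."""
    def __init__(self, dim=6, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * dim),   # per-position mean and log-variance
        )

    def forward(self, values, mask):
        # mask == 1 marks hidden assignments; their values are zeroed out.
        out = self.net(torch.cat([values * (1 - mask), mask], dim=1))
        mean, logvar = out.chunk(2, dim=1)
        return mean, logvar

model = MaskedPosterior()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(2000):
    trace = run_program(256)                       # supervised dataset of assignments
    mask = (torch.rand_like(trace) < 0.3).float()  # randomly mask ~30% of them
    mean, logvar = model(trace, mask)
    # Gaussian negative log-likelihood, scored only on the masked positions.
    nll = 0.5 * (logvar + (trace - mean) ** 2 / logvar.exp())
    loss = (nll * mask).sum() / mask.sum().clamp(min=1.0)
    opt.zero_grad(); loss.backward(); opt.step()

# Zero-shot posterior query: observe x_1..x_5, mask mu, read off its Gaussian.
obs = run_program(1)
query = torch.zeros_like(obs); query[:, 0] = 1.0
mean, logvar = model(obs, query)
print("q(mu | x) ~= N(%.3f, %.3f)" % (mean[0, 0], logvar[0, 0].exp()))
```

In the paper's setup a single network is trained across many programs, so a description of the program would also be part of the network's input, and the resulting foundation posterior can additionally be fine-tuned against a variational objective for a specific program and dataset; this sketch fixes one program for brevity.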
Related papers
- Efficient Incremental Belief Updates Using Weighted Virtual Observations [2.7195102129095003]
We present an algorithmic solution to the problem of incremental belief updating in the context of Monte Carlo inference.
We implement and apply the solution to a number of didactic examples and case studies, showing the efficiency and robustness of our approach.
arXiv Detail & Related papers (2024-02-10T12:48:49Z)
- Scalable Neural-Probabilistic Answer Set Programming [18.136093815001423]
We introduce SLASH, a novel DPPL that consists of Neural-Probabilistic Predicates (NPPs) and a logic program, united via answer set programming (ASP)
We show how to prune the stochastically insignificant parts of the (ground) program, speeding up reasoning without sacrificing the predictive performance.
We evaluate SLASH on a variety of different tasks, including the benchmark task of MNIST addition and Visual Question Answering (VQA)
arXiv Detail & Related papers (2023-06-14T09:45:29Z)
- $\omega$PAP Spaces: Reasoning Denotationally About Higher-Order, Recursive Probabilistic and Differentiable Programs [64.25762042361839]
$\omega$PAP spaces are spaces for reasoning denotationally about expressive differentiable and probabilistic programming languages.
Our semantics is general enough to assign meanings to most practical probabilistic and differentiable programs.
We establish the almost-everywhere differentiability of probabilistic programs' trace density functions.
arXiv Detail & Related papers (2023-02-21T12:50:05Z)
- Transformers Can Do Bayesian Inference [56.99390658880008]
We present Prior-Data Fitted Networks (PFNs)
PFNs leverage in-context learning, as used in large-scale machine learning, to approximate a large set of posteriors.
We demonstrate that PFNs can near-perfectly mimic Gaussian processes and also enable efficient Bayesian inference for intractable problems (see the sketch after this list).
arXiv Detail & Related papers (2021-12-20T13:07:39Z)
- Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
- flip-hoisting: Exploiting Repeated Parameters in Discrete Probabilistic Programs [25.320181572646135]
We present a program analysis and associated optimization, flip-hoisting, that collapses repetitious parameters in discrete probabilistic programs to improve inference performance.
We implement flip-hoisting in an existing probabilistic programming language and show empirically that it significantly improves inference performance.
arXiv Detail & Related papers (2021-10-19T22:04:26Z)
- Probabilistic Gradient Boosting Machines for Large-Scale Probabilistic Regression [51.770998056563094]
Probabilistic Gradient Boosting Machines (PGBM) is a method to create probabilistic predictions with a single ensemble of decision trees.
We empirically demonstrate the advantages of PGBM compared to existing state-of-the-art methods.
arXiv Detail & Related papers (2021-06-03T08:32:13Z)
- Meta-Learning an Inference Algorithm for Probabilistic Programs [13.528656805820459]
We present a meta-algorithm for learning a posterior-inference algorithm for restricted probabilistic programs.
A key feature of our approach is the use of a white-box inference algorithm that extracts information directly from model descriptions.
arXiv Detail & Related papers (2021-03-01T04:05:11Z)
- Can We Learn Heuristics For Graphical Model Inference Using Reinforcement Learning? [114.24881214319048]
We show that we can learn programs, i.e., policies, for solving inference in higher order Conditional Random Fields (CRFs) using reinforcement learning.
Our method solves inference tasks efficiently without imposing any constraints on the form of the potentials.
arXiv Detail & Related papers (2020-04-27T19:24:04Z)
- Stochastically Differentiable Probabilistic Programs [18.971852464650144]
The existence of discrete random variables prohibits many basic gradient-based inference engines.
We present a novel approach to run inference efficiently and robustly in such programs using the Markov chain Monte Carlo family of algorithms.
arXiv Detail & Related papers (2020-03-02T08:04:41Z)
- Parameter Space Factorization for Zero-Shot Learning across Tasks and Languages [112.65994041398481]
We propose a Bayesian generative model for the space of neural parameters.
We infer the posteriors over such latent variables based on data from seen task-language combinations.
Our model yields results comparable to or better than state-of-the-art zero-shot cross-lingual transfer methods.
arXiv Detail & Related papers (2020-01-30T16:58:56Z)
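As referenced in the "Transformers Can Do Bayesian Inference" entry above, PFNs are trained purely on data sampled from a prior so that a single forward pass performs approximate Bayesian prediction in-context. The sketch below is an illustrative simplification, not the paper's implementation: the toy prior over affine functions, the mean-pooled set encoder standing in for the transformer, and the names `sample_dataset` and `PFNLike` are all assumptions.

```python
# Minimal PFN-style sketch, not the paper's implementation: the prior over
# affine functions, the mean-pooled set encoder (standing in for the
# transformer), and the names sample_dataset / PFNLike are all assumptions.
import torch
import torch.nn as nn

def sample_dataset(batch, n_ctx=10):
    """Prior: y = a*x + b + eps, with a, b ~ N(0, 1) and eps ~ N(0, 0.1^2)."""
    a, b = torch.randn(batch, 1), torch.randn(batch, 1)
    xc = torch.rand(batch, n_ctx) * 4 - 2
    yc = a * xc + b + 0.1 * torch.randn(batch, n_ctx)
    xq = torch.rand(batch, 1) * 4 - 2
    yq = a * xq + b + 0.1 * torch.randn(batch, 1)
    return xc, yc, xq, yq

class PFNLike(nn.Module):
    """Conditions on a context set in a single forward pass (in-context)."""
    def __init__(self, hidden=64):
        super().__init__()
        self.point = nn.Sequential(nn.Linear(2, hidden), nn.ReLU(),
                                   nn.Linear(hidden, hidden))
        self.head = nn.Sequential(nn.Linear(hidden + 1, hidden), nn.ReLU(),
                                  nn.Linear(hidden, 2))   # mean, log-variance

    def forward(self, xc, yc, xq):
        pts = torch.stack([xc, yc], dim=-1)       # (batch, n_ctx, 2)
        ctx = self.point(pts).mean(dim=1)         # permutation-invariant summary
        mean, logvar = self.head(torch.cat([ctx, xq], dim=1)).chunk(2, dim=1)
        return mean, logvar

model = PFNLike()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(3000):
    xc, yc, xq, yq = sample_dataset(256)          # datasets drawn from the prior
    mean, logvar = model(xc, yc, xq)
    loss = (0.5 * (logvar + (yq - mean) ** 2 / logvar.exp())).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# One forward pass now approximates the posterior predictive for a new dataset,
# with no per-dataset optimization.
xc, yc, xq, yq = sample_dataset(1)
mean, logvar = model(xc, yc, xq)
print("p(y | x, D) ~= N(%.3f, %.3f); sampled y = %.3f"
      % (mean.item(), logvar.exp().item(), yq.item()))
```

The design choice this shares with PFNs, and with the foundation posterior above, is that the only training signal is data simulated from the model or prior, so inference is amortized into the network rather than re-run per dataset.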
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed here and is not responsible for any consequences of its use.