Related papers: How To Train Your Program

How To Train Your Program

URL: http://arxiv.org/abs/2105.03650v1
Date: Sat, 8 May 2021 09:26:34 GMT
Title: How To Train Your Program
Authors: David Tolpin
Abstract summary: We present a Bayesian approach to machine learning with probabilistic programs. We frame the approach as a design pattern of probabilistic programming referred to herein as stump and fungus'
Score: 0.11421942894219898
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We present a Bayesian approach to machine learning with probabilistic programs. In our approach, training on available data is implemented as inference on a hierarchical model. The posterior distribution of model parameters is then used to \textit{stochastically condition} a complementary model, such that inference on new data yields the same posterior distribution of latent parameters corresponding to the new data as inference on a hierachical model on the combination of both previously available and new data, at a lower computation cost. We frame the approach as a design pattern of probabilistic programming referred to herein as `stump and fungus', and illustrate realization of the pattern on a didactic case study.

Related papers

Generative Modeling with Bayesian Sample Inference [50.07758840675341]
We derive a novel generative model from the simple act of Gaussian posterior inference. Treating the generated sample as an unknown variable to infer lets us formulate the sampling process in the language of Bayesian probability. Our model uses a sequence of prediction and posterior update steps to narrow down the unknown sample from a broad initial belief.
arXiv Detail & Related papers (2025-02-11T14:27:10Z)
Data Shapley in One Training Run [88.59484417202454]
Data Shapley provides a principled framework for attributing data's contribution within machine learning contexts. Existing approaches require re-training models on different data subsets, which is computationally intensive. This paper introduces In-Run Data Shapley, which addresses these limitations by offering scalable data attribution for a target model of interest.
arXiv Detail & Related papers (2024-06-16T17:09:24Z)
Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference ( SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation. In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model. We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
arXiv Detail & Related papers (2024-04-11T09:23:36Z)
Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop. We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models. We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z)
Learning from aggregated data with a maximum entropy model [73.63512438583375]
We show how a new model, similar to a logistic regression, may be learned from aggregated data only by approximating the unobserved feature distribution with a maximum entropy hypothesis. We present empirical evidence on several public datasets that the model learned this way can achieve performances comparable to those of a logistic model trained with the full unaggregated data.
arXiv Detail & Related papers (2022-10-05T09:17:27Z)
Bayesian Neural Network Inference via Implicit Models and the Posterior Predictive Distribution [0.8122270502556371]
We propose a novel approach to perform approximate Bayesian inference in complex models such as Bayesian neural networks. The approach is more scalable to large data than Markov Chain Monte Carlo. We see this being useful in applications such as surrogate and physics-based models.
arXiv Detail & Related papers (2022-09-06T02:43:19Z)
Data-Driven Sample Average Approximation with Covariate Information [0.0]
We study optimization for data-driven decision-making when we have observations of the uncertain parameters within the optimization model together with concurrent observations of coparametrics. We investigate three data-driven frameworks that integrate a machine learning prediction model within a programming sample average approximation (SAA) for approximating the solution to this problem.
arXiv Detail & Related papers (2022-07-27T14:45:04Z)
Optimizing model-agnostic Random Subspace ensembles [5.680512932725364]
We present a model-agnostic ensemble approach for supervised learning. The proposed approach alternates between learning an ensemble of models using a parametric version of the Random Subspace approach. We show the good performance of the proposed approach, both in terms of prediction and feature ranking, on simulated and real-world datasets.
arXiv Detail & Related papers (2021-09-07T13:58:23Z)
MINIMALIST: Mutual INformatIon Maximization for Amortized Likelihood Inference from Sampled Trajectories [61.3299263929289]
Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice. One class of methods uses data simulated with different parameters to infer an amortized estimator for the likelihood-to-evidence ratio. We show that this approach can be formulated in terms of mutual information between model parameters and simulated data.
arXiv Detail & Related papers (2021-06-03T12:59:16Z)
Bayesian Imaging With Data-Driven Priors Encoded by Neural Networks: Theory, Methods, and Algorithms [2.266704469122763]
This paper proposes a new methodology for performing Bayesian inference in imaging inverse problems where the prior knowledge is available in the form of training data. We establish the existence and well-posedness of the associated posterior moments under easily verifiable conditions. A model accuracy analysis suggests that the Bayesian probability probabilities reported by the data-driven models are also remarkably accurate under a frequentist definition.
arXiv Detail & Related papers (2021-03-18T11:34:08Z)
Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference. We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.