Related papers: Diffusion posterior sampling for simulation-based inference in tall data settings

Diffusion posterior sampling for simulation-based inference in tall data settings

URL: http://arxiv.org/abs/2404.07593v2
Date: Fri, 7 Jun 2024 14:07:50 GMT
Title: Diffusion posterior sampling for simulation-based inference in tall data settings
Authors: Julia Linhart, Gabriel Victorino Cardoso, Alexandre Gramfort, Sylvain Le Corff, Pedro L. C. Rodrigues,
Abstract summary: Simulation-based inference ( SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation. In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model. We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
Score: 53.17563688225137
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Determining which parameters of a non-linear model best describe a set of experimental data is a fundamental problem in science and it has gained much traction lately with the rise of complex large-scale simulators. The likelihood of such models is typically intractable, which is why classical MCMC methods can not be used. Simulation-based inference (SBI) stands out in this context by only requiring a dataset of simulations to train deep generative models capable of approximating the posterior distribution that relates input parameters to a given observation. In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model. The proposed method is built upon recent developments from the flourishing score-based diffusion literature and allows to estimate the tall data posterior distribution, while simply using information from a score network trained for a single context observation. We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.

Related papers

Multimodal Atmospheric Super-Resolution With Deep Generative Models [0.0]
Score-based diffusion modeling is a generative machine learning algorithm that can be used to sample from complex distributions.<n>In this article, we apply such a concept to the super-resolution of a high-dimensional dynamical system, given the real-time availability of low-resolution and experimentally observed sparse sensor measurements.
arXiv Detail & Related papers (2025-06-28T06:47:09Z)
Multi-fidelity Parameter Estimation Using Conditional Diffusion Models [6.934199382834925]
We present a multi-fidelity method for uncertainty quantification of parameter estimates in complex systems. We use conditional generative models trained to sample the target conditional distribution. We demonstrate the effectiveness of the proposed method on several numerical examples.
arXiv Detail & Related papers (2025-04-02T16:54:47Z)
Parallel simulation for sampling under isoperimetry and score-based diffusion models [56.39904484784127]
As data size grows, reducing the iteration cost becomes an important goal. Inspired by the success of the parallel simulation of the initial value problem in scientific computation, we propose parallel Picard methods for sampling tasks. Our work highlights the potential advantages of simulation methods in scientific computation for dynamics-based sampling and diffusion models.
arXiv Detail & Related papers (2024-12-10T11:50:46Z)
Learning Coupled Subspaces for Multi-Condition Spike Data [8.114880112033644]
In neuroscience, researchers typically conduct experiments under multiple conditions to acquire neural responses in the form of high-dimensional spike train datasets. We propose a non-parametric Bayesian approach to learn a smooth tuning function over the experiment condition space.
arXiv Detail & Related papers (2024-10-24T20:44:28Z)
Embed and Emulate: Contrastive representations for simulation-based inference [11.543221890134399]
This paper introduces Embed and Emulate (E&E), a new simulation-based inference ( SBI) method based on contrastive learning. E&E learns a low-dimensional latent embedding of the data and a corresponding fast emulator in the latent space. We demonstrate superior performance over existing methods in a realistic, non-identifiable parameter estimation task.
arXiv Detail & Related papers (2024-09-27T02:37:01Z)
Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop. We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models. We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z)
Simulation-based inference using surjective sequential neural likelihood estimation [50.24983453990065]
Surjective Sequential Neural Likelihood estimation is a novel method for simulation-based inference. By embedding the data in a low-dimensional space, SSNL solves several issues previous likelihood-based methods had when applied to high-dimensional data sets.
arXiv Detail & Related papers (2023-08-02T10:02:38Z)
Nonparametric likelihood-free inference with Jensen-Shannon divergence for simulator-based models with categorical output [1.4298334143083322]
Likelihood-free inference for simulator-based statistical models has attracted a surge of interest, both in the machine learning and statistics communities. Here we derive a set of theoretical results to enable estimation, hypothesis testing and construction of confidence intervals for model parameters using computation properties of the Jensen-Shannon- divergence. Such approximation offers a rapid alternative to more-intensive approaches and can be attractive for diverse applications of simulator-based models.
arXiv Detail & Related papers (2022-05-22T18:00:13Z)
Mixed Effects Neural ODE: A Variational Approximation for Analyzing the Dynamics of Panel Data [50.23363975709122]
We propose a probabilistic model called ME-NODE to incorporate (fixed + random) mixed effects for analyzing panel data. We show that our model can be derived using smooth approximations of SDEs provided by the Wong-Zakai theorem. We then derive Evidence Based Lower Bounds for ME-NODE, and develop (efficient) training algorithms.
arXiv Detail & Related papers (2022-02-18T22:41:51Z)
Approximate Bayesian Computation with Path Signatures [0.5156484100374059]
We introduce the use of path signatures as a natural candidate feature set for constructing distances between time series data. Our experiments show that such an approach can generate more accurate approximate Bayesian posteriors than existing techniques for time series models.
arXiv Detail & Related papers (2021-06-23T17:25:43Z)
MINIMALIST: Mutual INformatIon Maximization for Amortized Likelihood Inference from Sampled Trajectories [61.3299263929289]
Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice. One class of methods uses data simulated with different parameters to infer an amortized estimator for the likelihood-to-evidence ratio. We show that this approach can be formulated in terms of mutual information between model parameters and simulated data.
arXiv Detail & Related papers (2021-06-03T12:59:16Z)
Leveraging Global Parameters for Flow-based Neural Posterior Estimation [90.21090932619695]
Inferring the parameters of a model based on experimental observations is central to the scientific method. A particularly challenging setting is when the model is strongly indeterminate, i.e., when distinct sets of parameters yield identical observations. We present a method for cracking such indeterminacy by exploiting additional information conveyed by an auxiliary set of observations sharing global parameters.
arXiv Detail & Related papers (2021-02-12T12:23:13Z)
Predicting Multidimensional Data via Tensor Learning [0.0]
We develop a model that retains the intrinsic multidimensional structure of the dataset. To estimate the model parameters, an Alternating Least Squares algorithm is developed. The proposed model is able to outperform benchmark models present in the forecasting literature.
arXiv Detail & Related papers (2020-02-11T11:57:07Z)

This list is automatically generated from the titles and abstracts of the papers in this site.