Simulation-based Bayesian inference with ameliorative learned summary statistics -- Part I
- URL: http://arxiv.org/abs/2601.22441v1
- Date: Fri, 30 Jan 2026 01:21:11 GMT
- Title: Simulation-based Bayesian inference with ameliorative learned summary statistics -- Part I
- Authors: Getachew K. Befekadu
- Abstract summary: This paper considers simulation-based inference with learned summary statistics, for settings where the exact likelihood function associated with the observation data and the simulation model is difficult to obtain in closed form or is computationally intractable.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper, which is Part 1 of a two-part series, considers simulation-based inference with learned summary statistics, in which the learned summary statistic serves as an empirical likelihood with ameliorative effects in the Bayesian setting, when the exact likelihood function associated with the observation data and the simulation model is difficult to obtain in closed form or is computationally intractable. In particular, a transformation technique that leverages the Cressie-Read discrepancy criterion under moment restrictions is used to summarize the learned statistics between the observation data and the simulation outputs, while preserving the statistical power of the inference. Such a transformation from data to learned summary statistics also allows the simulation outputs to be conditioned on the observation data, so that the inference task can be performed over sample sets of the observation data that are considered empirically relevant or believed to be of particular importance. Moreover, the simulation-based inference framework discussed in this paper can be extended to handle weakly dependent observation data. Finally, we remark that such an inference framework is suitable for implementation in distributed computing: the computational tasks involving both the data-to-learned-summary-statistics transformation and the Bayesian inference problem can be posed as a unified distributed inference problem that exploits distributed optimization and MCMC algorithms to support large datasets associated with complex simulation models.
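The Cressie-Read discrepancy mentioned in the abstract is a one-parameter family of power divergences that includes the Pearson chi-square statistic (lambda = 1) and twice the Kullback-Leibler divergence (lambda -> 0) as special cases. The paper's exact transformation is not given here, so the following is only an illustrative sketch of the discrepancy family itself, computed between two discrete distributions (e.g. binned summaries of the observation data and the simulation outputs):

```python
import math

def cressie_read(p, q, lam):
    """Cressie-Read power divergence between discrete distributions p and q.

    lam = 1   -> Pearson chi-square statistic (on probabilities)
    lam -> 0  -> twice the Kullback-Leibler divergence KL(p || q)
    lam -> -1 -> twice the reverse divergence KL(q || p)
    """
    if abs(lam) < 1e-12:        # limiting case lambda -> 0
        return 2.0 * sum(pi * math.log(pi / qi)
                         for pi, qi in zip(p, q) if pi > 0)
    if abs(lam + 1.0) < 1e-12:  # limiting case lambda -> -1
        return 2.0 * sum(qi * math.log(qi / pi)
                         for pi, qi in zip(p, q) if qi > 0)
    c = 2.0 / (lam * (lam + 1.0))
    return c * sum(pi * ((pi / qi) ** lam - 1.0)
                   for pi, qi in zip(p, q) if pi > 0)
```

Varying lambda trades off how heavily the discrepancy penalizes bins where the simulation outputs under- or over-represent the observed data, which is what makes the family attractive for moment-restricted empirical-likelihood constructions.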
Related papers
- Flow Matching for Robust Simulation-Based Inference under Model Misspecification [11.172752919335394]
Flow Matching Corrected Posterior Estimation is a framework that refines simulation-trained posterior estimators using a small set of real calibration samples. We show that our proposal consistently mitigates the effects of misspecification, delivering improved inference accuracy and uncertainty calibration compared to standard SBI baselines.
arXiv Detail & Related papers (2025-09-27T16:10:53Z) - ConDiSim: Conditional Diffusion Models for Simulation Based Inference [2.1493648495606354]
ConDiSim is a conditional diffusion model for simulation-based inference of complex systems with intractable likelihoods. It is evaluated across ten benchmark problems and two real-world test problems, where it demonstrates effective posterior approximation accuracy.
arXiv Detail & Related papers (2025-05-13T09:58:23Z) - Meta-Statistical Learning: Supervised Learning of Statistical Inference [59.463430294611626]
This work demonstrates that the tools and principles driving the success of large language models (LLMs) can be repurposed to tackle distribution-level tasks. We propose meta-statistical learning, a framework inspired by multi-instance learning that reformulates statistical inference tasks as supervised learning problems.
arXiv Detail & Related papers (2025-02-17T18:04:39Z) - Model-free Methods for Event History Analysis and Efficient Adjustment (PhD Thesis) [55.2480439325792]
This thesis is a series of independent contributions to statistics unified by a model-free perspective. The first chapter elaborates on how a model-free perspective can be used to formulate flexible methods that leverage prediction techniques from machine learning. The second chapter studies the concept of local independence, which describes whether the evolution of one process is directly influenced by another.
arXiv Detail & Related papers (2025-02-11T19:24:09Z) - Learning from Summarized Data: Gaussian Process Regression with Sample Quasi-Likelihood [9.13755431537592]
This study tackles learning and inference using only summarized data within the framework of Gaussian process regression. We introduce the concept of sample quasi-likelihood, which facilitates learning and inference using only summarized data.
arXiv Detail & Related papers (2024-12-23T10:21:38Z) - Statistical inference for case-control logistic regression via integrating external summary data [8.369377566749202]
Case-control sampling is a commonly used retrospective sampling design to alleviate the imbalanced structure of binary data.
An empirical likelihood based approach is proposed to make inference for the logistic model by incorporating the internal case-control data and external information.
arXiv Detail & Related papers (2024-05-31T07:47:38Z) - Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
arXiv Detail & Related papers (2024-04-11T09:23:36Z) - Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z) - MINIMALIST: Mutual INformatIon Maximization for Amortized Likelihood Inference from Sampled Trajectories [61.3299263929289]
Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice.
One class of methods uses data simulated with different parameters to infer an amortized estimator for the likelihood-to-evidence ratio.
We show that this approach can be formulated in terms of mutual information between model parameters and simulated data.
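The likelihood-to-evidence ratio mentioned above is typically recovered via the density-ratio (classification) trick: a classifier trained to distinguish joint pairs (theta, x) from pairs drawn independently from the marginals converges to d = p(theta, x) / (p(theta, x) + p(theta) p(x)), from which the ratio r = p(x | theta) / p(x) = d / (1 - d) follows. The sketch below verifies this identity in a toy Gaussian model with closed-form densities; the model and all names are illustrative, not taken from the paper:

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

# Toy model: theta ~ N(0, 1) and x | theta ~ N(theta, 1),
# so the evidence (marginal of x) is N(0, 2).
def likelihood_to_evidence_ratio(theta, x):
    """r(theta, x) = p(x | theta) / p(x)."""
    return normal_pdf(x, theta, 1.0) / normal_pdf(x, 0.0, math.sqrt(2.0))

def bayes_optimal_classifier(theta, x):
    """Probability that (theta, x) came from the joint rather than the
    product of marginals -- the target an amortized classifier learns."""
    joint = normal_pdf(theta, 0.0, 1.0) * normal_pdf(x, theta, 1.0)
    marginals = normal_pdf(theta, 0.0, 1.0) * normal_pdf(x, 0.0, math.sqrt(2.0))
    return joint / (joint + marginals)
```

For any (theta, x), d / (1 - d) recovers r exactly; in practice d is an amortized neural classifier trained on simulated pairs rather than the Bayes-optimal one used here.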
arXiv Detail & Related papers (2021-06-03T12:59:16Z) - Sequential Likelihood-Free Inference with Implicit Surrogate Proposal [24.20924279100816]
This paper introduces Implicit Surrogate Proposal (ISP) to generate a cumulated dataset with further sample efficiency.
ISP constructs the cumulative dataset in the most diverse way by drawing i.i.d. samples in a feed-forward fashion.
We demonstrate that ISP outperforms the baseline inference algorithms on simulations with multi-modal posteriors.
arXiv Detail & Related papers (2020-10-15T08:59:23Z) - On Disentangled Representations Learned From Correlated Data [59.41587388303554]
We bridge the gap to real-world scenarios by analyzing the behavior of the most prominent disentanglement approaches on correlated data.
We show that systematically induced correlations in the dataset are being learned and reflected in the latent representations.
We also demonstrate how to resolve these latent correlations, either using weak supervision during training or by post-hoc correcting a pre-trained model with a small number of labels.
arXiv Detail & Related papers (2020-06-14T12:47:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.