Error analysis of a compositional score-based algorithm for simulation-based inference
- URL: http://arxiv.org/abs/2510.15817v1
- Date: Fri, 17 Oct 2025 16:56:25 GMT
- Title: Error analysis of a compositional score-based algorithm for simulation-based inference
- Authors: Camille Touron, Gabriel V. Cardoso, Julyan Arbel, Pedro L. C. Rodrigues
- Abstract summary: We study the compositional score produced by the GAUSS algorithm of Linhart et al. (2024) and establish an upper bound on its mean squared error in terms of both the individual score errors and the number of observations.
- Score: 1.0689604144545297
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Simulation-based inference (SBI) has become a widely used framework in applied sciences for estimating the parameters of stochastic models that best explain experimental observations. A central question in this setting is how to effectively combine multiple observations in order to improve parameter inference and obtain sharper posterior distributions. Recent advances in score-based diffusion methods address this problem by constructing a compositional score, obtained by aggregating individual posterior scores within the diffusion process. While it is natural to suspect that the accumulation of individual errors may significantly degrade sampling quality as the number of observations grows, this important theoretical issue has so far remained unexplored. In this paper, we study the compositional score produced by the GAUSS algorithm of Linhart et al. (2024) and establish an upper bound on its mean squared error in terms of both the individual score errors and the number of observations. We illustrate our theoretical findings on a Gaussian example, where all analytical expressions can be derived in a closed form.
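As an illustration of what "aggregating individual posterior scores" means, here is a minimal sketch in a conjugate Gaussian model, where everything is available in closed form. It only checks the standard factorization of the posterior for i.i.d. observations, p(θ | x_1..x_n) ∝ p(θ)^(1-n) ∏ p(θ | x_i), at diffusion time t = 0. It does not implement GAUSS or the noised scores at t > 0, and the model, function names, and numerical values are assumptions chosen for the example, not taken from the paper.

```python
def gaussian_score(theta, mean, var):
    """Derivative in theta of log N(theta; mean, var)."""
    return -(theta - mean) / var


def individual_posterior(x, mu0, var0, var_lik):
    """Posterior N(m, s2) of theta given a single observation x.

    Assumed model: theta ~ N(mu0, var0), x | theta ~ N(theta, var_lik).
    """
    s2 = 1.0 / (1.0 / var0 + 1.0 / var_lik)
    m = s2 * (mu0 / var0 + x / var_lik)
    return m, s2


def full_posterior(xs, mu0, var0, var_lik):
    """Posterior N(m, v) of theta given all observations xs (i.i.d.)."""
    precision = 1.0 / var0 + len(xs) / var_lik
    m = (mu0 / var0 + sum(xs) / var_lik) / precision
    return m, 1.0 / precision


def compositional_score(theta, xs, mu0, var0, var_lik):
    """(1 - n) * prior score + sum of the n individual posterior scores."""
    n = len(xs)
    total = (1 - n) * gaussian_score(theta, mu0, var0)
    for x in xs:
        m, s2 = individual_posterior(x, mu0, var0, var_lik)
        total += gaussian_score(theta, m, s2)
    return total


# Hypothetical values, chosen only for illustration.
xs = [0.3, -1.2, 0.8, 2.1]
mu0, var0, var_lik, theta = 0.0, 4.0, 1.0, 0.5

m_full, v_full = full_posterior(xs, mu0, var0, var_lik)
exact = gaussian_score(theta, m_full, v_full)
composed = compositional_score(theta, xs, mu0, var0, var_lik)
print(abs(exact - composed))  # agreement up to floating-point rounding
```

With exact individual scores the two quantities coincide. The concern raised in the abstract is what happens when each of the n individual scores is only approximate, since their errors are summed in the composition.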
Related papers
- Score-based diffusion models for diffuse optical tomography with uncertainty quantification [0.8443238959374133]
We introduce a novel regularization approach that prevents overfitting of the score function by constructing a mixed score composed of a learned and a model-based component. Experiments demonstrate that a data-driven prior distribution results in posterior samples with low variance, compared to classical model-based estimation.
arXiv Detail & Related papers (2026-02-03T12:14:07Z) - Overspecified Mixture Discriminant Analysis: Exponential Convergence, Statistical Guarantees, and Remote Sensing Applications [2.124297073085513]
This study explores the classification error of Mixture Discriminant Analysis (MDA) in scenarios where the number of mixture components exceeds those present in the actual data distribution. We analyze both the algorithmic convergence of the Expectation-Maximization (EM) algorithm and the statistical classification error.
arXiv Detail & Related papers (2025-10-30T23:56:56Z) - The Effect of Stochasticity in Score-Based Diffusion Sampling: a KL Divergence Analysis [0.0]
We study the effect of stochasticity on the generation process through bounds on the Kullback-Leibler (KL) divergence. Our main results apply to linear forward SDEs with additive noise and Lipschitz-continuous score functions.
arXiv Detail & Related papers (2025-06-13T01:01:07Z) - In-Context Parametric Inference: Point or Distribution Estimators? [66.22308335324239]
Our experiments indicate that amortized point estimators generally outperform posterior inference, though the latter remain competitive in some low-dimensional problems.
arXiv Detail & Related papers (2025-02-17T10:00:24Z) - Bayesian Federated Inference for regression models based on non-shared multicenter data sets from heterogeneous populations [0.0]
In a regression model, the sample size must be large enough relative to the number of possible predictors.
Pooling data from different data sets collected in different (medical) centers would alleviate this problem, but is often not feasible due to privacy regulation or logistic problems.
An alternative route would be to analyze the local data in the centers separately and combine the statistical inference results with the Bayesian Federated Inference (BFI) methodology.
The aim of this approach is to compute from the inference results in separate centers what would have been found if the statistical analysis was performed on the combined data.
arXiv Detail & Related papers (2024-02-05T11:10:27Z) - Sample Complexity Bounds for Score-Matching: Causal Discovery and Generative Modeling [82.36856860383291]
We demonstrate that accurate estimation of the score function is achievable by training a standard deep ReLU neural network.
We establish bounds on the error rate of recovering causal relationships using the score-matching-based causal discovery method.
arXiv Detail & Related papers (2023-10-27T13:09:56Z) - Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important in forecasting nonstationary processes or processes with a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate this tessellation and approximate the multiple hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z) - Learning to Bound Counterfactual Inference in Structural Causal Models from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z) - Compositional Score Modeling for Simulation-based Inference [28.422049267537965]
We introduce a new method based on conditional score modeling that enjoys the benefits of both approaches.
Our approach is sample-efficient, can naturally aggregate multiple observations at inference time, and avoids the drawbacks of standard inference methods.
arXiv Detail & Related papers (2022-09-28T17:08:31Z) - Partial Counterfactual Identification from Observational and Experimental Data [83.798237968683]
We develop effective Monte Carlo algorithms to approximate the optimal bounds from an arbitrary combination of observational and experimental data.
Our algorithms are validated extensively on synthetic and real-world datasets.
arXiv Detail & Related papers (2021-10-12T02:21:30Z) - A Unified View of Stochastic Hamiltonian Sampling [18.300078015845262]
This work revisits the theoretical properties of Hamiltonian stochastic differential equations (SDEs) for posterior sampling.
We study the two types of errors that arise from numerical SDE simulation: the discretization error and the error due to noisy gradient estimates.
arXiv Detail & Related papers (2021-06-30T16:50:11Z) - Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)