Calibrated Test-Time Guidance for Bayesian Inference
- URL: http://arxiv.org/abs/2602.22428v1
- Date: Wed, 25 Feb 2026 21:38:47 GMT
- Title: Calibrated Test-Time Guidance for Bayesian Inference
- Authors: Daniel Geyfman, Felix Draxler, Jan Groeneveld, Hyunsoo Lee, Theofanis Karaletsos, Stephan Mandt
- Abstract summary: We show that common test-time guidance methods do not recover the correct posterior distribution and identify the structural approximations responsible for this failure. We then propose consistent alternative estimators that enable sampling from the Bayesian posterior. We significantly outperform previous methods on a set of Bayesian inference tasks, and match state-of-the-art in black hole image reconstruction.
- Score: 25.653139110512914
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Test-time guidance is a widely used mechanism for steering pretrained diffusion models toward outcomes specified by a reward function. Existing approaches, however, focus on maximizing reward rather than sampling from the true Bayesian posterior, leading to miscalibrated inference. In this work, we show that common test-time guidance methods do not recover the correct posterior distribution and identify the structural approximations responsible for this failure. We then propose consistent alternative estimators that enable calibrated sampling from the Bayesian posterior. We significantly outperform previous methods on a set of Bayesian inference tasks, and match state-of-the-art in black hole image reconstruction.
Related papers
- Proximal-IMH: Proximal Posterior Proposals for Independent Metropolis-Hastings with Approximate Operators [4.887201041798969]
We introduce Proximal-IMH, a scheme that corrects samples from the approximate posterior through an auxiliary optimization problem.
For idealized settings, we prove that the proximal correction tightens the match between approximate and exact posteriors, thereby improving acceptance rates and mixing.
The method applies to both linear and nonlinear input-output operators and is particularly suitable for inverse problems where exact posterior sampling is too expensive.
arXiv Detail & Related papers (2026-02-24T22:58:50Z)
- Universal priors: solving empirical Bayes via Bayesian inference and pretraining [25.835876583903282]
A transformer pretrained on synthetically generated data achieves strong performance on empirical Bayes (EB) problems.
We ask why a pretrained Bayes estimator, trained under a prespecified training distribution, can adapt to arbitrary test distributions.
arXiv Detail & Related papers (2026-02-16T19:29:27Z)
- Provable Diffusion Posterior Sampling for Bayesian Inversion [13.807494493914335]
This paper proposes a novel diffusion-based posterior sampling method within a plug-and-play framework.
To approximate the posterior score, we develop a Monte Carlo estimator in which particles are generated using Langevin dynamics.
On the theoretical side, we provide non-asymptotic error bounds, showing that the method converges even for complex multi-modal target posteriors.
arXiv Detail & Related papers (2025-12-08T20:34:05Z)
- Divide-and-Conquer Posterior Sampling for Denoising Diffusion Priors [21.0128625037708]
We present an innovative framework, divide-and-conquer posterior sampling.
It reduces the approximation error associated with current techniques without the need for retraining.
We demonstrate the versatility and effectiveness of our approach for a wide range of Bayesian inverse problems.
arXiv Detail & Related papers (2024-03-18T01:47:24Z)
- Calibrating Neural Simulation-Based Inference with Differentiable Coverage Probability [50.44439018155837]
We propose to include a calibration term directly into the training objective of the neural model.
By introducing a relaxation of the classical formulation of calibration error, we enable end-to-end backpropagation.
It is directly applicable to existing computational pipelines allowing reliable black-box posterior inference.
arXiv Detail & Related papers (2023-10-20T10:20:45Z)
- Variational Prediction [95.00085314353436]
We present a technique for learning a variational approximation to the posterior predictive distribution using a variational bound.
This approach can provide good predictive distributions without test-time marginalization costs.
arXiv Detail & Related papers (2023-07-14T18:19:31Z)
- Boost Test-Time Performance with Closed-Loop Inference [85.43516360332646]
We propose to predict hard-classified test samples in a looped manner to boost the model performance.
We first devise a filtering criterion to identify those hard-classified test samples that need additional inference loops.
For each hard sample, we construct an additional auxiliary learning task based on its original top-$K$ predictions to calibrate the model.
arXiv Detail & Related papers (2022-03-21T10:20:21Z)
- Posterior temperature optimized Bayesian models for inverse problems in medical imaging [59.82184400837329]
We present an unsupervised Bayesian approach to inverse problems in medical imaging using mean-field variational inference with a fully tempered posterior.
We show that an optimized posterior temperature leads to improved accuracy and uncertainty estimation.
Our source code is publicly available at github.com/Cardio-AI/mfvi-dip-mia.
arXiv Detail & Related papers (2022-02-02T12:16:33Z)
- Residual Overfit Method of Exploration [78.07532520582313]
We propose an approximate exploration methodology based on fitting only two point estimates, one tuned and one overfit.
The approach drives exploration towards actions where the overfit model exhibits the most overfitting compared to the tuned model.
We compare ROME against a set of established contextual bandit methods on three datasets and find it to be one of the best performing.
arXiv Detail & Related papers (2021-10-06T17:05:33Z)
- Understanding Variational Inference in Function-Space [20.940162027560408]
We highlight some advantages and limitations of employing the Kullback-Leibler divergence in this setting.
We propose (featurized) Bayesian linear regression as a benchmark for function-space inference methods that directly measures approximation quality.
arXiv Detail & Related papers (2020-11-18T17:42:01Z)
- Calibration of Neural Networks using Splines [51.42640515410253]
Measuring calibration error amounts to comparing two empirical distributions.
We introduce a binning-free calibration measure inspired by the classical Kolmogorov-Smirnov (KS) statistical test.
Our method consistently outperforms existing methods on KS error as well as other commonly used calibration measures.
arXiv Detail & Related papers (2020-06-23T07:18:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.