Sequential Kalman Monte Carlo for gradient-free inference in Bayesian inverse problems
- URL: http://arxiv.org/abs/2407.07781v1
- Date: Wed, 10 Jul 2024 15:56:30 GMT
- Title: Sequential Kalman Monte Carlo for gradient-free inference in Bayesian inverse problems
- Authors: Richard D. P. Grumitt, Minas Karamanis, Uroš Seljak,
- Abstract summary: We introduce Sequential Kalman Monte Carlo samplers to perform gradient-free inference in inverse problems.
FAKI employs normalizing flows to relax the Gaussian ansatz of the target measures in EKI.
FAKI alone is not able to correct for the model linearity assumptions in EKI.
- Score: 1.3654846342364308
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ensemble Kalman Inversion (EKI) has been proposed as an efficient method for solving inverse problems with expensive forward models. However, the method is based on the assumption that we proceed through a sequence of Gaussian measures in moving from the prior to the posterior, and that the forward model is linear. In this work, we introduce Sequential Kalman Monte Carlo (SKMC) samplers, where we exploit EKI and Flow Annealed Kalman Inversion (FAKI) within a Sequential Monte Carlo (SMC) sampling scheme to perform efficient gradient-free inference in Bayesian inverse problems. FAKI employs normalizing flows (NF) to relax the Gaussian ansatz of the target measures in EKI. NFs are able to learn invertible maps between a Gaussian latent space and the original data space, allowing us to perform EKI updates in the Gaussianized NF latent space. However, FAKI alone is not able to correct for the model linearity assumptions in EKI. Errors in the particle distribution as we move through the sequence of target measures can therefore compound to give incorrect posterior moment estimates. In this work we consider the use of EKI and FAKI to initialize the particle distribution for each target in an adaptive SMC annealing scheme, before performing t-preconditioned Crank-Nicolson (tpCN) updates to distribute particles according to the target. We demonstrate the performance of these SKMC samplers on three challenging numerical benchmarks, showing significant improvements in the rate of convergence compared to standard SMC with importance weighted resampling at each temperature level. Code implementing the SKMC samplers is available at https://github.com/RichardGrumitt/KalmanMC.
Related papers
- Sequential Monte Carlo for Inclusive KL Minimization in Amortized Variational Inference [3.126959812401426]
We propose SMC-Wake, a procedure for fitting an amortized variational approximation that uses sequential Monte Carlo samplers to estimate the gradient of the inclusive KL divergence.
In experiments with both simulated and real datasets, SMC-Wake fits variational distributions that approximate the posterior more accurately than existing methods.
arXiv Detail & Related papers (2024-03-15T18:13:48Z) - Curvature-Informed SGD via General Purpose Lie-Group Preconditioners [6.760212042305871]
We present a novel approach to accelerate gradient descent (SGD) by utilizing curvature information.
Our approach involves two preconditioners: a matrix-free preconditioner and a low-rank approximation preconditioner.
We demonstrate that Preconditioned SGD (PSGD) outperforms SoTA on Vision, NLP, and RL tasks.
arXiv Detail & Related papers (2024-02-07T03:18:00Z) - Sparse is Enough in Fine-tuning Pre-trained Large Language Models [98.46493578509039]
We propose a gradient-based sparse fine-tuning algorithm, named Sparse Increment Fine-Tuning (SIFT)
We validate its effectiveness on a range of tasks including the GLUE Benchmark and Instruction-tuning.
arXiv Detail & Related papers (2023-12-19T06:06:30Z) - Flow Annealed Kalman Inversion for Gradient-Free Inference in Bayesian
Inverse Problems [1.534667887016089]
Flow Annealed Kalman Inversion (FAKI) is a generalization of Ensemble Kalman Inversion (EKI)
We demonstrate the performance of FAKI on two numerical benchmarks.
arXiv Detail & Related papers (2023-09-20T17:39:14Z) - Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution.
We propose the Constant Rate AIS algorithm and its efficient implementation for $alpha$-divergences.
arXiv Detail & Related papers (2023-06-27T08:15:28Z) - Amortized Conditional Normalized Maximum Likelihood: Reliable Out of
Distribution Uncertainty Estimation [99.92568326314667]
We propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation.
Our algorithm builds on the conditional normalized maximum likelihood (CNML) coding scheme, which has minimax optimal properties according to the minimum description length principle.
We demonstrate that ACNML compares favorably to a number of prior techniques for uncertainty estimation in terms of calibration on out-of-distribution inputs.
arXiv Detail & Related papers (2020-11-05T08:04:34Z) - An adaptive Hessian approximated stochastic gradient MCMC method [12.93317525451798]
We present an adaptive Hessian approximated gradient MCMC method to incorporate local geometric information while sampling from the posterior.
We adopt a magnitude-based weight pruning method to enforce the sparsity of the network.
arXiv Detail & Related papers (2020-10-03T16:22:15Z) - Balancing Rates and Variance via Adaptive Batch-Size for Stochastic
Optimization Problems [120.21685755278509]
In this work, we seek to balance the fact that attenuating step-size is required for exact convergence with the fact that constant step-size learns faster in time up to an error.
Rather than fixing the minibatch the step-size at the outset, we propose to allow parameters to evolve adaptively.
arXiv Detail & Related papers (2020-07-02T16:02:02Z) - Bayesian Sparse learning with preconditioned stochastic gradient MCMC
and its applications [5.660384137948734]
The proposed algorithm converges to the correct distribution with a controllable bias under mild conditions.
We show that the proposed algorithm canally converge to the correct distribution with a controllable bias under mild conditions.
arXiv Detail & Related papers (2020-06-29T20:57:20Z) - Pre-training Is (Almost) All You Need: An Application to Commonsense
Reasoning [61.32992639292889]
Fine-tuning of pre-trained transformer models has become the standard approach for solving common NLP tasks.
We introduce a new scoring method that casts a plausibility ranking task in a full-text format.
We show that our method provides a much more stable training phase across random restarts.
arXiv Detail & Related papers (2020-04-29T10:54:40Z) - Improving Sampling Accuracy of Stochastic Gradient MCMC Methods via
Non-uniform Subsampling of Gradients [54.90670513852325]
We propose a non-uniform subsampling scheme to improve the sampling accuracy.
EWSG is designed so that a non-uniform gradient-MCMC method mimics the statistical behavior of a batch-gradient-MCMC method.
In our practical implementation of EWSG, the non-uniform subsampling is performed efficiently via a Metropolis-Hastings chain on the data index.
arXiv Detail & Related papers (2020-02-20T18:56:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.