Related papers: Efficient, Multimodal, and Derivative-Free Bayesian Inference With Fisher-Rao Gradient Flows

Efficient, Multimodal, and Derivative-Free Bayesian Inference With Fisher-Rao Gradient Flows

URL: http://arxiv.org/abs/2406.17263v1
Date: Tue, 25 Jun 2024 04:07:22 GMT
Title: Efficient, Multimodal, and Derivative-Free Bayesian Inference With Fisher-Rao Gradient Flows
Authors: Yifan Chen, Daniel Zhengyu Huang, Jiaoyang Huang, Sebastian Reich, Andrew M. Stuart,
Abstract summary: We study efficient approximate sampling for probability distributions known up to normalization constants. We specifically focus on a problem class arising in Bayesian inference for large-scale inverse problems in science and engineering applications.
Score: 10.153270126742369
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this paper, we study efficient approximate sampling for probability distributions known up to normalization constants. We specifically focus on a problem class arising in Bayesian inference for large-scale inverse problems in science and engineering applications. The computational challenges we address with the proposed methodology are: (i) the need for repeated evaluations of expensive forward models; (ii) the potential existence of multiple modes; and (iii) the fact that gradient of, or adjoint solver for, the forward model might not be feasible. While existing Bayesian inference methods meet some of these challenges individually, we propose a framework that tackles all three systematically. Our approach builds upon the Fisher-Rao gradient flow in probability space, yielding a dynamical system for probability densities that converges towards the target distribution at a uniform exponential rate. This rapid convergence is advantageous for the computational burden outlined in (i). We apply Gaussian mixture approximations with operator splitting techniques to simulate the flow numerically; the resulting approximation can capture multiple modes thus addressing (ii). Furthermore, we employ the Kalman methodology to facilitate a derivative-free update of these Gaussian components and their respective weights, addressing the issue in (iii). The proposed methodology results in an efficient derivative-free sampler flexible enough to handle multi-modal distributions: Gaussian Mixture Kalman Inversion (GMKI). The effectiveness of GMKI is demonstrated both theoretically and numerically in several experiments with multimodal target distributions, including proof-of-concept and two-dimensional examples, as well as a large-scale application: recovering the Navier-Stokes initial condition from solution data at positive times.

Related papers

Feynman-Kac Correctors in Diffusion: Annealing, Guidance, and Product of Experts [64.34482582690927]
We provide an efficient and principled method for sampling from a sequence of annealed, geometric-averaged, or product distributions derived from pretrained score-based models. We propose Sequential Monte Carlo (SMC) resampling algorithms that leverage inference-time scaling to improve sampling quality.
arXiv Detail & Related papers (2025-03-04T17:46:51Z)
Stable Derivative Free Gaussian Mixture Variational Inference for Bayesian Inverse Problems [4.842853252452336]
Key challenges include costly repeated evaluations of forward models, multimodality, and inaccessible gradients for the forward model. We develop a variational inference framework that combines Fisher-Rao natural gradient with specialized quadrature rules to enable derivative free updates of Gaussian mixture variational families. The resulting method, termed Derivative Free Gaussian Mixture Variational Inference (DF-GMVI), guarantees covariance positivity and affine invariance, offering a stable and efficient framework for approximating complex posterior distributions.
arXiv Detail & Related papers (2025-01-08T03:50:15Z)
Go With the Flow: Fast Diffusion for Gaussian Mixture Models [16.07896640031724]
Schrodinger Bridges (SBs) are diffusion processes that steer in finite time, a given initial distribution to another final one while minimizing a suitable cost functional.<n>We propose an analytic parametrization of a set of feasible policies for solving low dimensional problems.<n>We showcase the potential of this approach in low-to-image problems such as image-to-image translation in the latent space of an autoencoder, learning of cellular dynamics using multi-marginal momentum SB problems and various other examples.
arXiv Detail & Related papers (2024-12-12T08:40:22Z)
A Stein Gradient Descent Approach for Doubly Intractable Distributions [5.63014864822787]
We propose a novel Monte Carlo Stein variational gradient descent (MC-SVGD) approach for inference for doubly intractable distributions. The proposed method achieves substantial computational gains over existing algorithms, while providing comparable inferential performance for the posterior distributions.
arXiv Detail & Related papers (2024-10-28T13:42:27Z)
Annealed Stein Variational Gradient Descent for Improved Uncertainty Estimation in Full-Waveform Inversion [25.714206592953545]
Variational Inference (VI) provides an approximate solution to the posterior distribution in the form of a parametric or non-parametric proposal distribution. This study aims to improve the performance of VI within the context of Full-Waveform Inversion.
arXiv Detail & Related papers (2024-10-17T06:15:26Z)
Unveiling the Statistical Foundations of Chain-of-Thought Prompting Methods [59.779795063072655]
Chain-of-Thought (CoT) prompting and its variants have gained popularity as effective methods for solving multi-step reasoning problems. We analyze CoT prompting from a statistical estimation perspective, providing a comprehensive characterization of its sample complexity.
arXiv Detail & Related papers (2024-08-25T04:07:18Z)
Total Uncertainty Quantification in Inverse PDE Solutions Obtained with Reduced-Order Deep Learning Surrogate Models [50.90868087591973]
We propose an approximate Bayesian method for quantifying the total uncertainty in inverse PDE solutions obtained with machine learning surrogate models. We test the proposed framework by comparing it with the iterative ensemble smoother and deep ensembling methods for a non-linear diffusion equation.
arXiv Detail & Related papers (2024-08-20T19:06:02Z)
Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers. We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art. In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z)
Gaussian Mixture Solvers for Diffusion Models [84.83349474361204]
We introduce a novel class of SDE-based solvers called GMS for diffusion models. Our solver outperforms numerous SDE-based solvers in terms of sample quality in image generation and stroke-based synthesis.
arXiv Detail & Related papers (2023-11-02T02:05:38Z)
Robust scalable initialization for Bayesian variational inference with multi-modal Laplace approximations [0.0]
Variational mixtures with full-covariance structures suffer from a quadratic growth due to variational parameters with the number of parameters. We propose a method for constructing an initial Gaussian model approximation that can be used to warm-start variational inference.
arXiv Detail & Related papers (2023-07-12T19:30:04Z)
Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region. Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z)
Stacking for Non-mixing Bayesian Computations: The Curse and Blessing of Multimodal Posteriors [8.11978827493967]
We propose an approach using parallel runs of MCMC, variational, or mode-based inference to hit as many modes as possible. We present theoretical consistency with an example where the stacked inference process approximates the true data. We demonstrate practical implementation in several model families.
arXiv Detail & Related papers (2020-06-22T15:26:59Z)
Mean-Field Approximation to Gaussian-Softmax Integral with Application to Uncertainty Estimation [23.38076756988258]
We propose a new single-model based approach to quantify uncertainty in deep neural networks. We use a mean-field approximation formula to compute an analytically intractable integral. Empirically, the proposed approach performs competitively when compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-06-13T07:32:38Z)
Gaussianization Flows [113.79542218282282]
We propose a new type of normalizing flow model that enables both efficient iteration of likelihoods and efficient inversion for sample generation. Because of this guaranteed expressivity, they can capture multimodal target distributions without compromising the efficiency of sample generation.
arXiv Detail & Related papers (2020-03-04T08:15:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.