A Unified View of Stochastic Hamiltonian Sampling
- URL: http://arxiv.org/abs/2106.16200v1
- Date: Wed, 30 Jun 2021 16:50:11 GMT
- Title: A Unified View of Stochastic Hamiltonian Sampling
- Authors: Giulio Franzese, Dimitrios Milios, Maurizio Filippone, Pietro
Michiardi
- Abstract summary: This work revisits the theoretical properties of Hamiltonian differential equations (SDEs) for posterior sampling.
We study the two types of errors that arise from numerical SDE simulation: the discretization error and the error due to noisy gradient estimates.
- Score: 18.300078015845262
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we revisit the theoretical properties of Hamiltonian stochastic
differential equations (SDEs) for Bayesian posterior sampling, and we study the
two types of errors that arise from numerical SDE simulation: the
discretization error and the error due to noisy gradient estimates in the
context of data subsampling. We consider overlooked results describing the
ergodic convergence rates of numerical integration schemes, and we produce a
novel analysis for the effect of mini-batches through the lens of differential
operator splitting. In our analysis, the stochastic component of the proposed
Hamiltonian SDE is decoupled from the gradient noise, for which we make no
normality assumptions. This allows us to derive interesting connections among
different sampling schemes, including the original Hamiltonian Monte Carlo
(HMC) algorithm, and explain their performance. We show that for a careful
selection of numerical integrators, both errors vanish at a rate
$\mathcal{O}(\eta^2)$, where $\eta$ is the integrator step size. Our
theoretical results are supported by an empirical study on a variety of
regression and classification tasks for Bayesian neural networks.
Related papers
- HJ-sampler: A Bayesian sampler for inverse problems of a stochastic process by leveraging Hamilton-Jacobi PDEs and score-based generative models [1.949927790632678]
This paper builds on the log transform known as the Cole-Hopf transform in Brownian motion contexts.
We develop a new algorithm, named the HJ-sampler, for inference for the inverse problem of a differential equation with given terminal observations.
arXiv Detail & Related papers (2024-09-15T05:30:54Z) - Conditional score-based diffusion models for solving inverse problems in mechanics [6.319616423658121]
We propose a framework to perform Bayesian inference using conditional score-based diffusion models.
Conditional score-based diffusion models are generative models that learn to approximate the score function of a conditional distribution.
We demonstrate the efficacy of the proposed approach on a suite of high-dimensional inverse problems in mechanics.
arXiv Detail & Related papers (2024-06-19T02:09:15Z) - Noise in the reverse process improves the approximation capabilities of
diffusion models [27.65800389807353]
In Score based Generative Modeling (SGMs), the state-of-the-art in generative modeling, reverse processes are known to perform better than their deterministic counterparts.
This paper delves into the heart of this phenomenon, comparing neural ordinary differential equations (ODEs) and neural dimension equations (SDEs) as reverse processes.
We analyze the ability of neural SDEs to approximate trajectories of the Fokker-Planck equation, revealing the advantages of neurality.
arXiv Detail & Related papers (2023-12-13T02:39:10Z) - Gaussian Mixture Solvers for Diffusion Models [84.83349474361204]
We introduce a novel class of SDE-based solvers called GMS for diffusion models.
Our solver outperforms numerous SDE-based solvers in terms of sample quality in image generation and stroke-based synthesis.
arXiv Detail & Related papers (2023-11-02T02:05:38Z) - Noise-Free Sampling Algorithms via Regularized Wasserstein Proximals [3.4240632942024685]
We consider the problem of sampling from a distribution governed by a potential function.
This work proposes an explicit score based MCMC method that is deterministic, resulting in a deterministic evolution for particles.
arXiv Detail & Related papers (2023-08-28T23:51:33Z) - Towards Convergence Rates for Parameter Estimation in Gaussian-gated
Mixture of Experts [40.24720443257405]
We provide a convergence analysis for maximum likelihood estimation (MLE) in the Gaussian-gated MoE model.
Our findings reveal that the MLE has distinct behaviors under two complement settings of location parameters of the Gaussian gating functions.
Notably, these behaviors can be characterized by the solvability of two different systems of equations.
arXiv Detail & Related papers (2023-05-12T16:02:19Z) - Capturing dynamical correlations using implicit neural representations [85.66456606776552]
We develop an artificial intelligence framework which combines a neural network trained to mimic simulated data from a model Hamiltonian with automatic differentiation to recover unknown parameters from experimental data.
In doing so, we illustrate the ability to build and train a differentiable model only once, which then can be applied in real-time to multi-dimensional scattering data.
arXiv Detail & Related papers (2023-04-08T07:55:36Z) - Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained via simple matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z) - Nonconvex Stochastic Scaled-Gradient Descent and Generalized Eigenvector
Problems [98.34292831923335]
Motivated by the problem of online correlation analysis, we propose the emphStochastic Scaled-Gradient Descent (SSD) algorithm.
We bring these ideas together in an application to online correlation analysis, deriving for the first time an optimal one-time-scale algorithm with an explicit rate of local convergence to normality.
arXiv Detail & Related papers (2021-12-29T18:46:52Z) - Inverting brain grey matter models with likelihood-free inference: a
tool for trustable cytoarchitecture measurements [62.997667081978825]
characterisation of the brain grey matter cytoarchitecture with quantitative sensitivity to soma density and volume remains an unsolved challenge in dMRI.
We propose a new forward model, specifically a new system of equations, requiring a few relatively sparse b-shells.
We then apply modern tools from Bayesian analysis known as likelihood-free inference (LFI) to invert our proposed model.
arXiv Detail & Related papers (2021-11-15T09:08:27Z) - Generative Modeling with Denoising Auto-Encoders and Langevin Sampling [88.83704353627554]
We show that both DAE and DSM provide estimates of the score of the smoothed population density.
We then apply our results to the homotopy method of arXiv:1907.05600 and provide theoretical justification for its empirical success.
arXiv Detail & Related papers (2020-01-31T23:50:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.