Jump-Diffusion Langevin Dynamics for Multimodal Posterior Sampling
- URL: http://arxiv.org/abs/2211.01774v1
- Date: Wed, 2 Nov 2022 17:35:04 GMT
- Title: Jump-Diffusion Langevin Dynamics for Multimodal Posterior Sampling
- Authors: Jacopo Guidolin, Vyacheslav Kungurtsev, Ondřej Kuželka
- Abstract summary: We investigate the performance of a hybrid Metropolis and Langevin sampling method akin to Jump Diffusion on a range of synthetic and real data.
We find that careful calibration of mixing sampling jumps with gradient-based chains significantly outperforms both pure gradient-based and pure sampling-based schemes.
- Score: 3.4483987421251516
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bayesian methods of sampling from a posterior distribution are becoming
increasingly popular due to their ability to precisely display the uncertainty
of a model fit. Classical methods based on iterative random sampling and
posterior evaluation, such as Metropolis-Hastings, are known to have desirable
long-run mixing properties; however, they are slow to converge. Gradient-based
methods, such as Langevin Dynamics (and its stochastic-gradient counterpart),
exhibit favorable dimension dependence and fast mixing times for log-concave
and "close to" log-concave distributions; however, they also have long escape
times from local minimizers. Many contemporary applications such as Bayesian Neural
Networks are both high-dimensional and highly multimodal. In this paper we
investigate the performance of a hybrid Metropolis and Langevin sampling method
akin to Jump Diffusion on a range of synthetic and real data, indicating that
careful calibration of mixing sampling jumps with gradient-based chains
significantly outperforms both pure gradient-based and pure sampling-based schemes.
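As a rough illustration of the kind of hybrid scheme studied here, the sketch below interleaves unadjusted Langevin steps with occasional Metropolis-accepted "jump" proposals on a toy bimodal target, so the chain can hop between modes that a pure gradient-based chain would rarely escape. This is a minimal sketch, not the authors' implementation; the target density, step size, jump rate, and proposal scale are all illustrative assumptions.
```python
# Minimal sketch (not the paper's code) of a jump-diffusion-style sampler:
# unadjusted Langevin steps interleaved with occasional Metropolis jumps.
import numpy as np

rng = np.random.default_rng(0)

def log_density(x):
    """Illustrative bimodal target: mixture of two well-separated Gaussians."""
    return np.logaddexp(-0.5 * (x - 4.0) ** 2, -0.5 * (x + 4.0) ** 2)

def grad_log_density(x, eps=1e-5):
    """Finite-difference gradient of the log density."""
    return (log_density(x + eps) - log_density(x - eps)) / (2 * eps)

def jump_diffusion_langevin(n_steps=20000, step_size=0.05, jump_prob=0.05,
                            jump_scale=6.0, x0=4.0):
    x = x0
    samples = np.empty(n_steps)
    for t in range(n_steps):
        if rng.random() < jump_prob:
            # Metropolis "jump": propose a large move from a broad symmetric
            # Gaussian and accept with the usual Metropolis-Hastings ratio
            # (the symmetric proposal cancels in the ratio).
            x_prop = x + jump_scale * rng.standard_normal()
            log_alpha = log_density(x_prop) - log_density(x)
            if np.log(rng.random()) < log_alpha:
                x = x_prop
        else:
            # Unadjusted Langevin step: gradient drift plus Gaussian noise.
            noise = np.sqrt(2 * step_size) * rng.standard_normal()
            x = x + step_size * grad_log_density(x) + noise
        samples[t] = x
    return samples

if __name__ == "__main__":
    samples = jump_diffusion_langevin()
    # With jumps the chain visits both modes; with jump_prob=0 it typically
    # stays near the mode it started in.
    print("fraction of samples in each mode:",
          np.mean(samples > 0), np.mean(samples < 0))
```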
Related papers
- The Poisson Midpoint Method for Langevin Dynamics: Provably Efficient Discretization for Diffusion Models [9.392691963008385]
Langevin Monte Carlo (LMC) is the simplest and most studied discretization of Langevin dynamics.
We propose the Poisson Midpoint Method, which approximates a small step-size LMC with large step-sizes.
We show that it matches the quality of DDPM run with 1000 neural network calls using just 50-80 calls, and outperforms ODE-based methods at similar compute.
arXiv Detail & Related papers (2024-05-27T11:40:42Z)
- Differentiable and Stable Long-Range Tracking of Multiple Posterior Modes [1.534667887016089]
We leverage training data to discriminatively learn particle-based representations of uncertainty in latent object states.
Our approach achieves dramatic improvements in accuracy, while also showing much greater stability across multiple training runs.
arXiv Detail & Related papers (2024-04-12T19:33:52Z)
- Diffusive Gibbs Sampling [40.1197715949575]
We propose Diffusive Gibbs Sampling (DiGS) for effective sampling from distributions characterized by distant and disconnected modes.
DiGS integrates recent developments in diffusion models, leveraging Gaussian convolution to create an auxiliary noisy distribution.
A novel Metropolis-within-Gibbs scheme is proposed to enhance mixing in the denoising sampling step.
arXiv Detail & Related papers (2024-02-05T13:47:41Z)
- Stable generative modeling using Schrödinger bridges [0.22499166814992438]
We propose a generative model combining Schrödinger bridges and Langevin dynamics.
Our framework can be naturally extended to generate conditional samples and to Bayesian inference problems.
arXiv Detail & Related papers (2024-01-09T06:15:45Z)
- Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems [78.96969465641024]
We extend mean-field Langevin dynamics to minimax optimization over probability distributions for the first time with symmetric and provably convergent updates.
We also study time and particle discretization regimes and prove a new uniform-in-time propagation of chaos result.
arXiv Detail & Related papers (2023-12-02T13:01:29Z)
- Multi-scale Diffusion Denoised Smoothing [79.95360025953931]
Randomized smoothing has become one of a few tangible approaches that offer adversarial robustness to models at scale.
We present scalable methods to address the current trade-off between certified robustness and accuracy in denoised smoothing.
Our experiments show that the proposed multi-scale smoothing scheme, combined with diffusion fine-tuning, enables strong certified robustness at high noise levels.
arXiv Detail & Related papers (2023-10-25T17:11:21Z)
- DiffuSeq-v2: Bridging Discrete and Continuous Text Spaces for Accelerated Seq2Seq Diffusion Models [58.450152413700586]
We introduce a soft absorbing state that facilitates the diffusion model in learning to reconstruct discrete mutations based on the underlying Gaussian space.
We employ state-of-the-art ODE solvers within the continuous space to expedite the sampling process.
Our proposed method effectively accelerates the training convergence by 4x and generates samples of similar quality 800x faster.
arXiv Detail & Related papers (2023-10-09T15:29:10Z)
- A Geometric Perspective on Diffusion Models [57.27857591493788]
We inspect the ODE-based sampling of a popular variance-exploding SDE.
We establish a theoretical relationship between the optimal ODE-based sampling and the classic mean-shift (mode-seeking) algorithm.
arXiv Detail & Related papers (2023-05-31T15:33:16Z)
- Resolving the Mixing Time of the Langevin Algorithm to its Stationary Distribution for Log-Concave Sampling [34.66940399825547]
This paper characterizes the mixing time of the Langevin Algorithm to its stationary distribution.
We introduce a technique from the differential privacy literature to the sampling literature.
arXiv Detail & Related papers (2022-10-16T05:11:16Z)
- Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks [78.76880041670904]
In neural networks with binary activations and/or binary weights, training by gradient descent is complicated.
We propose a new method for this estimation problem combining sampling and analytic approximation steps.
We experimentally show higher accuracy in gradient estimation and demonstrate a more stable and better performing training in deep convolutional models.
arXiv Detail & Related papers (2020-06-04T21:51:21Z)
- Efficiently Sampling Functions from Gaussian Process Posteriors [76.94808614373609]
We propose an easy-to-use and general-purpose approach for fast posterior sampling.
We demonstrate how decoupled sample paths accurately represent Gaussian process posteriors at a fraction of the usual cost.
arXiv Detail & Related papers (2020-02-21T14:03:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.