On Independent Samples Along the Langevin Diffusion and the Unadjusted
Langevin Algorithm
- URL: http://arxiv.org/abs/2402.17067v1
- Date: Mon, 26 Feb 2024 23:05:02 GMT
- Title: On Independent Samples Along the Langevin Diffusion and the Unadjusted
Langevin Algorithm
- Authors: Jiaming Liang, Siddharth Mitra, Andre Wibisono
- Abstract summary: We study the rate at which the initial and current random variables become independent along a Markov chain.
We focus on the Langevin diffusion in continuous time and the Unadjusted Langevin Algorithm (ULA) in discrete time.
- Score: 18.595570786973948
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the rate at which the initial and current random variables become
independent along a Markov chain, focusing on the Langevin diffusion in
continuous time and the Unadjusted Langevin Algorithm (ULA) in discrete time.
We measure the dependence between random variables via their mutual
information. For the Langevin diffusion, we show the mutual information
converges to $0$ exponentially fast when the target is strongly log-concave,
and at a polynomial rate when the target is weakly log-concave. These rates are
analogous to the mixing time of the Langevin diffusion under similar
assumptions. For the ULA, we show the mutual information converges to $0$
exponentially fast when the target is strongly log-concave and smooth. We prove
our results by developing the mutual information version of the mixing time
analyses of these Markov chains. We also provide alternative proofs based on
strong data processing inequalities for the Langevin diffusion and the ULA, and
by showing regularity results for these processes in mutual information.
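The ULA iteration studied in the abstract can be sketched in a few lines; this is a minimal illustration, where the step size, iteration count, and standard-Gaussian target are illustrative choices, not values from the paper:

```python
import numpy as np

def ula(grad_f, x0, step, n_iters, rng):
    """Unadjusted Langevin Algorithm:
    x_{k+1} = x_k - h * grad f(x_k) + sqrt(2h) * xi_k,  xi_k ~ N(0, I)."""
    x = np.array(x0, dtype=float)
    for _ in range(n_iters):
        xi = rng.standard_normal(x.shape)
        x = x - step * grad_f(x) + np.sqrt(2.0 * step) * xi
    return x

# Example target: pi(x) ∝ exp(-||x||^2 / 2), i.e. f(x) = ||x||^2 / 2 and
# grad f(x) = x, which is strongly log-concave and smooth.
rng = np.random.default_rng(0)
samples = np.array([ula(lambda x: x, np.zeros(2), 0.1, 500, rng)
                    for _ in range(2000)])
print(samples.mean(axis=0))  # close to [0, 0]
```

Running many independent chains from the same deterministic start, as above, is one way to see the initial and current variables decouple: the empirical law of the final iterates approaches the (biased, step-size-dependent) stationary law of ULA regardless of `x0`.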
Related papers
- Efficiently learning and sampling multimodal distributions with data-based initialization [20.575122468674536]
We consider the problem of sampling a multimodal distribution with a Markov chain given a small number of samples from the stationary measure.
We show that if the Markov chain has a $k$th order spectral gap, initialization from samples drawn from the stationary distribution efficiently generates a sample whose conditional law is $\varepsilon$-close in TV distance to the stationary measure.
arXiv Detail & Related papers (2024-11-14T01:37:02Z) - Fast Convergence of $Φ$-Divergence Along the Unadjusted Langevin Algorithm and Proximal Sampler [14.34147140416535]
We study the mixing time of two popular discrete-time Markov chains in continuous space.
We show that any $\Phi$-divergence arising from a twice-differentiable strictly convex function $\Phi$ converges to $0$ exponentially fast along these Markov chains.
arXiv Detail & Related papers (2024-10-14T16:41:45Z) - Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis [56.442307356162864]
We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework.
We introduce a discrete-time sampling algorithm in the general state space $[S]^d$ that utilizes score estimators at predefined time points.
Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
arXiv Detail & Related papers (2024-10-03T09:07:13Z) - An Improved Analysis of Langevin Algorithms with Prior Diffusion for
Non-Log-Concave Sampling [27.882407333690267]
We show that the modified Langevin algorithm with prior diffusion can converge dimension-independently for strongly log-concave target distributions.
We also prove that the modified Langevin algorithm obtains dimension-independent convergence in KL divergence under different step size schedules.
arXiv Detail & Related papers (2024-03-10T11:50:34Z) - Time Series Diffusion in the Frequency Domain [54.60573052311487]
We analyze whether representing time series in the frequency domain is a useful inductive bias for score-based diffusion models.
We show that a dual diffusion process occurs in the frequency domain with an important nuance.
We show how to adapt the denoising score matching approach to implement diffusion models in the frequency domain.
arXiv Detail & Related papers (2024-02-08T18:59:05Z) - Symmetric Mean-field Langevin Dynamics for Distributional Minimax
Problems [78.96969465641024]
We extend mean-field Langevin dynamics to minimax optimization over probability distributions for the first time with symmetric and provably convergent updates.
We also study time and particle discretization regimes and prove a new uniform-in-time propagation of chaos result.
arXiv Detail & Related papers (2023-12-02T13:01:29Z) - Ito Diffusion Approximation of Universal Ito Chains for Sampling, Optimization and Boosting [64.0722630873758]
We consider a rather general and broad class of Markov chains, Itô chains, that look like the Euler--Maruyama discretization of some stochastic differential equation.
We prove a bound in $W_2$-distance between the laws of our Itô chain and the corresponding differential equation.
arXiv Detail & Related papers (2023-10-09T18:38:56Z) - Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution.
We propose the Constant Rate AIS algorithm and its efficient implementation for $\alpha$-divergences.
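Plain AIS (not the Constant Rate variant that paper proposes) can be sketched as follows; the geometric annealing path, Gaussian endpoint densities, and random-walk Metropolis kernel here are illustrative assumptions:

```python
import numpy as np

def ais_log_weights(log_p0, log_p1, sample_p0, betas, n_particles, rng):
    """Annealed Importance Sampling along the geometric path
    log pi_b = (1 - b) * log_p0 + b * log_p1.
    Returns log importance weights whose exponential mean estimates Z_1 / Z_0."""
    x = sample_p0(n_particles, rng)
    log_w = np.zeros(n_particles)
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # incremental weight: ratio of consecutive annealed densities at x
        log_w += (b - b_prev) * (log_p1(x) - log_p0(x))
        # one Metropolis random-walk move targeting pi_b
        prop = x + 0.5 * rng.standard_normal(x.shape)
        log_t = lambda y: (1 - b) * log_p0(y) + b * log_p1(y)
        accept = np.log(rng.uniform(size=n_particles)) < log_t(prop) - log_t(x)
        x = np.where(accept, prop, x)
    return log_w

# Example: p0 = N(0, 1), unnormalized p1 ∝ exp(-(x-1)^2 / 2), so Z_1 / Z_0 = 1.
rng = np.random.default_rng(1)
log_w = ais_log_weights(
    log_p0=lambda x: -0.5 * x**2,
    log_p1=lambda x: -0.5 * (x - 1.0)**2,
    sample_p0=lambda n, r: r.standard_normal(n),
    betas=np.linspace(0.0, 1.0, 50),
    n_particles=4000, rng=rng)
log_Z = np.log(np.mean(np.exp(log_w)))
print(log_Z)  # close to 0
```

The Constant Rate idea, as the summary describes it, replaces the fixed `betas` schedule above with one chosen adaptively so the annealing progresses at a constant rate under the chosen divergence.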
arXiv Detail & Related papers (2023-06-27T08:15:28Z) - Subsampling Error in Stochastic Gradient Langevin Diffusions [0.7900303955225063]
Stochastic Gradient Langevin Dynamics (SGLD) is popularly used to approximate Bayesian posterior distributions in statistical learning procedures with large-scale data, and it introduces two sources of error.
The first error is introduced by the Euler--Maruyama discretisation of a Langevin diffusion process; the second error comes from the data subsampling that enables its use in large-scale data settings.
We show the exponential ergodicity of SGLDiff and that the Wasserstein distance between the posterior and the limiting distribution of SGLDiff is bounded above by a fractional power of the mean waiting time.
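A minimal SGLD sketch, replacing the full-data gradient in a Langevin step with an unbiased minibatch estimate; the conjugate Gaussian-mean model, step size, and batch size below are illustrative assumptions, not from that paper:

```python
import numpy as np

def sgld(grad_log_prior, grad_log_lik, data, theta0, step, batch, n_iters, rng):
    """Stochastic Gradient Langevin Dynamics: ULA with the full-data gradient
    replaced by a reweighted minibatch estimate, plus sqrt(2h) Gaussian noise."""
    n = len(data)
    theta = float(theta0)
    trace = []
    for _ in range(n_iters):
        idx = rng.integers(0, n, size=batch)
        g = grad_log_prior(theta) + (n / batch) * np.sum(grad_log_lik(theta, data[idx]))
        theta += step * g + np.sqrt(2.0 * step) * rng.standard_normal()
        trace.append(theta)
    return np.array(trace)

# Example: data ~ N(mu, 1) with prior mu ~ N(0, 1);
# the posterior mean is sum(data) / (n + 1).
rng = np.random.default_rng(2)
data = rng.normal(1.0, 1.0, size=200)
trace = sgld(lambda t: -t, lambda t, x: x - t, data, 0.0,
             step=1e-3, batch=20, n_iters=5000, rng=rng)
print(trace[1000:].mean())  # close to the posterior mean
```

The two error sources named in the summary are both visible here: the discretisation error from the Euler step, and the extra gradient noise from using `batch` of the `n` data points per iteration.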
arXiv Detail & Related papers (2023-05-23T10:03:40Z) - Enabling Quantum Speedup of Markov Chains using a Multi-level Approach [0.0]
Quantum speedup for mixing a Markov chain can be achieved based on the construction of $r$ slowly-varying Markov chains.
We show that the density function of a low-resolution Markov chain can be used to warm start the Markov chain with high resolution.
arXiv Detail & Related papers (2022-10-25T15:17:52Z) - Resolving the Mixing Time of the Langevin Algorithm to its Stationary
Distribution for Log-Concave Sampling [34.66940399825547]
This paper characterizes the mixing time of the Langevin Algorithm to its stationary distribution.
We introduce a technique from the differential privacy literature to the sampling literature.
arXiv Detail & Related papers (2022-10-16T05:11:16Z) - Score-Based Diffusion meets Annealed Importance Sampling [89.92133671626327]
Annealed Importance Sampling remains one of the most effective methods for marginal likelihood estimation.
We leverage recent progress in score-based generative modeling to approximate the optimal extended target distribution for AIS proposals.
arXiv Detail & Related papers (2022-08-16T12:13:29Z) - A rapidly mixing Markov chain from any gapped quantum many-body system [2.321323878201932]
We consider a bit string $x$ sampled from the distribution $\pi(x)=|\langle x|\psi\rangle|^2$, where $\psi$ is the unique ground state of a local Hamiltonian $H$.
Our main result describes a direct link between the inverse spectral gap of $H$ and the mixing time of an associated continuous-time Markov Chain with steady state $pi$.
arXiv Detail & Related papers (2022-07-14T16:38:42Z) - Diffusion-GAN: Training GANs with Diffusion [135.24433011977874]
Generative adversarial networks (GANs) are challenging to train stably.
We propose Diffusion-GAN, a novel GAN framework that leverages a forward diffusion chain to generate instance noise.
We show that Diffusion-GAN can produce more realistic images with higher stability and data efficiency than state-of-the-art GANs.
arXiv Detail & Related papers (2022-06-05T20:45:01Z) - Time-inhomogeneous diffusion geometry and topology [69.55228523791897]
Diffusion condensation is a time-inhomogeneous process where each step first computes and then applies a diffusion operator to the data.
We theoretically analyze the convergence and evolution of this process from geometric, spectral, and topological perspectives.
Our work gives theoretical insights into the convergence of diffusion condensation, and shows that it provides a link between topological and geometric data analysis.
arXiv Detail & Related papers (2022-03-28T16:06:17Z) - Spread of Correlations in Strongly Disordered Lattice Systems with
Long-Range Coupling [0.0]
We investigate the spread of correlations carried by an excitation in a 1-dimensional lattice system with high on-site energy disorder.
The increase in correlation between the initially quenched node and a given node exhibits three phases: quadratic in time, linear in time, and saturation.
arXiv Detail & Related papers (2021-06-15T15:47:20Z) - A Matrix Chernoff Bound for Markov Chains and Its Application to
Co-occurrence Matrices [30.49132891333353]
We prove a Chernoff-type bound for sums of matrix-valued random variables sampled via a regular (aperiodic and irreducible) finite Markov chain.
Our proof is based on the matrix expander (regular undirected graph) Chernoff bound [Garg et al. STOC '18] and scalar Chernoff-Hoeffding bounds for Markov chains.
Our matrix Chernoff bound for Markov chains can be applied to analyze the behavior of co-occurrence statistics for sequential data.
arXiv Detail & Related papers (2020-08-06T05:44:54Z) - Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and
Variance Reduction [63.41789556777387]
Asynchronous Q-learning aims to learn the optimal action-value function (or Q-function) of a Markov decision process (MDP).
We show that the number of samples needed to yield an entrywise $\varepsilon$-accurate estimate of the Q-function is at most on the order of $\frac{1}{\mu_{\min}(1-\gamma)^5\varepsilon^2}+\frac{t_{\mathrm{mix}}}{\mu_{\min}(1-\gamma)}$ up to some logarithmic factor.
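Asynchronous Q-learning updates one (state, action) entry at a time along a single behavior trajectory, rather than sweeping the whole table. A sketch on a hypothetical two-state MDP (the transition matrix, rewards, and constant learning rate below are illustrative, not from that paper):

```python
import numpy as np

def async_q_learning(P, R, gamma, lr, n_steps, rng):
    """Asynchronous Q-learning: follow one trajectory with uniformly random
    actions, updating only the visited (s, a) entry at each step."""
    n_s, n_a = R.shape
    Q = np.zeros((n_s, n_a))
    s = 0
    for _ in range(n_steps):
        a = rng.integers(n_a)
        s_next = rng.choice(n_s, p=P[s, a])
        target = R[s, a] + gamma * Q[s_next].max()
        Q[s, a] += lr * (target - Q[s, a])
        s = s_next
    return Q

# Toy 2-state, 2-action MDP (hypothetical, for illustration only).
rng = np.random.default_rng(3)
P = np.array([[[0.9, 0.1], [0.1, 0.9]],
              [[0.8, 0.2], [0.3, 0.7]]])  # P[s, a, s']
R = np.array([[1.0, 0.0],
              [0.0, 0.5]])                # R[s, a]
Q = async_q_learning(P, R, gamma=0.9, lr=0.1, n_steps=20000, rng=rng)
print(Q)
```

The quantities in the bound map directly onto this sketch: $\mu_{\min}$ is the minimum stationary visitation probability of a (state, action) pair under the behavior trajectory, and $t_{\mathrm{mix}}$ is the mixing time of that trajectory's Markov chain.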
arXiv Detail & Related papers (2020-06-04T17:51:00Z) - Exponential ergodicity of mirror-Langevin diffusions [16.012656579770827]
We propose a class of diffusions called Newton-Langevin diffusions and prove that they converge to stationarity exponentially fast.
We give an application to the problem of sampling from the uniform distribution on a convex body using a strategy inspired by interior-point methods.
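A one-dimensional Euler discretization of a Newton-Langevin diffusion illustrates why preconditioning by the Hessian helps: the paper analyzes the continuous-time diffusion, and the step size and badly conditioned Gaussian target below are illustrative assumptions:

```python
import numpy as np

def newton_langevin_step(x, grad, hess, step, rng):
    """One Euler step of the 1-D Newton-Langevin diffusion
    dX_t = -(f''(X_t))^{-1} f'(X_t) dt + sqrt(2 / f''(X_t)) dB_t."""
    h_inv = 1.0 / hess(x)
    return x - step * h_inv * grad(x) + np.sqrt(2.0 * step * h_inv) * rng.standard_normal()

# Badly conditioned Gaussian target: f(x) = x^2 / (2 * 100), i.e. pi = N(0, 100).
rng = np.random.default_rng(4)
sigma2 = 100.0
x, xs = 0.0, []
for _ in range(20000):
    x = newton_langevin_step(x, lambda y: y / sigma2, lambda y: 1.0 / sigma2, 0.1, rng)
    xs.append(x)
print(np.var(xs))  # near sigma2
```

With the Hessian preconditioning, the effective contraction rate of the chain is independent of `sigma2`, which is a discrete-time shadow of the condition-number-free exponential convergence the summary describes.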
arXiv Detail & Related papers (2020-05-19T18:00:52Z) - Targeted stochastic gradient Markov chain Monte Carlo for hidden Markov models with rare latent states [48.705095800341944]
Markov chain Monte Carlo (MCMC) algorithms for hidden Markov models often rely on the forward-backward sampler.
This makes them computationally slow as the length of the time series increases, motivating the development of sub-sampling-based approaches.
We propose a targeted sub-sampling approach that over-samples observations corresponding to rare latent states when calculating the gradient of parameters associated with them.
arXiv Detail & Related papers (2018-10-31T17:44:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.