Discrete Langevin Sampler via Wasserstein Gradient Flow
- URL: http://arxiv.org/abs/2206.14897v1
- Date: Wed, 29 Jun 2022 20:33:54 GMT
- Title: Discrete Langevin Sampler via Wasserstein Gradient Flow
- Authors: Haoran Sun, Hanjun Dai, Bo Dai, Haomin Zhou, Dale Schuurmans
- Abstract summary: We show how LB functions give rise to LB dynamics corresponding to Wasserstein gradient flow in a discrete space.
We propose a new algorithm, the Locally Balanced Jump (LBJ), by discretizing the LB dynamics with respect to simulation time.
- Score: 102.94731267405056
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, a family of locally balanced (LB) samplers has demonstrated
excellent performance at sampling and learning energy-based models (EBMs) in
discrete spaces. However, the theoretical understanding of this success is
limited. In this work, we show how LB functions give rise to LB dynamics
corresponding to Wasserstein gradient flow in a discrete space. From first
principles, previous LB samplers can then be seen as discretizations of the LB
dynamics with respect to Hamming distance. Based on this observation, we
propose a new algorithm, the Locally Balanced Jump (LBJ), by discretizing the
LB dynamics with respect to simulation time. As a result, LBJ has a
location-dependent "velocity" that allows it to make proposals with larger
distances. Additionally, LBJ decouples each dimension into independent
sub-processes, enabling convenient parallel implementation. We demonstrate the
advantages of LBJ for sampling and learning in various binary and categorical
distributions.
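To make the construction concrete, here is a minimal Python sketch of a locally balanced jump-style proposal for a binary energy-based model. It assumes the common locally balanced function g(t) = sqrt(t), evaluates single-flip energies by brute force, and omits the Metropolis-Hastings correction needed for exactness; it illustrates the per-dimension exponential clocks and the simulation-time window, not the authors' reference implementation of LBJ.

```python
import numpy as np

def lb_jump_proposal(x, energy, tau=0.5, g=np.sqrt, rng=None):
    """Propose a new binary state by simulating independent per-dimension clocks.

    Each coordinate i gets an exponential clock with locally balanced rate
    g(pi(flip_i(x)) / pi(x)); every coordinate whose clock rings within the
    time window tau is flipped, so proposals can move Hamming distance > 1.
    Illustrative sketch only: the M-H correction for exactness is omitted.
    """
    if rng is None:
        rng = np.random.default_rng()
    d = x.shape[0]
    e_x = energy(x)
    # Energies of all single-coordinate flips (brute force for clarity).
    e_flip = np.array(
        [energy(np.where(np.arange(d) == i, 1 - x, x)) for i in range(d)]
    )
    rates = g(np.exp(e_x - e_flip))        # pi(x) proportional to exp(-E(x))
    clocks = rng.exponential(1.0 / rates)  # independent sub-process per dimension
    flips = clocks < tau                   # location-dependent "velocity"
    y = x.copy()
    y[flips] = 1 - y[flips]
    return y

# Toy quadratic (Ising-like) energy on {0,1}^d.
rng = np.random.default_rng(0)
d = 16
W = rng.normal(scale=0.1, size=(d, d))
W = (W + W.T) / 2
energy = lambda x: -0.5 * (2 * x - 1) @ W @ (2 * x - 1)

x = rng.integers(0, 2, size=d)
for _ in range(200):
    x = lb_jump_proposal(x, energy, tau=0.5, rng=rng)
```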
Related papers
- Scalable DP-SGD: Shuffling vs. Poisson Subsampling [61.19794019914523]
We provide new lower bounds on the privacy guarantee of the multi-epoch Adaptive Batch Linear Queries (ABLQ) mechanism with shuffled batch sampling.
We show substantial gaps when compared to Poisson subsampling; prior analysis was limited to a single epoch.
We introduce a practical approach to implementing Poisson subsampling at scale using massively parallel computation; the basic sampling rule is sketched after this entry.
arXiv Detail & Related papers (2024-11-06T19:06:16Z)
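The Poisson subsampling discussed in the entry above follows a very simple sampling rule, sketched below in NumPy. This is only the independent-inclusion rule under which standard DP-SGD accounting is analyzed, not the paper's massively parallel implementation.

```python
import numpy as np

def poisson_subsample(n, q, rng=None):
    """Return the indices of one Poisson-subsampled batch from n records.

    Each record is included independently with probability q, so the batch
    size is Binomial(n, q) rather than fixed. Basic rule only; the scalable,
    massively parallel construction is described in the paper.
    """
    if rng is None:
        rng = np.random.default_rng()
    return np.flatnonzero(rng.random(n) < q)

batch = poisson_subsample(n=1_000_000, q=1e-3)  # expected batch size ~1000
```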
- Density estimation with LLMs: a geometric investigation of in-context learning trajectories [3.281128493853064]
Large language models (LLMs) demonstrate remarkable emergent abilities to perform in-context learning across various tasks.
This work investigates LLMs' ability to estimate probability density functions from data observed in-context.
We leverage Intensive Principal Component Analysis (InPCA) to visualize and analyze the in-context learning dynamics of LLaMA-2 models.
arXiv Detail & Related papers (2024-10-07T17:22:56Z)
- Hyperbolic Fine-tuning for Large Language Models [56.54715487997674]
This study investigates the non-Euclidean characteristics of large language models (LLMs).
We show that token embeddings exhibit a high degree of hyperbolicity, indicating a latent tree-like structure in the embedding space.
We introduce a new method called hyperbolic low-rank efficient fine-tuning, HypLoRA, that performs low-rank adaptation directly on the hyperbolic manifold.
arXiv Detail & Related papers (2024-10-05T02:58:25Z)
- Understanding Reinforcement Learning-Based Fine-Tuning of Diffusion Models: A Tutorial and Review [63.31328039424469]
This tutorial provides a comprehensive survey of methods for fine-tuning diffusion models to optimize downstream reward functions.
We explain the application of various RL algorithms, including PPO, differentiable optimization, reward-weighted MLE, value-weighted sampling, and path consistency learning; a generic reward-weighted MLE sketch follows this entry.
arXiv Detail & Related papers (2024-07-18T17:35:32Z)
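Of the methods listed in the tutorial entry above, reward-weighted MLE is the easiest to convey in a few lines. The sketch below is a generic illustration of the idea, re-weighting the model's log-likelihood on its own samples by a softmax of their rewards; for diffusion models the log-likelihood would usually be an ELBO term, and the tutorial's exact formulations may differ. The `model.sample_with_log_prob` and `reward_fn` names in the usage comment are hypothetical.

```python
import torch

def reward_weighted_mle_loss(log_probs: torch.Tensor,
                             rewards: torch.Tensor,
                             beta: float = 1.0) -> torch.Tensor:
    """Generic reward-weighted MLE objective on a batch of model samples.

    log_probs: per-sample log-likelihood under the model being fine-tuned
               (for a diffusion model this would typically be an ELBO term).
    rewards:   downstream reward for each sample.
    Samples are re-weighted by softmax(beta * reward), detached so gradients
    flow only through the likelihood; minimizing the loss tilts the model
    toward high-reward outputs.
    """
    weights = torch.softmax(beta * rewards, dim=0).detach()
    return -(weights * log_probs).sum()

# Usage sketch (hypothetical model / reward_fn):
# samples, log_probs = model.sample_with_log_prob(batch_size=64)
# loss = reward_weighted_mle_loss(log_probs, reward_fn(samples))
# loss.backward(); optimizer.step()
```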
- Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems [78.96969465641024]
We extend mean-field Langevin dynamics to minimax optimization over probability distributions, providing symmetric and provably convergent updates for the first time.
We also study time and particle discretization regimes and prove a new uniform-in-time propagation of chaos result; a toy particle discretization is sketched after this entry.
arXiv Detail & Related papers (2023-12-02T13:01:29Z)
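A particle view of the symmetric updates in the entry above can be sketched in a few lines. The code below is a toy discretization assumed for illustration: each player's distribution is represented by particles, the min player takes a noisy gradient-descent step and the max player a noisy gradient-ascent step against the opponent's empirical distribution. The paper's exact scheme, regularization, and convergence guarantees are not reproduced here.

```python
import numpy as np

def symmetric_mfl_step(X, Y, grad_x, grad_y, eta=0.01, lam=0.1, rng=None):
    """One toy symmetric particle update for a distributional minimax problem.

    X, Y: (n, d) particle arrays approximating the min and max players.
    grad_x(X, Y): gradient in x of the payoff averaged over the opponent's
    particles (grad_y analogously). Descent plus Langevin noise for the min
    player, ascent plus Langevin noise for the max player, applied together.
    """
    if rng is None:
        rng = np.random.default_rng()
    noise = np.sqrt(2.0 * eta * lam)
    X_new = X - eta * grad_x(X, Y) + noise * rng.normal(size=X.shape)
    Y_new = Y + eta * grad_y(Y, X) + noise * rng.normal(size=Y.shape)
    return X_new, Y_new

# Toy bilinear payoff f(x, y) = <x, y>: each player reacts to the opponent's mean.
grad_x = lambda X, Y: np.broadcast_to(Y.mean(axis=0), X.shape)
grad_y = lambda Y, X: np.broadcast_to(X.mean(axis=0), Y.shape)

rng = np.random.default_rng(0)
X, Y = rng.normal(size=(256, 2)), rng.normal(size=(256, 2))
for _ in range(2000):
    X, Y = symmetric_mfl_step(X, Y, grad_x, grad_y, rng=rng)
```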
- The Adjoint Is All You Need: Characterizing Barren Plateaus in Quantum Ansätze [3.2773906224402802]
We formulate a theory of barren plateaus for parameterized quantum circuits whose observables lie in their dynamical Lie algebra (DLA).
Our theory provides, for the first time, the ability to compute the variance of the gradient of the cost function of the quantum compound ansatz.
arXiv Detail & Related papers (2023-09-14T17:50:04Z)
- Wigner's Phase Space Current for Variable Beam Splitters -Seeing Beam Splitters in a New Light- [0.0]
We study the behaviour of variable beam splitters and their dynamics using Wigner's phase space distribution, W.
We derive the form of the corresponding Wigner current, J, of each outgoing mode after tracing out the other.
The influence of the modes on each other is analyzed and visualized using their respective Wigner distributions and Wigner currents.
arXiv Detail & Related papers (2023-08-13T07:50:32Z)
- Optimal Scaling for Locally Balanced Proposals in Discrete Spaces [65.14092237705476]
We show that the efficiency of Metropolis-Hastings (M-H) algorithms in discrete spaces can be characterized by an acceptance rate that is independent of the target distribution.
Knowledge of the optimal acceptance rate allows one to automatically tune the neighborhood size of a proposal distribution in a discrete space, directly analogous to step-size control in continuous spaces; a simple adaptation rule is sketched after this entry.
arXiv Detail & Related papers (2022-09-16T22:09:53Z)
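The acceptance-rate analogy in the entry above suggests the same kind of stochastic-approximation tuning used for step sizes in continuous-space MCMC. Below is a minimal sketch assuming a target acceptance rate taken from the paper's analysis (the 0.574 default is only an assumed placeholder echoing the MALA analogy) and a hypothetical `mh_step` sampler that reports whether its proposal was accepted.

```python
import numpy as np

def adapt_log_radius(log_radius, accepted, target=0.574, kappa=0.05):
    """Robbins-Monro style update of the log neighborhood size toward a
    target acceptance rate, mirroring step-size control in continuous spaces.

    `target` should be the optimal rate from the paper's analysis (0.574 is
    an assumed placeholder). Keep the float state and round only when
    constructing the proposal, so small updates can accumulate.
    """
    return log_radius + kappa * (accepted - target)

# Usage inside a sampling loop (mh_step is a hypothetical sampler step):
# log_radius = np.log(5.0)
# for _ in range(n_steps):
#     radius = max(1, int(round(np.exp(log_radius))))
#     x, accepted = mh_step(x, radius)
#     log_radius = adapt_log_radius(log_radius, accepted)
```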
- The Boomerang Sampler [4.588028371034406]
This paper introduces the Boomerang Sampler as a novel class of continuous-time non-reversible Markov chain Monte Carlo algorithms.
We show that the method is easy to implement and demonstrate empirically that it can outperform existing benchmark piecewise deterministic Markov processes.
arXiv Detail & Related papers (2020-06-24T14:52:22Z)