Related papers: Discrete distributions are learnable from metastable samples

Discrete distributions are learnable from metastable samples

URL: http://arxiv.org/abs/2410.13800v2
Date: Tue, 10 Dec 2024 04:22:38 GMT
Title: Discrete distributions are learnable from metastable samples
Authors: Abhijith Jayakumar, Andrey Y. Lokhov, Sidhant Misra, Marc Vuffray,
Abstract summary: We rigorously show that the true model describing the stationary distribution can be recovered from samples produced from a metastable distribution.<n>This follows from a fundamental observation that the single-variable conditionals of metastable distributions that satisfy a strong metastability condition are on average close to those of the stationary distribution.<n>Explicit examples of such metastable states can be constructed from regions that effectively bottleneck the probability flow and cause poor mixing of the Markov chain.
Score: 8.924669503280333
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Physically motivated stochastic dynamics are often used to sample from high-dimensional distributions. However such dynamics often get stuck in specific regions of their state space and mix very slowly to the desired stationary state. This causes such systems to approximately sample from a metastable distribution which is usually quite different from the desired, stationary distribution of the dynamic. We rigorously show that, in the case of multi-variable discrete distributions, the true model describing the stationary distribution can be recovered from samples produced from a metastable distribution under minimal assumptions about the system. This follows from a fundamental observation that the single-variable conditionals of metastable distributions that satisfy a strong metastability condition are on average close to those of the stationary distribution. This holds even when the metastable distribution differs considerably from the true model in terms of global metrics like Kullback-Leibler divergence or total variation distance. This property allows us to learn the true model using a conditional likelihood based estimator, even when the samples come from a metastable distribution concentrated in a small region of the state space. Explicit examples of such metastable states can be constructed from regions that effectively bottleneck the probability flow and cause poor mixing of the Markov chain. For specific cases of binary pairwise undirected graphical models (i.e. Ising models), we extend our results to further rigorously show that data coming from metastable states can be used to learn the parameters of the energy function and recover the structure of the model.

Related papers

Overcoming Dimensional Factorization Limits in Discrete Diffusion Models through Quantum Joint Distribution Learning [79.65014491424151]
We propose a quantum Discrete Denoising Diffusion Probabilistic Model (QD3PM)<n>It enables joint probability learning through diffusion and denoising in exponentially large Hilbert spaces.<n>This paper establishes a new theoretical paradigm in generative models by leveraging the quantum advantage in joint distribution learning.
arXiv Detail & Related papers (2025-05-08T11:48:21Z)
Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers [49.97755400231656]
We present the first performance guarantee with explicit dimensional general score-mismatched diffusion samplers. We show that score mismatches result in an distributional bias between the target and sampling distributions, proportional to the accumulated mismatch between the target and training distributions. This result can be directly applied to zero-shot conditional samplers for any conditional model, irrespective of measurement noise.
arXiv Detail & Related papers (2024-10-17T16:42:12Z)
Hessian-Informed Flow Matching [4.542719108171107]
Hessian-Informed Flow Matching is a novel approach that integrates the Hessian of an energy function into conditional flows. This integration allows HI-FM to account for local curvature and anisotropic covariance structures. Empirical evaluations on the MNIST and Lennard-Jones particles datasets demonstrate that HI-FM improves the likelihood of test samples.
arXiv Detail & Related papers (2024-10-15T09:34:52Z)
Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis [56.442307356162864]
We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework. We introduce a discrete-time sampling algorithm in the general state space $[S]d$ that utilizes score estimators at predefined time points. Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
arXiv Detail & Related papers (2024-10-03T09:07:13Z)
Stochastic Sampling from Deterministic Flow Models [8.849981177332594]
We present a method to turn flow models into a family of differential equations (SDEs) that have the same marginal distributions. We empirically demonstrate advantages of our method on a toy Gaussian setup and on the large scale ImageNet generation task.
arXiv Detail & Related papers (2024-10-03T05:18:28Z)
Conditional sampling within generative diffusion models [12.608803080528142]
We present a review of existing computational approaches to conditional sampling within generative diffusion models. We highlight key methodologies that either utilise the joint distribution, or rely on (pre-trained) marginal distributions with explicit likelihoods.
arXiv Detail & Related papers (2024-09-15T07:48:40Z)
Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data [55.54827581105283]
We show that the concrete score in absorbing diffusion can be expressed as conditional probabilities of clean data. We propose a dedicated diffusion model without time-condition that characterizes the time-independent conditional probabilities. Our models achieve SOTA performance among diffusion models on 5 zero-shot language modeling benchmarks.
arXiv Detail & Related papers (2024-06-06T04:22:11Z)
Ai-Sampler: Adversarial Learning of Markov kernels with involutive maps [28.229819253644862]
We propose a method to parameterize and train transition kernels of Markov chains to achieve efficient sampling and good mixing. This training procedure minimizes the total variation distance between the stationary distribution of the chain and the empirical distribution of the data.
arXiv Detail & Related papers (2024-06-04T17:00:14Z)
Convergence Analysis of Discrete Diffusion Model: Exact Implementation through Uniformization [17.535229185525353]
We introduce an algorithm leveraging the uniformization of continuous Markov chains, implementing transitions on random time points. Our results align with state-of-the-art achievements for diffusion models in $mathbbRd$ and further underscore the advantages of discrete diffusion models in comparison to the $mathbbRd$ setting.
arXiv Detail & Related papers (2024-02-12T22:26:52Z)
Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution. We propose the Constant Rate AIS algorithm and its efficient implementation for $alpha$-divergences.
arXiv Detail & Related papers (2023-06-27T08:15:28Z)
User-defined Event Sampling and Uncertainty Quantification in Diffusion Models for Physical Dynamical Systems [49.75149094527068]
We show that diffusion models can be adapted to make predictions and provide uncertainty quantification for chaotic dynamical systems. We develop a probabilistic approximation scheme for the conditional score function which converges to the true distribution as the noise level decreases. We are able to sample conditionally on nonlinear userdefined events at inference time, and matches data statistics even when sampling from the tails of the distribution.
arXiv Detail & Related papers (2023-06-13T03:42:03Z)
Sampling, Diffusions, and Stochastic Localization [10.368585938419619]
Diffusions are a successful technique to sample from high-dimensional distributions. localization is a technique to prove mixing of Markov Chains and other functional inequalities in high dimension. An algorithmic version of localization was introduced in [EAMS2022] to obtain an algorithm that samples from certain statistical mechanics models.
arXiv Detail & Related papers (2023-05-18T04:01:40Z)
Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data [68.62134204367668]
This paper studies score approximation, estimation, and distribution recovery of diffusion models, when data are supported on an unknown low-dimensional linear subspace. We show that with a properly chosen neural network architecture, the score function can be both accurately approximated and efficiently estimated. The generated distribution based on the estimated score function captures the data geometric structures and converges to a close vicinity of the data distribution.
arXiv Detail & Related papers (2023-02-14T17:02:35Z)
Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain. We show that an unbiased estimator can be obtained via simple matching the conditional marginal distributions. We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z)
Score-Based Diffusion meets Annealed Importance Sampling [89.92133671626327]
Annealed Importance Sampling remains one of the most effective methods for marginal likelihood estimation. We leverage recent progress in score-based generative modeling to approximate the optimal extended target distribution for AIS proposals.
arXiv Detail & Related papers (2022-08-16T12:13:29Z)
Wrapped Distributions on homogeneous Riemannian manifolds [58.720142291102135]
Control over distributions' properties, such as parameters, symmetry and modality yield a family of flexible distributions. We empirically validate our approach by utilizing our proposed distributions within a variational autoencoder and a latent space network model.
arXiv Detail & Related papers (2022-04-20T21:25:21Z)
A likelihood approach to nonparametric estimation of a singular distribution using deep generative models [4.329951775163721]
We investigate a likelihood approach to nonparametric estimation of a singular distribution using deep generative models. We prove that a novel and effective solution exists by perturbing the data with an instance noise. We also characterize the class of distributions that can be efficiently estimated via deep generative models.
arXiv Detail & Related papers (2021-05-09T23:13:58Z)
Targeted stochastic gradient Markov chain Monte Carlo for hidden Markov models with rare latent states [48.705095800341944]
Markov chain Monte Carlo (MCMC) algorithms for hidden Markov models often rely on the forward-backward sampler. This makes them computationally slow as the length of the time series increases, motivating the development of sub-sampling-based approaches. We propose a targeted sub-sampling approach that over-samples observations corresponding to rare latent states when calculating the gradient of parameters associated with them.
arXiv Detail & Related papers (2018-10-31T17:44:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.