GANs as Gradient Flows that Converge
- URL: http://arxiv.org/abs/2205.02910v2
- Date: Mon, 20 Mar 2023 06:49:13 GMT
- Title: GANs as Gradient Flows that Converge
- Authors: Yu-Jui Huang, Yuchong Zhang
- Abstract summary: We show that along the gradient flow induced by a distribution-dependent ordinary differential equation, the unknown data distribution emerges as the long-time limit.
The simulation of the ODE is shown equivalent to the training of generative adversarial networks (GANs).
This equivalence provides a new "cooperative" view of GANs and, more importantly, sheds new light on the divergence of GANs.
- Score: 3.8707695363745223
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper approaches the unsupervised learning problem by gradient descent
in the space of probability density functions. A main result shows that along
the gradient flow induced by a distribution-dependent ordinary differential
equation (ODE), the unknown data distribution emerges as the long-time limit.
That is, one can uncover the data distribution by simulating the
distribution-dependent ODE. Intriguingly, the simulation of the ODE is shown
equivalent to the training of generative adversarial networks (GANs). This
equivalence provides a new "cooperative" view of GANs and, more importantly,
sheds new light on the divergence of GANs. In particular, it reveals that the
GAN algorithm implicitly minimizes the mean squared error (MSE) between two
sets of samples, and this MSE fitting alone can cause GANs to diverge. To
construct a solution to the distribution-dependent ODE, we first show that the
associated nonlinear Fokker-Planck equation has a unique weak solution, by the
Crandall-Liggett theorem for differential equations in Banach spaces. Based on
this solution to the Fokker-Planck equation, we construct a unique solution to
the ODE, using Trevisan's superposition principle. The convergence of the
induced gradient flow to the data distribution is obtained by analyzing the
Fokker-Planck equation.
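The idea in the abstract, simulating a distribution-dependent ODE and fitting a generator to the moved samples by MSE, can be illustrated with a toy one-dimensional sketch. Everything below is an assumption made for illustration and is not taken from the paper: the target is a Gaussian N(m, s^2), the generator is affine (G(z) = a*z + b), and the drift is the closed-form velocity of the Wasserstein gradient flow of the KL divergence between Gaussians, used here as a stand-in for a discriminator-derived drift. The names `drift` and `simulate` are hypothetical.

```python
import numpy as np


def drift(y, mu, sigma, m, s):
    """Velocity -d/dy log(rho_t(y) / rho_d(y)) for rho_t = N(mu, sigma^2)
    and rho_d = N(m, s^2). Assumed stand-in for a learned drift."""
    return (y - mu) / sigma**2 - (y - m) / s**2


def simulate(a=0.5, b=0.0, m=3.0, s=2.0, h=0.05, steps=1000, n=256, seed=0):
    """Alternate an Euler step of the ODE with an MSE fit of an affine generator."""
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        z = rng.standard_normal(n)
        y = a * z + b  # generated samples, distributed as N(b, a^2)
        # One explicit Euler step; the drift depends on the current law of y.
        y_target = y + h * drift(y, b, abs(a), m, s)
        # Refit the generator by least squares (MSE) so that G(z) matches y_target.
        design = np.column_stack([z, np.ones_like(z)])
        (a, b), *_ = np.linalg.lstsq(design, y_target, rcond=None)
    return float(a), float(b)


if __name__ == "__main__":
    a, b = simulate()
    print(f"generator scale a = {a:.4f}, shift b = {b:.4f}")
```

In this toy setting the moved samples are exactly affine in z, so the least-squares fit is exact and the generator parameters approach the target scale s and mean m. With a neural generator and a learned drift the MSE fit is only approximate, which is the regime the abstract refers to when it says MSE fitting alone can cause divergence.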
Related papers
- Straightness of Rectified Flow: A Theoretical Insight into Wasserstein Convergence [54.580605276017096]
Diffusion models have emerged as a powerful tool for image generation and denoising.
Recently, Liu et al. designed a novel alternative generative model, Rectified Flow (RF).
RF aims to learn straight flow trajectories from noise to data using a sequence of convex optimization problems.
arXiv Detail & Related papers (2024-10-19T02:36:11Z)
- Diffusion models for Gaussian distributions: Exact solutions and Wasserstein errors [0.0]
Diffusion or score-based models recently showed high performance in image generation.
We study theoretically the behavior of diffusion models and their numerical implementation when the data distribution is Gaussian.
arXiv Detail & Related papers (2024-05-23T07:28:56Z)
- On the Trajectory Regularity of ODE-based Diffusion Sampling [79.17334230868693]
Diffusion-based generative models use differential equations to establish a smooth connection between a complex data distribution and a tractable prior distribution.
In this paper, we identify several intriguing trajectory properties in the ODE-based sampling process of diffusion models.
arXiv Detail & Related papers (2024-05-18T15:59:41Z)
- Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models [59.331993845831946]
Diffusion models benefit from instillation of task-specific information into the score function to steer the sample generation towards desired properties.
This paper provides the first theoretical study towards understanding the influence of guidance on diffusion models in the context of Gaussian mixture models.
arXiv Detail & Related papers (2024-03-03T23:15:48Z)
- Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z)
- On the Computation of the Gaussian Rate-Distortion-Perception Function [10.564071872770146]
We study the computation of the rate-distortion-perception function (RDPF) for a multivariate Gaussian source under mean squared error (MSE) distortion.
We provide the associated algorithmic realization, as well as the convergence and the rate of convergence characterization.
We corroborate our results with numerical simulations and draw connections to existing results.
arXiv Detail & Related papers (2023-11-15T18:34:03Z)
- Noise-Free Sampling Algorithms via Regularized Wasserstein Proximals [3.4240632942024685]
We consider the problem of sampling from a distribution governed by a potential function.
This work proposes an explicit score-based MCMC method that is deterministic, resulting in a deterministic evolution for particles.
arXiv Detail & Related papers (2023-08-28T23:51:33Z)
- Error Bounds for Flow Matching Methods [38.9898500163582]
Flow matching methods approximate a flow between two arbitrary probability distributions.
We present error bounds for the flow matching procedure using fully deterministic sampling, assuming an $L^2$ bound on the approximation error and a certain regularity on the data distributions.
arXiv Detail & Related papers (2023-05-26T12:13:53Z)
- Reflected Diffusion Models [93.26107023470979]
We present Reflected Diffusion Models, which reverse a reflected differential equation evolving on the support of the data.
Our approach learns the score function through a generalized score matching loss and extends key components of standard diffusion models.
arXiv Detail & Related papers (2023-04-10T17:54:38Z)
- Learning Discretized Neural Networks under Ricci Flow [51.36292559262042]
We study Discretized Neural Networks (DNNs) composed of low-precision weights and activations.
DNNs suffer from either infinite or zero gradients due to the non-differentiable discrete function during training.
arXiv Detail & Related papers (2023-02-07T10:51:53Z)
- Stationary Density Estimation of Itô Diffusions Using Deep Learning [6.8342505943533345]
We consider the density estimation problem associated with the stationary measure of ergodic Itô diffusions from a discrete-time series.
We employ deep neural networks to approximate the drift and diffusion terms of the SDE.
We establish the convergence of the proposed scheme under appropriate mathematical assumptions.
arXiv Detail & Related papers (2021-09-09T01:57:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.