A mean-field games laboratory for generative modeling
- URL: http://arxiv.org/abs/2304.13534v5
- Date: Tue, 24 Oct 2023 17:21:36 GMT
- Title: A mean-field games laboratory for generative modeling
- Authors: Benjamin J. Zhang and Markos A. Katsoulakis
- Abstract summary: Mean-field games (MFGs) are a framework for explaining, enhancing, and designing generative models.
We analyze the mathematical properties of each generative model through its associated MFG's optimality conditions.
We propose and demonstrate an HJB-regularized SGM with improved performance over standard SGMs.
- Score: 5.837881923712394
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We demonstrate the versatility of mean-field games (MFGs) as a mathematical
framework for explaining, enhancing, and designing generative models. In
generative flows, a Lagrangian formulation is used where each particle
(generated sample) aims to minimize a loss function over its simulated path.
The loss, however, is dependent on the paths of other particles, which leads to
a competition among the population of particles. The asymptotic behavior of
this competition yields a mean-field game. We establish connections between
MFGs and major classes of generative flows and diffusions including
continuous-time normalizing flows, score-based generative models (SGM), and
Wasserstein gradient flows. Furthermore, we study the mathematical properties
of each generative model by studying their associated MFG's optimality
condition, which is a set of coupled forward-backward nonlinear partial
differential equations. The mathematical structure described by the MFG
optimality conditions identifies the inductive biases of generative flows. We
investigate the well-posedness and structure of normalizing flows, unravel the
mathematical structure of SGMs, and derive an MFG formulation of Wasserstein
gradient flows. From an algorithmic perspective, the optimality conditions
yield Hamilton-Jacobi-Bellman (HJB) regularizers for enhanced training of
generative models. In particular, we propose and demonstrate an HJB-regularized
SGM with improved performance over standard SGMs. We present this framework as
an MFG laboratory which serves as a platform for revealing new avenues of
experimentation and invention of generative models.
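For orientation, the coupled forward-backward system referred to above takes, in the standard first-order MFG setting, the generic form below. This is textbook mean-field game structure (value function $U$, population density $\rho$, Hamiltonian $H$, interaction cost $F$, terminal cost $G$) given as a sketch, not quoted from the paper; second-order (diffusive) variants add Laplacian terms to both equations.

\[
\begin{aligned}
-\partial_t U(t,x) + H\bigl(x, \nabla U(t,x)\bigr) &= F\bigl(x, \rho(t,\cdot)\bigr), & U(T,x) &= G\bigl(x, \rho(T,\cdot)\bigr),\\
\partial_t \rho(t,x) - \nabla\cdot\bigl(\rho(t,x)\, \nabla_p H\bigl(x, \nabla U(t,x)\bigr)\bigr) &= 0, & \rho(0,\cdot) &= \rho_0.
\end{aligned}
\]

The HJB equation runs backward in time while the continuity equation runs forward, which is the forward-backward coupling the abstract describes. For the HJB-regularized SGM, the sketch below shows one plausible way such a regularizer could enter training, under the illustrative assumptions that the forward process is a VP-SDE with a linear noise schedule and that the score is parameterized as the gradient of a scalar potential h_theta(t, x), so that the log-density of the forward process satisfies an HJB-type PDE whose residual can be penalized alongside denoising score matching. All names (PotentialNet, beta, hjb_residual, lam) are hypothetical; this is not the paper's exact loss.

```python
# Illustrative sketch (PyTorch), not the paper's implementation: denoising
# score matching for a VP-SDE plus a penalty on the HJB-type PDE residual
# satisfied by log rho_t, with the score modeled as grad_x of a potential.
import torch
import torch.nn as nn


class PotentialNet(nn.Module):
    """Scalar potential h_theta(t, x); the learned score is grad_x h_theta."""
    def __init__(self, dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, t, x):
        return self.net(torch.cat([t[:, None], x], dim=1)).squeeze(-1)


def beta(t):
    """Assumed linear VP-SDE noise schedule on t in [0, 1]."""
    return 0.1 + 19.9 * t


def hjb_residual(h, t, x):
    """Residual of the PDE satisfied by log rho_t of the VP-SDE
    dX = -0.5 beta(t) X dt + sqrt(beta(t)) dW, evaluated at h = h_theta:
    d/dt h = -div f - f.grad h + (beta/2)(lap h + |grad h|^2)."""
    t = t.clone().requires_grad_(True)
    x = x.clone().requires_grad_(True)
    ht = h(t, x)
    dt_h = torch.autograd.grad(ht.sum(), t, create_graph=True)[0]
    grad_h = torch.autograd.grad(ht.sum(), x, create_graph=True)[0]
    lap_h = torch.zeros_like(ht)                 # exact Laplacian; fine in low dim
    for i in range(x.shape[1]):
        lap_h = lap_h + torch.autograd.grad(
            grad_h[:, i].sum(), x, create_graph=True)[0][:, i]
    b = beta(t)
    f = -0.5 * b[:, None] * x                    # VP-SDE drift
    div_f = -0.5 * b * x.shape[1]                # divergence of the drift
    return (dt_h + div_f + (f * grad_h).sum(dim=1)
            - 0.5 * b * (lap_h + (grad_h ** 2).sum(dim=1)))


def hjb_regularized_dsm_loss(h, x0, lam=1e-3):
    """Standard weighted denoising score matching plus the HJB penalty."""
    n = x0.shape[0]
    t = torch.rand(n, device=x0.device) * 0.999 + 1e-3
    int_beta = 0.1 * t + 0.5 * 19.9 * t ** 2     # integral of beta on [0, t]
    a = torch.exp(-0.5 * int_beta)[:, None]      # mean scaling of x_t | x_0
    s = torch.sqrt(1.0 - a ** 2)                 # std of x_t | x_0
    eps = torch.randn_like(x0)
    xt = a * x0 + s * eps
    xt_g = xt.clone().requires_grad_(True)
    score = torch.autograd.grad(h(t, xt_g).sum(), xt_g, create_graph=True)[0]
    dsm = ((s * score + eps) ** 2).sum(dim=1).mean()
    penalty = (hjb_residual(h, t, xt) ** 2).mean()
    return dsm + lam * penalty


# Hypothetical usage on a 2-D toy batch:
# h = PotentialNet(dim=2); opt = torch.optim.Adam(h.parameters(), lr=1e-3)
# loss = hjb_regularized_dsm_loss(h, x0_batch); loss.backward(); opt.step()
```

In practice the penalty weight lam and the precise residual would follow the paper's derivation; the sketch only illustrates that the residual is computable with automatic differentiation and can be added to the usual DSM objective.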
Related papers
- Bellman Diffusion: Generative Modeling as Learning a Linear Operator in the Distribution Space [72.52365911990935]
We introduce Bellman Diffusion, a novel DGM framework that maintains linearity in MDPs through gradient and scalar field modeling.
Our results show that Bellman Diffusion achieves accurate field estimations and is a capable image generator, converging 1.5x faster than the traditional histogram-based baseline in distributional RL tasks.
arXiv Detail & Related papers (2024-10-02T17:53:23Z) - Wasserstein proximal operators describe score-based generative models
and resolve memorization [12.321631823103894]
We first formulate SGMs in terms of the Wasserstein proximal operator (WPO).
We show that WPO describes the inductive bias of diffusion and score-based models.
We present an interpretable kernel-based model for the score function which dramatically improves the performance of SGMs.
arXiv Detail & Related papers (2024-02-09T03:33:13Z) - Diffusion Model Conditioning on Gaussian Mixture Model and Negative
Gaussian Mixture Gradient [1.9298401192674903]
Diffusion models (DMs) are a type of generative model that has had a huge impact on image synthesis and beyond.
We propose a conditioning mechanism utilizing Gaussian mixture models (GMMs) as feature conditioning to guide the denoising process.
We show that the conditional latent distributions based on features and on classes differ significantly, and that conditioning on features produces fewer defective generations than conditioning on classes.
arXiv Detail & Related papers (2024-01-20T16:01:18Z) - Riemannian Score-Based Generative Modeling [56.20669989459281]
Score-based generative models (SGMs) demonstrate remarkable empirical performance.
Current SGMs make the underlying assumption that the data is supported on a Euclidean manifold with flat geometry.
This prevents the use of these models for applications in robotics, geoscience or protein modeling.
arXiv Detail & Related papers (2022-02-06T11:57:39Z) - Moser Flow: Divergence-based Generative Modeling on Manifolds [49.04974733536027]
Moser Flow (MF) is a new class of generative models within the family of continuous normalizing flows (CNF).
MF does not require invoking or backpropagating through an ODE solver during training.
We demonstrate for the first time the use of flow models for sampling from general curved surfaces.
arXiv Detail & Related papers (2021-08-18T09:00:24Z) - Refining Deep Generative Models via Discriminator Gradient Flow [18.406499703293566]
Discriminator Gradient Flow (DGflow) is a new technique that improves generated samples via the gradient flow of entropy-regularized f-divergences.
We show that DGflow leads to significant improvement in the quality of generated samples for a variety of generative models.
arXiv Detail & Related papers (2020-12-01T19:10:15Z) - Discriminator Contrastive Divergence: Semi-Amortized Generative Modeling
by Exploring Energy of the Discriminator [85.68825725223873]
Generative Adversarial Networks (GANs) have shown great promise in modeling high dimensional data.
We introduce the Discriminator Contrastive Divergence, which is well motivated by the properties of the WGAN discriminator.
We demonstrate the benefits of significantly improved generation on both synthetic data and several real-world image generation benchmarks.
arXiv Detail & Related papers (2020-04-05T01:50:16Z) - Training Deep Energy-Based Models with f-Divergence Minimization [113.97274898282343]
Deep energy-based models (EBMs) are very flexible in distribution parametrization but computationally challenging.
We propose a general variational framework termed f-EBM to train EBMs using any desired f-divergence.
Experimental results demonstrate the superiority of f-EBM over contrastive divergence, as well as the benefits of training EBMs using f-divergences other than KL.
arXiv Detail & Related papers (2020-03-06T23:11:13Z) - A Near-Optimal Gradient Flow for Learning Neural Energy-Based Models [93.24030378630175]
We propose a novel numerical scheme to optimize the gradient flows for learning energy-based models (EBMs).
We derive a second-order Wasserstein gradient flow of the global relative entropy from the Fokker-Planck equation.
Compared with existing schemes, Wasserstein gradient flow is a smoother and near-optimal numerical scheme to approximate real data densities.
arXiv Detail & Related papers (2019-10-31T02:26:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.