Combining Wasserstein-1 and Wasserstein-2 proximals: robust manifold learning via well-posed generative flows
- URL: http://arxiv.org/abs/2407.11901v1
- Date: Tue, 16 Jul 2024 16:34:31 GMT
- Title: Combining Wasserstein-1 and Wasserstein-2 proximals: robust manifold learning via well-posed generative flows
- Authors: Hyemin Gu, Markos A. Katsoulakis, Luc Rey-Bellet, Benjamin J. Zhang
- Abstract summary: We formulate well-posed continuous-time generative flows for learning distributions supported on low-dimensional manifolds.
We show that the Wasserstein-1 proximal operator regularizes $f$-divergences so that singular distributions can be compared.
We also show that the Wasserstein-2 proximal operator regularizes the paths of the generative flows by adding an optimal transport cost.
- Score: 6.799748192975493
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We formulate well-posed continuous-time generative flows for learning distributions that are supported on low-dimensional manifolds through Wasserstein proximal regularizations of $f$-divergences. Wasserstein-1 proximal operators regularize $f$-divergences so that singular distributions can be compared. Meanwhile, Wasserstein-2 proximal operators regularize the paths of the generative flows by adding an optimal transport cost, i.e., a kinetic energy penalization. Via mean-field game theory, we show that the combination of the two proximals is critical for formulating well-posed generative flows. Generative flows can be analyzed through optimality conditions of a mean-field game (MFG), a system of a backward Hamilton-Jacobi (HJ) and a forward continuity partial differential equations (PDEs) whose solution characterizes the optimal generative flow. For learning distributions that are supported on low-dimensional manifolds, the MFG theory shows that the Wasserstein-1 proximal, which addresses the HJ terminal condition, and the Wasserstein-2 proximal, which addresses the HJ dynamics, are both necessary for the corresponding backward-forward PDE system to be well-defined and have a unique solution with provably linear flow trajectories. This implies that the corresponding generative flow is also unique and can therefore be learned in a robust manner even for learning high-dimensional distributions supported on low-dimensional manifolds. The generative flows are learned through adversarial training of continuous-time flows, which bypasses the need for reverse simulation. We demonstrate the efficacy of our approach for generating high-dimensional images without the need to resort to autoencoders or specialized architectures.
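For orientation, the block below is a schematic LaTeX sketch of the two proximal regularizations and the resulting mean-field-game optimality system described in the abstract. It is an illustrative reconstruction, not the paper's exact statement: the regularization weight $\lambda$, the time horizon $T$, the target $\mu_{\mathrm{target}}$, and the sign conventions are assumptions.

```latex
% Schematic sketch of the objects in the abstract above (assumed
% normalizations and sign conventions; see the paper for precise statements).
\documentclass{article}
\usepackage{amsmath}
\begin{document}

Wasserstein-1 proximal of an $f$-divergence: an inf-convolution that smooths
the terminal cost so that singular distributions can be compared,
\[
  G(\rho) \;=\; \inf_{\nu}\Big\{ D_f(\nu \,\|\, \mu_{\mathrm{target}})
    \;+\; \tfrac{1}{\lambda}\, W_1(\rho,\nu) \Big\}.
\]

Wasserstein-2 proximal: a kinetic-energy (optimal transport) penalty on the
generative path $(\rho_t, v_t)$,
\[
  \inf_{\rho,\,v}\; \int_0^T\!\!\int \tfrac12 \,\|v(t,x)\|^2 \,\rho(t,x)\,dx\,dt
    \;+\; G\big(\rho(T,\cdot)\big),
  \qquad
  \partial_t \rho + \nabla\!\cdot(\rho v) = 0,\quad \rho(0,\cdot)=\rho_0.
\]

First-order optimality (mean-field game) system: a backward Hamilton--Jacobi
equation coupled to a forward continuity equation,
\[
\begin{aligned}
  -\partial_t \phi &= \tfrac12 \,\|\nabla\phi\|^2,
    & \phi(T,\cdot) &= -\frac{\delta G}{\delta \rho}\big(\rho(T,\cdot)\big)
    && \text{(backward HJ)},\\
  \partial_t \rho &= -\nabla\!\cdot\!\big(\rho\,\nabla\phi\big),
    & \rho(0,\cdot) &= \rho_0
    && \text{(forward continuity)},
\end{aligned}
\]
with optimal velocity $v^\ast = \nabla\phi$; the velocity is constant along
characteristics $\dot X_t = \nabla\phi(t, X_t)$, so the flow trajectories
are straight lines.

\end{document}
```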
Related papers
- Straightness of Rectified Flow: A Theoretical Insight into Wasserstein Convergence [54.580605276017096]
Rectified Flow (RF) aims to learn straight flow trajectories from noise to data using a sequence of convex optimization problems.
RF theoretically straightens the trajectory through successive rectifications, reducing the number of function evaluations (NFEs) needed for sampling (a minimal training sketch is given after this list).
We provide the first theoretical analysis of the Wasserstein distance between the sampling distribution of RF and the target distribution.
arXiv Detail & Related papers (2024-10-19T02:36:11Z) - Generative Modeling by Minimizing the Wasserstein-2 Loss [1.2277343096128712]
This paper approaches the unsupervised learning problem by minimizing the second-order Wasserstein loss (the $W_2$ loss) through a distribution-dependent ordinary differential equation (ODE).
A main result shows that the time-marginal laws of the ODE form a gradient flow for the $W_2$ loss, which converges exponentially to the true data distribution.
An algorithm is designed by following the scheme and applying persistent training, which naturally fits our gradient-flow approach.
arXiv Detail & Related papers (2024-06-19T15:15:00Z) - Continuous-time Riemannian SGD and SVRG Flows on Wasserstein Probabilistic Space [17.13355049019388]
We extend the gradient flow on Wasserstein space to the stochastic gradient descent (SGD) flow and the stochastic variance reduced gradient (SVRG) flow.
By leveraging the properties of Wasserstein space, we construct differential equations that approximate the corresponding discrete dynamics in Euclidean space.
We prove convergence results that match their counterparts in Euclidean space.
arXiv Detail & Related papers (2024-01-24T15:35:44Z) - Convergence of flow-based generative models via proximal gradient descent in Wasserstein space [20.771897445580723]
Flow-based generative models enjoy certain advantages in data generation and likelihood computation.
We provide a theoretical guarantee for generating the data distribution with a progressive flow model.
arXiv Detail & Related papers (2023-10-26T17:06:23Z) - Learning to Accelerate Partial Differential Equations via Latent Global Evolution [64.72624347511498]
Latent Evolution of PDEs (LE-PDE) is a simple, fast and scalable method to accelerate the simulation and inverse optimization of PDEs.
We introduce new learning objectives to effectively learn such latent dynamics to ensure long-term stability.
We demonstrate up to 128x reduction in the dimensions to update, and up to 15x improvement in speed, while achieving competitive accuracy.
arXiv Detail & Related papers (2022-06-15T17:31:24Z) - Smooth Normalizing Flows [0.0]
We introduce a class of smooth mixture transformations working on both compact intervals and hypertori.
We show that the inverses of these transformations can be computed from forward evaluations via the inverse function theorem.
We demonstrate two advantages of such smooth flows: they allow training by force matching to simulation data and can be used as potentials in molecular dynamics simulations.
arXiv Detail & Related papers (2021-10-01T12:27:14Z) - Generative Flows with Invertible Attentions [135.23766216657745]
We introduce two types of invertible attention mechanisms for generative flow models.
We exploit split-based attention mechanisms to learn the attention weights and input representations on every two splits of flow feature maps.
Our method provides invertible attention modules with tractable Jacobian determinants, enabling seamless integration at any position in flow-based models.
arXiv Detail & Related papers (2021-06-07T20:43:04Z) - Large-Scale Wasserstein Gradient Flows [84.73670288608025]
We introduce a scalable scheme to approximate Wasserstein gradient flows.
Our approach relies on input convex neural networks (ICNNs) to discretize the JKO steps.
As a result, we can sample from the measure at each step of the gradient diffusion and compute its density.
arXiv Detail & Related papers (2021-06-01T19:21:48Z) - SurVAE Flows: Surjections to Bridge the Gap between VAEs and Flows [78.77808270452974]
SurVAE Flows is a modular framework for composable transformations that encompasses VAEs and normalizing flows.
We show that several recently proposed methods, including dequantization and augmented normalizing flows, can be expressed as SurVAE Flows.
arXiv Detail & Related papers (2020-07-06T13:13:22Z) - A Near-Optimal Gradient Flow for Learning Neural Energy-Based Models [93.24030378630175]
We propose a novel numerical scheme to optimize the gradient flows for learning energy-based models (EBMs).
We derive a second-order Wasserstein gradient flow of the global relative entropy from the Fokker-Planck equation.
Compared with existing schemes, the Wasserstein gradient flow provides a smoother and near-optimal numerical scheme for approximating real data densities.
arXiv Detail & Related papers (2019-10-31T02:26:20Z)
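The Rectified Flow entry above describes regressing a velocity field onto straight noise-to-data interpolations and then sampling with few function evaluations. The sketch below is a minimal, hypothetical PyTorch reconstruction of that training step; the names VelocityNet and rectified_flow_step are ours for illustration and do not come from any of the listed papers.

```python
# Minimal sketch of one Rectified Flow (RF) training step, as referenced in
# the "Straightness of Rectified Flow" entry above. Illustrative only.
import torch
import torch.nn as nn


class VelocityNet(nn.Module):
    """Small MLP v_theta(x, t) that predicts the flow velocity."""

    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([x, t], dim=-1))


def rectified_flow_step(model, optimizer, x1: torch.Tensor) -> float:
    """One gradient step: regress the straight-line velocity x1 - x0
    along the linear interpolation x_t = (1 - t) * x0 + t * x1."""
    x0 = torch.randn_like(x1)                       # noise endpoint
    t = torch.rand(x1.shape[0], 1)                  # uniform time in [0, 1]
    xt = (1.0 - t) * x0 + t * x1                    # linear interpolant
    target_v = x1 - x0                              # straight-line velocity
    loss = ((model(xt, t) - target_v) ** 2).mean()  # convex regression loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


# Usage on 2-D toy data; sampling would then integrate dx/dt = v_theta(x, t)
# from t = 0 (noise) to t = 1 (data) with a few Euler steps (few NFEs).
model = VelocityNet(dim=2)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(100):
    data = torch.randn(256, 2) * 0.1 + torch.tensor([2.0, 0.0])
    rectified_flow_step(model, opt, data)
```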
This list is automatically generated from the titles and abstracts of the papers on this site.