Learning heavy-tailed distributions with Wasserstein-proximal-regularized $α$-divergences
- URL: http://arxiv.org/abs/2405.13962v1
- Date: Wed, 22 May 2024 19:58:13 GMT
- Title: Learning heavy-tailed distributions with Wasserstein-proximal-regularized $α$-divergences
- Authors: Ziyu Chen, Hyemin Gu, Markos A. Katsoulakis, Luc Rey-Bellet, Wei Zhu,
- Abstract summary: We propose Wasserstein proximals of $alpha$-divergences as suitable objective functionals for learning heavy-tailed distributions.
Heuristically, $alpha$-divergences handle the heavy tails and Wasserstein proximals allow non-absolute continuity between distributions.
- Score: 12.19634962193403
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose Wasserstein proximals of $\alpha$-divergences as suitable objective functionals for learning heavy-tailed distributions in a stable manner. First, we provide sufficient, and in some cases necessary, relations among data dimension, $\alpha$, and the decay rate of data distributions for the Wasserstein-proximal-regularized divergence to be finite. Finite-sample convergence rates for the estimation in the case of the Wasserstein-1 proximal divergences are then provided under certain tail conditions. Numerical experiments demonstrate stable learning of heavy-tailed distributions -- even those without first or second moment -- without any explicit knowledge of the tail behavior, using suitable generative models such as GANs and flow-based models related to our proposed Wasserstein-proximal-regularized $\alpha$-divergences. Heuristically, $\alpha$-divergences handle the heavy tails and Wasserstein proximals allow non-absolute continuity between distributions and control the velocities of flow-based algorithms as they learn the target distribution deep into the tails.
Related papers
- Convergence of Continuous Normalizing Flows for Learning Probability Distributions [10.381321024264484]
Continuous normalizing flows (CNFs) are a generative method for learning probability distributions.
We study the theoretical properties of CNFs with linear regularity in learning probability distributions from a finite random sample.
We present a convergence analysis framework that encompasses the error due to velocity estimation, the discretization error, and the early stopping error.
arXiv Detail & Related papers (2024-03-31T03:39:04Z) - Deep conditional distribution learning via conditional Föllmer flow [3.227277661633986]
We introduce an ordinary differential equation (ODE) based deep generative method for learning conditional distributions, named Conditional F"ollmer Flow.
For effective implementation, we discretize the flow with Euler's method where we estimate the velocity field nonparametrically using a deep neural network.
arXiv Detail & Related papers (2024-02-02T14:52:10Z) - Convergence of flow-based generative models via proximal gradient descent in Wasserstein space [20.771897445580723]
Flow-based generative models enjoy certain advantages in computing the data generation and the likelihood.
We provide a theoretical guarantee of generating data distribution by a progressive flow model.
arXiv Detail & Related papers (2023-10-26T17:06:23Z) - Sobolev Space Regularised Pre Density Models [51.558848491038916]
We propose a new approach to non-parametric density estimation that is based on regularizing a Sobolev norm of the density.
This method is statistically consistent, and makes the inductive validation model clear and consistent.
arXiv Detail & Related papers (2023-07-25T18:47:53Z) - Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative
Models [49.81937966106691]
We develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models.
In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach.
arXiv Detail & Related papers (2023-06-15T16:30:08Z) - Statistical Efficiency of Score Matching: The View from Isoperimetry [96.65637602827942]
We show a tight connection between statistical efficiency of score matching and the isoperimetric properties of the distribution being estimated.
We formalize these results both in the sample regime and in the finite regime.
arXiv Detail & Related papers (2022-10-03T06:09:01Z) - Robust Estimation for Nonparametric Families via Generative Adversarial
Networks [92.64483100338724]
We provide a framework for designing Generative Adversarial Networks (GANs) to solve high dimensional robust statistics problems.
Our work extend these to robust mean estimation, second moment estimation, and robust linear regression.
In terms of techniques, our proposed GAN losses can be viewed as a smoothed and generalized Kolmogorov-Smirnov distance.
arXiv Detail & Related papers (2022-02-02T20:11:33Z) - Wasserstein distance estimates for the distributions of numerical
approximations to ergodic stochastic differential equations [0.3553493344868413]
We study the Wasserstein distance between the in distribution of an ergodic differential equation and the distribution in the strongly log-concave case.
This allows us to study in a unified way a number of different approximations proposed in the literature for the overdamped and underdamped Langevin dynamics.
arXiv Detail & Related papers (2021-04-26T07:50:04Z) - Continuous Wasserstein-2 Barycenter Estimation without Minimax
Optimization [94.18714844247766]
Wasserstein barycenters provide a geometric notion of the weighted average of probability measures based on optimal transport.
We present a scalable algorithm to compute Wasserstein-2 barycenters given sample access to the input measures.
arXiv Detail & Related papers (2021-02-02T21:01:13Z) - $(f,\Gamma)$-Divergences: Interpolating between $f$-Divergences and
Integral Probability Metrics [6.221019624345409]
We develop a framework for constructing information-theoretic divergences that subsume both $f$-divergences and integral probability metrics (IPMs)
We show that they can be expressed as a two-stage mass-redistribution/mass-transport process.
Using statistical learning as an example, we demonstrate their advantage in training generative adversarial networks (GANs) for heavy-tailed, not-absolutely continuous sample distributions.
arXiv Detail & Related papers (2020-11-11T18:17:09Z) - Continuous Regularized Wasserstein Barycenters [51.620781112674024]
We introduce a new dual formulation for the regularized Wasserstein barycenter problem.
We establish strong duality and use the corresponding primal-dual relationship to parametrize the barycenter implicitly using the dual potentials of regularized transport problems.
arXiv Detail & Related papers (2020-08-28T08:28:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.