On Deep Generative Models for Approximation and Estimation of
Distributions on Manifolds
- URL: http://arxiv.org/abs/2302.13183v1
- Date: Sat, 25 Feb 2023 22:34:19 GMT
- Title: On Deep Generative Models for Approximation and Estimation of
Distributions on Manifolds
- Authors: Biraj Dahal, Alex Havrilla, Minshuo Chen, Tuo Zhao, Wenjing Liao
- Abstract summary: Generative networks can generate high-dimensional complex data from a low-dimensional easy-to-sample distribution.
We take such low-dimensional data structures into consideration by assuming that data distributions are supported on a low-dimensional manifold.
We show that the Wasserstein-1 loss converges to zero at a fast rate depending on the intrinsic dimension instead of the ambient data dimension.
- Score: 38.311376714689
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative networks have experienced great empirical successes in
distribution learning. Many existing experiments have demonstrated that
generative networks can generate high-dimensional complex data from a
low-dimensional easy-to-sample distribution. However, this phenomenon cannot
be justified by existing theories. The widely held manifold hypothesis
speculates that real-world data sets, such as natural images and signals,
exhibit low-dimensional geometric structures. In this paper, we take such
low-dimensional data structures into consideration by assuming that data
distributions are supported on a low-dimensional manifold. We prove statistical
guarantees of generative networks under the Wasserstein-1 loss. We show that
the Wasserstein-1 loss converges to zero at a fast rate depending on the
intrinsic dimension instead of the ambient data dimension. Our theory leverages
the low-dimensional geometric structures in data sets and justifies the
practical power of generative networks. We require no smoothness assumptions on
the data distribution, which is desirable in practice.
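As a hedged illustration of the setting (a toy construction of mine, not the paper's proof technique): a generator maps a low-dimensional latent Gaussian into R^D, so its samples lie on a d-dimensional manifold, and the Wasserstein-1 discrepancy to data can be approximated by averaging 1-D distances over random projections.

```python
# A minimal sketch of the setting (toy construction, not the paper's method):
# a generator maps a low-dimensional latent z in R^d to points in R^D, so its
# samples lie on a d-dimensional manifold. The Wasserstein-1 discrepancy to
# data is approximated here by a sliced (random-projection) estimate.
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
d, D, n = 2, 64, 2000                 # intrinsic dim, ambient dim, sample size

A = rng.normal(size=(D, d))           # fixed weights; the generator's image is
def generator(z):                     # a curved d-dimensional manifold in R^D
    return np.tanh(z @ A.T)

x_data = generator(rng.normal(size=(n, d)))    # "real" data on the manifold
x_model = generator(rng.normal(size=(n, d)))   # samples from the trained model

def sliced_w1(x, y, n_proj=50):
    """Average the 1-D Wasserstein-1 distance over random unit directions."""
    total = 0.0
    for _ in range(n_proj):
        v = rng.normal(size=x.shape[1])
        v /= np.linalg.norm(v)
        total += wasserstein_distance(x @ v, y @ v)
    return total / n_proj

print(sliced_w1(x_data, x_model))     # small: the two distributions match
```

Because both sample sets concentrate on the same d-dimensional image of the generator, empirical Wasserstein-type estimates behave as if the problem were d-dimensional; the paper makes this intuition precise with rates governed by the intrinsic dimension.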
Related papers
- Convergence of Diffusion Models Under the Manifold Hypothesis in High-Dimensions [6.9408143976091745]
Denoising Diffusion Probabilistic Models (DDPMs) are powerful state-of-the-art methods used to generate synthetic data from high-dimensional data distributions.
We study DDPMs under the manifold hypothesis and prove that they achieve rates independent of the ambient dimension in terms of learning the score.
In terms of sampling, we obtain rates independent of the ambient dimension w.r.t. the Kullback-Leibler divergence, and $O(\sqrt{D})$ w.r.t. the Wasserstein distance.
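For context, the "learning the score" above refers to the denoising objective that DDPM training minimizes; below is a generic sketch of it (the toy MLP and linear noise schedule are stand-ins, not the network class analyzed in the paper).

```python
# Denoising score matching as used in DDPM training: predict the injected
# noise eps, which determines the score of the noised distribution.
import torch
import torch.nn as nn

D, T = 16, 1000
net = nn.Sequential(nn.Linear(D + 1, 128), nn.SiLU(), nn.Linear(128, D))
alpha_bar = torch.cumprod(1 - torch.linspace(1e-4, 0.02, T), dim=0)

def dsm_loss(x0, t):
    """Predict eps from x_t = sqrt(a_t) x0 + sqrt(1 - a_t) eps."""
    a = alpha_bar[t].unsqueeze(-1)                       # (batch, 1)
    eps = torch.randn_like(x0)
    xt = a.sqrt() * x0 + (1 - a).sqrt() * eps
    eps_hat = net(torch.cat([xt, t.float().unsqueeze(-1) / T], dim=-1))
    # eps_hat determines the score: score(x_t) ~ -eps_hat / sqrt(1 - a_t)
    return ((eps_hat - eps) ** 2).mean()

x0 = torch.randn(32, D)                # stand-in for data near a manifold
t = torch.randint(0, T, (32,))
print(dsm_loss(x0, t))
```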
arXiv Detail & Related papers (2024-09-27T14:57:18Z)
- Adaptive Learning of the Latent Space of Wasserstein Generative Adversarial Networks [7.958528596692594]
We propose a novel framework called the latent Wasserstein GAN (LWGAN).
It fuses the Wasserstein auto-encoder and the Wasserstein GAN so that the intrinsic dimension of the data manifold can be adaptively learned.
We show that LWGAN is able to identify the correct intrinsic dimension under several scenarios.
arXiv Detail & Related papers (2024-09-27T01:25:22Z)
- Hardness of Learning Neural Networks under the Manifold Hypothesis [3.2635082758250693]
The manifold hypothesis presumes that high-dimensional data lies on or near a low-dimensional manifold.
We investigate the hardness of learning under the manifold hypothesis.
We show that additional assumptions on the volume of the data manifold alleviate these fundamental limitations.
arXiv Detail & Related papers (2024-06-03T15:50:32Z)
- Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data [68.62134204367668]
This paper studies score approximation, estimation, and distribution recovery of diffusion models, when data are supported on an unknown low-dimensional linear subspace.
We show that with a properly chosen neural network architecture, the score function can be both accurately approximated and efficiently estimated.
The generated distribution based on the estimated score function captures the data geometric structures and converges to a close vicinity of the data distribution.
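A toy numerical check of the structure in question (the construction and constants below are my assumptions, not the paper's): when data are generated near a low-dimensional linear subspace, the exact score splits into a mild on-subspace component and a dominant component pointing back toward the subspace.

```python
# For x = U z + sigma * eps with z in R^d, the density is N(0, U U^T +
# sigma^2 I), so the exact score is -Sigma^{-1} x; its normal component is
# much larger than its tangent component and pushes x onto the subspace.
import numpy as np

rng = np.random.default_rng(1)
D, d, sigma = 32, 3, 0.05
U = np.linalg.qr(rng.normal(size=(D, d)))[0]   # orthonormal subspace basis
P = U @ U.T                                    # projector onto span(U)

Sigma = P + sigma**2 * np.eye(D)               # covariance of x
score = lambda x: -np.linalg.solve(Sigma, x)   # Gaussian score: -Sigma^{-1} x

x = U @ rng.normal(size=d) + sigma * rng.normal(size=D)
s = score(x)
print(np.linalg.norm(P @ s))                   # tangent part: O(1)
print(np.linalg.norm((np.eye(D) - P) @ s))     # normal part: O(1/sigma)
```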
arXiv Detail & Related papers (2023-02-14T17:02:35Z)
- ManiFlow: Implicitly Representing Manifolds with Normalizing Flows [145.9820993054072]
Normalizing Flows (NFs) are flexible explicit generative models that have been shown to accurately model complex real-world data distributions.
We propose an optimization objective that recovers the most likely point on the manifold given a sample from the perturbed distribution.
Finally, we focus on 3D point clouds, for which we utilize the explicit nature of NFs, i.e., surface normals extracted from the gradient of the log-likelihood, and the log-likelihood itself.
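One plausible, hedged reading of the recovery step (the analytic circle density below stands in for a trained flow; this is not the authors' exact objective): gradient-ascend the model log-likelihood from the perturbed sample until it lands on the high-density manifold, with the gradient direction playing the role of the surface normal mentioned above.

```python
# Recover a (locally) most likely on-manifold point from a perturbed sample
# by ascending the log-likelihood of a density concentrated on a manifold.
import torch

def log_p(x):                          # toy density peaked on the unit circle
    return -((x.norm() - 1.0) ** 2) / (2 * 0.01)

x = torch.tensor([1.7, 0.4], requires_grad=True)   # perturbed, off-manifold
opt = torch.optim.SGD([x], lr=0.005)
for _ in range(200):
    opt.zero_grad()
    (-log_p(x)).backward()             # maximize the log-likelihood
    opt.step()
print(x.detach(), x.detach().norm())   # norm -> 1: back on the manifold
```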
arXiv Detail & Related papers (2022-08-18T16:07:59Z)
- Intrinsic Dimension Estimation [92.87600241234344]
We introduce a new estimator of the intrinsic dimension and provide finite sample, non-asymptotic guarantees.
We then apply our techniques to get new sample complexity bounds for Generative Adversarial Networks (GANs) depending on the intrinsic dimension of the data.
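For context, here is a minimal sketch of one well-known intrinsic dimension estimator (the two-nearest-neighbour maximum-likelihood estimate of Facco et al.); the paper introduces a different estimator with finite-sample guarantees, which this does not reproduce.

```python
# TwoNN-style MLE: ratios mu = r2/r1 of nearest-neighbour distances are
# approximately Pareto(d)-distributed, giving d_hat = N / sum(log mu).
import numpy as np
from scipy.spatial import cKDTree

def twonn_dimension(X):
    r = cKDTree(X).query(X, k=3)[0]    # columns: self (0), 1st NN, 2nd NN
    mu = r[:, 2] / r[:, 1]
    return len(X) / np.sum(np.log(mu))

rng = np.random.default_rng(2)
z = rng.normal(size=(1500, 2))         # 2-D manifold embedded in R^20
X = np.tanh(z @ rng.normal(size=(2, 20)))
print(twonn_dimension(X))              # close to 2, not 20
```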
arXiv Detail & Related papers (2021-06-08T00:05:39Z)
- A Local Similarity-Preserving Framework for Nonlinear Dimensionality Reduction with Neural Networks [56.068488417457935]
We propose a novel local nonlinear approach named Vec2vec for general purpose dimensionality reduction.
To train the neural network, we build the neighborhood similarity graph of the data matrix and define the context of data points.
Experiments on data classification and clustering over eight real datasets show that Vec2vec outperforms several classical dimensionality reduction methods under statistical hypothesis tests.
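A hedged sketch of the graph-building step (the function name and the use of cosine similarity are my assumptions, not the authors' code): connect each data point to its top-k most similar rows, yielding the "context" of a point.

```python
# Build a k-NN similarity graph over the rows of a data matrix.
import numpy as np

def knn_similarity_graph(X, k=5):
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    S = Xn @ Xn.T                      # cosine similarity between all rows
    np.fill_diagonal(S, -np.inf)       # exclude self-loops
    return np.argsort(-S, axis=1)[:, :k]

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 30))
contexts = knn_similarity_graph(X)     # contexts[i]: neighbours of point i
print(contexts[0])
```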
arXiv Detail & Related papers (2021-03-10T23:10:47Z)
- Learning a Deep Part-based Representation by Preserving Data Distribution [21.13421736154956]
Unsupervised dimensionality reduction is a commonly used technique for high-dimensional data recognition problems.
In this paper, a deep part-based representation is learned by preserving the data distribution; the novel algorithm is called Distribution Preserving Network Embedding.
The experimental results on the real-world data sets show that the proposed algorithm has good performance in terms of cluster accuracy and AMI.
arXiv Detail & Related papers (2020-09-17T12:49:36Z)
- Normal-bundle Bootstrap [2.741266294612776]
We present a method that generates new data which preserve the geometric structure of a given data set.
Inspired by algorithms for manifold learning and concepts in differential geometry, our method decomposes the underlying probability measure into a marginalized measure.
We apply our method to the inference of density ridge and related statistics, and data augmentation to reduce overfitting.
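A toy sketch of the decomposition idea with a known manifold (the unit circle stands in for one found by manifold learning; this is my assumption, not the authors' algorithm): split each point into its projection onto the manifold plus a normal residual, then bootstrap the residuals to generate new data with the same geometric structure.

```python
# Resample the normal-direction residuals around a known manifold.
import numpy as np

rng = np.random.default_rng(4)
theta = rng.uniform(0, 2 * np.pi, size=500)
r = 1.0 + 0.1 * rng.normal(size=500)           # noisy radii around the circle
X = np.c_[r * np.cos(theta), r * np.sin(theta)]

radii = np.linalg.norm(X, axis=1)
proj = X / radii[:, None]                      # projection onto the manifold
residuals = radii - 1.0                        # signed normal-direction offsets

boot = residuals[rng.integers(0, len(X), size=len(X))]  # bootstrap normal part
X_new = proj * (1.0 + boot)[:, None]           # new points, same geometry
print(X_new[:3])
```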
arXiv Detail & Related papers (2020-07-27T21:14:19Z)
- Distribution Approximation and Statistical Estimation Guarantees of Generative Adversarial Networks [82.61546580149427]
Generative Adversarial Networks (GANs) have achieved a great success in unsupervised learning.
This paper provides approximation and statistical guarantees of GANs for the estimation of data distributions with densities in a Hölder space.
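For reference, the Hölder class meant here is the standard one (the paper's exact normalization may differ): writing $\beta = s + \gamma$ with integer $s \ge 0$ and $\gamma \in (0, 1]$, the Hölder norm is

$$
\|f\|_{\mathcal{H}^{\beta}} = \max_{|\alpha| \le s}\, \sup_{x} \big|\partial^{\alpha} f(x)\big|
+ \max_{|\alpha| = s}\, \sup_{x \neq y} \frac{\big|\partial^{\alpha} f(x) - \partial^{\alpha} f(y)\big|}{\|x - y\|_2^{\gamma}},
$$

and $\mathcal{H}^{\beta}$ collects the functions for which this norm is finite.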
arXiv Detail & Related papers (2020-02-10T16:47:57Z)