Efficient Generative Modeling via Penalized Optimal Transport Network
- URL: http://arxiv.org/abs/2402.10456v2
- Date: Tue, 07 Jan 2025 10:03:08 GMT
- Title: Efficient Generative Modeling via Penalized Optimal Transport Network
- Authors: Wenhui Sophia Lu, Chenyang Zhong, Wing Hung Wong,
- Abstract summary: We propose a versatile deep generative model based on the marginally-penalized Wasserstein (MPW) distance.
Through the MPW distance, POTNet effectively leverages low-dimensional marginal information to guide the overall alignment of joint distributions.
We derive a non-asymptotic bound on the generalization error of the MPW loss and establish convergence rates of the generative distribution learned by POTNet.
- Score: 1.8079016557290342
- License:
- Abstract: The generation of synthetic data with distributions that faithfully emulate the underlying data-generating mechanism holds paramount significance. Wasserstein Generative Adversarial Networks (WGANs) have emerged as a prominent tool for this task; however, due to the delicate equilibrium of the minimax formulation and the instability of Wasserstein distance in high dimensions, WGAN often manifests the pathological phenomenon of mode collapse. This results in generated samples that converge to a restricted set of outputs and fail to adequately capture the tail behaviors of the true distribution. Such limitations can lead to serious downstream consequences. To this end, we propose the Penalized Optimal Transport Network (POTNet), a versatile deep generative model based on the marginally-penalized Wasserstein (MPW) distance. Through the MPW distance, POTNet effectively leverages low-dimensional marginal information to guide the overall alignment of joint distributions. Furthermore, our primal-based framework enables direct evaluation of the MPW distance, thus eliminating the need for a critic network. This formulation circumvents training instabilities inherent in adversarial approaches and avoids the need for extensive parameter tuning. We derive a non-asymptotic bound on the generalization error of the MPW loss and establish convergence rates of the generative distribution learned by POTNet. Our theoretical analysis together with extensive empirical evaluations demonstrate the superior performance of POTNet in accurately capturing underlying data structures, including their tail behaviors and minor modalities. Moreover, our model achieves orders of magnitude speedup during the sampling stage compared to state-of-the-art alternatives, which enables computationally efficient large-scale synthetic data generation.
Related papers
- Network reconstruction via the minimum description length principle [0.0]
We propose an alternative nonparametric regularization scheme based on hierarchical Bayesian inference and weight quantization.
Our approach follows the minimum description length (MDL) principle, and uncovers the weight distribution that allows for the most compression of the data.
We demonstrate that our scheme yields systematically increased accuracy in the reconstruction of both artificial and empirical networks.
arXiv Detail & Related papers (2024-05-02T05:35:09Z) - Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z) - A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime.
We quantify how the test error for overparameterized models achieving effectively zero training error depends on the quality of the implicit regularization imposed by e.g. the combination of model, parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z) - Tailoring Language Generation Models under Total Variation Distance [55.89964205594829]
The standard paradigm of neural language generation adopts maximum likelihood estimation (MLE) as the optimizing method.
We develop practical bounds to apply it to language generation.
We introduce the TaiLr objective that balances the tradeoff of estimating TVD.
arXiv Detail & Related papers (2023-02-26T16:32:52Z) - Estimating Regression Predictive Distributions with Sample Networks [17.935136717050543]
A common approach to model uncertainty is to choose a parametric distribution and fit the data to it using maximum likelihood estimation.
The chosen parametric form can be a poor fit to the data-generating distribution, resulting in unreliable uncertainty estimates.
We propose SampleNet, a flexible and scalable architecture for modeling uncertainty that avoids specifying a parametric form on the output distribution.
arXiv Detail & Related papers (2022-11-24T17:23:29Z) - Deep Generative Modeling on Limited Data with Regularization by
Nontransferable Pre-trained Models [32.52492468276371]
We propose regularized deep generative model (Reg-DGM) to reduce the variance of generative modeling with limited data.
Reg-DGM uses a pre-trained model to optimize a weighted sum of a certain divergence and the expectation of an energy function.
Empirically, with various pre-trained feature extractors and a data-dependent energy function, Reg-DGM consistently improves the generation performance of strong DGMs with limited data.
arXiv Detail & Related papers (2022-08-30T10:28:50Z) - Compound Density Networks for Risk Prediction using Electronic Health
Records [1.1786249372283562]
We propose an integrated end-to-end approach by utilizing a Compound Density Network (CDNet)
CDNet allows the imputation method and prediction model to be tuned together within a single framework.
We validate CDNet on the mortality prediction task on the MIMIC-III dataset.
arXiv Detail & Related papers (2022-08-02T09:04:20Z) - Truncated tensor Schatten p-norm based approach for spatiotemporal
traffic data imputation with complicated missing patterns [77.34726150561087]
We introduce four complicated missing patterns, including missing and three fiber-like missing cases according to the mode-drivenn fibers.
Despite nonity of the objective function in our model, we derive the optimal solutions by integrating alternating data-mputation method of multipliers.
arXiv Detail & Related papers (2022-05-19T08:37:56Z) - Inferential Wasserstein Generative Adversarial Networks [9.859829604054127]
We introduce a novel inferential Wasserstein GAN (iWGAN) model, which is a principled framework to fuse auto-encoders and WGANs.
The iWGAN greatly mitigates the symptom of mode collapse, speeds up the convergence, and is able to provide a measurement of quality check for each individual sample.
arXiv Detail & Related papers (2021-09-13T00:43:21Z) - Comparing Probability Distributions with Conditional Transport [63.11403041984197]
We propose conditional transport (CT) as a new divergence and approximate it with the amortized CT (ACT) cost.
ACT amortizes the computation of its conditional transport plans and comes with unbiased sample gradients that are straightforward to compute.
On a wide variety of benchmark datasets generative modeling, substituting the default statistical distance of an existing generative adversarial network with ACT is shown to consistently improve the performance.
arXiv Detail & Related papers (2020-12-28T05:14:22Z) - Distribution Approximation and Statistical Estimation Guarantees of
Generative Adversarial Networks [82.61546580149427]
Generative Adversarial Networks (GANs) have achieved a great success in unsupervised learning.
This paper provides approximation and statistical guarantees of GANs for the estimation of data distributions with densities in a H"older space.
arXiv Detail & Related papers (2020-02-10T16:47:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.