Understanding Entropic Regularization in GANs
- URL: http://arxiv.org/abs/2111.01387v1
- Date: Tue, 2 Nov 2021 06:08:16 GMT
- Title: Understanding Entropic Regularization in GANs
- Authors: Daria Reshetova, Yikun Bai, Xiugang Wu, Ayfer Ozgur
- Abstract summary: We study the influence of regularization on the learned solution of Wasserstein distance.
We show that entropy regularization promotes sparsification of the solution, while replacing the Wasserstein distance with the Sinkhorn divergence recovers the unregularized solution.
We conclude that these regularization techniques can improve the quality of the generator learned from empirical data for a large class of distributions.
- Score: 5.448283690603358
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative Adversarial Networks are a popular method for learning
distributions from data by modeling the target distribution as a function of a
known distribution. The function, often referred to as the generator, is
optimized to minimize a chosen distance measure between the generated and
target distributions. One commonly used measure for this purpose is the
Wasserstein distance. However, the Wasserstein distance is hard to compute and
optimize, and in practice entropic regularization techniques are used to
improve numerical convergence. The influence of regularization on the learned
solution, however, remains poorly understood. In this paper, we study how
several popular entropic regularizations of Wasserstein distance impact the
solution in a simple benchmark setting where the generator is linear and the
target distribution is a high-dimensional Gaussian. We show that entropy
regularization promotes sparsification of the solution, while replacing the
Wasserstein distance with the Sinkhorn divergence recovers the unregularized
solution. Both regularization techniques remove the curse of dimensionality
suffered by the Wasserstein distance. We show that the optimal generator can be
learned to accuracy $\epsilon$ with $O(1/\epsilon^2)$ samples from the target
distribution. We thus conclude that these regularization techniques can improve
the quality of the generator learned from empirical data for a large class of
distributions.
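To make the compared objectives concrete, below is a minimal numpy sketch (not the authors' code) of the two regularized quantities discussed in the abstract: the entropy-regularized Wasserstein cost computed with Sinkhorn iterations, and the debiased Sinkhorn divergence, evaluated on samples from a Gaussian target and from a linear generator. The function names, sample sizes, and regularization strength `eps` are illustrative assumptions.

```python
import numpy as np

def entropic_ot(x, y, eps, n_iter=300):
    """Entropy-regularized OT cost between two empirical point clouds (Sinkhorn iterations)."""
    a = np.full(len(x), 1.0 / len(x))                     # uniform weights on generated points
    b = np.full(len(y), 1.0 / len(y))                     # uniform weights on target points
    C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)    # squared Euclidean cost matrix
    K = np.exp(-C / eps)                                  # Gibbs kernel
    u, v = np.ones_like(a), np.ones_like(b)
    for _ in range(n_iter):                               # Sinkhorn fixed-point updates
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]                       # entropic transport plan
    return (P * C).sum()

def sinkhorn_divergence(x, y, eps):
    """Debiased Sinkhorn divergence: subtracts the entropic self-transport bias terms."""
    return entropic_ot(x, y, eps) - 0.5 * entropic_ot(x, x, eps) - 0.5 * entropic_ot(y, y, eps)

rng = np.random.default_rng(0)
d, n = 10, 200
target = rng.standard_normal((n, d)) @ np.diag(np.linspace(1.0, 0.1, d))  # Gaussian target samples
G = 0.3 * rng.standard_normal((d, d))                                     # linear generator matrix
generated = rng.standard_normal((n, d)) @ G.T                             # generator output g(z) = G z
eps = 5.0   # kept large relative to typical costs so exp(-C / eps) stays well-conditioned
print(entropic_ot(generated, target, eps), sinkhorn_divergence(generated, target, eps))
```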
Related papers
- Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z)
- Adversarial Likelihood Estimation With One-Way Flows [44.684952377918904]
Generative Adversarial Networks (GANs) can produce high-quality samples, but do not provide an estimate of the probability density around the samples.
We show that our method converges faster, produces comparable sample quality to GANs with similar architecture, successfully avoids over-fitting to commonly used datasets and produces smooth low-dimensional latent representations of the training data.
arXiv Detail & Related papers (2023-07-19T10:26:29Z)
- Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution.
We propose the Constant Rate AIS algorithm and its efficient implementation for $\alpha$-divergences.
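For context, the sketch below shows plain Annealed Importance Sampling with a fixed linear temperature schedule and Gaussian random-walk Metropolis moves; it illustrates how AIS produces weighted samples from an unnormalized target, not the paper's Constant Rate variant or its $\alpha$-divergence implementation. The one-dimensional densities, step size, and function names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
log_target = lambda x: -0.5 * ((x - 3.0) ** 2) / 0.25   # unnormalized log-density of N(3, 0.5^2)
log_prior = lambda x: -0.5 * x ** 2                      # proposal distribution: N(0, 1)

def ais(n_samples=1000, n_steps=50, step=0.5):
    betas = np.linspace(0.0, 1.0, n_steps + 1)           # linear annealing schedule
    x = rng.standard_normal(n_samples)                   # exact draws from the proposal
    log_w = np.zeros(n_samples)
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # accumulate importance weights for the new intermediate density
        log_w += (b - b_prev) * (log_target(x) - log_prior(x))
        # one Metropolis step leaving the current intermediate density invariant
        prop = x + step * rng.standard_normal(n_samples)
        log_acc = (1 - b) * (log_prior(prop) - log_prior(x)) + b * (log_target(prop) - log_target(x))
        x = np.where(np.log(rng.random(n_samples)) < log_acc, prop, x)
    return x, log_w

samples, log_w = ais()
w = np.exp(log_w - log_w.max())
print((w * samples).sum() / w.sum())   # weighted estimate of the target mean (close to 3)
```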
arXiv Detail & Related papers (2023-06-27T08:15:28Z)
- Approximating a RUM from Distributions on k-Slates [88.32814292632675]
We give an algorithm that finds the RUM that best approximates the given distribution on average.
Our theoretical result can also be made practical: we obtain an algorithm that is effective and scales to real-world datasets.
arXiv Detail & Related papers (2023-05-22T17:43:34Z)
- Nonlinear Sufficient Dimension Reduction for Distribution-on-Distribution Regression [9.086237593805173]
We introduce a new approach to nonlinear sufficient dimension reduction in cases where both the predictor and the response are distributional data.
Our key step is to build universal kernels (cc-universal) on the metric spaces.
arXiv Detail & Related papers (2022-07-11T04:11:36Z)
- Robust Estimation for Nonparametric Families via Generative Adversarial Networks [92.64483100338724]
We provide a framework for designing Generative Adversarial Networks (GANs) to solve high dimensional robust statistics problems.
Our work extends these to robust mean estimation, second moment estimation, and robust linear regression.
In terms of techniques, our proposed GAN losses can be viewed as a smoothed and generalized Kolmogorov-Smirnov distance.
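For reference, the classical two-sample Kolmogorov-Smirnov distance that this statement smooths and generalizes can be computed as below; this is a standard baseline sketch, not the paper's GAN loss, and the sample sizes are illustrative.

```python
import numpy as np

def ks_distance(x, y):
    """Classical two-sample Kolmogorov-Smirnov distance between empirical CDFs."""
    grid = np.sort(np.concatenate([x, y]))                       # evaluate both CDFs on pooled points
    F_x = np.searchsorted(np.sort(x), grid, side="right") / len(x)
    F_y = np.searchsorted(np.sort(y), grid, side="right") / len(y)
    return np.abs(F_x - F_y).max()                               # sup-norm gap between empirical CDFs

rng = np.random.default_rng(0)
print(ks_distance(rng.standard_normal(1000), rng.standard_normal(1000) + 0.3))
```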
arXiv Detail & Related papers (2022-02-02T20:11:33Z)
- Projected Sliced Wasserstein Autoencoder-based Hyperspectral Images Anomaly Detection [42.585075865267946]
We propose the Projected Sliced Wasserstein (PSW) autoencoder-based anomaly detection method.
In particular, the computation-friendly eigen-decomposition method is leveraged to find the principal component for slicing the high-dimensional data.
Comprehensive experiments conducted on various real-world hyperspectral anomaly detection benchmarks demonstrate the superior performance of the proposed method.
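As a rough illustration of the slicing idea only (not the authors' PSW autoencoder), the sketch below takes principal directions from an eigen-decomposition of the code covariance and compares each one-dimensional slice to a reference sample via the closed-form sorted-sample Wasserstein distance. All names and sizes are assumptions.

```python
import numpy as np

def projected_sliced_w2(codes, reference):
    """Average squared 1-D Wasserstein distance over principal-component slices."""
    cov = np.cov(codes, rowvar=False)
    _, eigvecs = np.linalg.eigh(cov)             # principal directions used as slicing axes
    total = 0.0
    for v in eigvecs.T:
        a = np.sort(codes @ v)                   # sorted 1-D projections of the codes
        b = np.sort(reference @ v)               # sorted 1-D projections of the reference
        total += np.mean((a - b) ** 2)           # closed-form 1-D W2^2 for equal sample sizes
    return total / eigvecs.shape[1]

rng = np.random.default_rng(0)
codes = 1.5 * rng.standard_normal((512, 16))     # hypothetical latent codes
reference = rng.standard_normal((512, 16))       # reference sample, e.g. a Gaussian prior
print(projected_sliced_w2(codes, reference))
```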
arXiv Detail & Related papers (2021-12-20T09:21:02Z)
- Fast Approximation of the Sliced-Wasserstein Distance Using Concentration of Random Projections [19.987683989865708]
The Sliced-Wasserstein distance (SW) is being increasingly used in machine learning applications.
We propose a new perspective to approximate SW by making use of the concentration of measure phenomenon.
Our method does not require sampling a number of random projections, and is therefore both accurate and easy to use compared to the usual Monte Carlo approximation.
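For comparison, here is the usual Monte Carlo approximation of the Sliced-Wasserstein distance referred to above, averaging closed-form one-dimensional Wasserstein distances over randomly sampled projection directions; it is the baseline the paper avoids, not the proposed concentration-of-measure estimator. Sizes and names are illustrative.

```python
import numpy as np

def monte_carlo_sw2(x, y, n_projections=100, rng=np.random.default_rng(0)):
    """Monte Carlo estimate of the squared Sliced-Wasserstein distance."""
    d = x.shape[1]
    total = 0.0
    for _ in range(n_projections):
        theta = rng.standard_normal(d)
        theta /= np.linalg.norm(theta)          # random direction on the unit sphere
        a, b = np.sort(x @ theta), np.sort(y @ theta)
        total += np.mean((a - b) ** 2)          # closed-form 1-D W2^2 for equal sample sizes
    return total / n_projections

x = np.random.default_rng(1).standard_normal((256, 32))
y = np.random.default_rng(2).standard_normal((256, 32)) + 0.5
print(monte_carlo_sw2(x, y))
```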
arXiv Detail & Related papers (2021-06-29T13:56:19Z)
- Non-asymptotic convergence bounds for Wasserstein approximation using point clouds [0.0]
We show how to generate discrete data as if sampled from a model probability distribution.
We provide explicit upper bounds for the convergence of this type of algorithm.
arXiv Detail & Related papers (2021-06-15T06:53:08Z)
- Variational Transport: A Convergent Particle-Based Algorithm for Distributional Optimization [106.70006655990176]
Distributional optimization problems arise widely in machine learning and statistics.
We propose a novel particle-based algorithm, dubbed as variational transport, which approximately performs Wasserstein gradient descent.
We prove that when the objective function satisfies a functional version of the Polyak-Lojasiewicz (PL) condition (Polyak, 1963) together with a smoothness condition, variational transport converges linearly.
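As a toy illustration of the particle viewpoint only (not the paper's variational-transport gradient estimator), the sketch below runs Wasserstein gradient descent for the simplest objective, a potential energy F(mu) = E_mu[V(x)], whose gradient flow just moves each particle downhill on V. The potential, step size, and particle count are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
grad_V = lambda x: x - 2.0                        # gradient of V(x) = 0.5 * ||x - 2||^2

particles = rng.standard_normal((500, 2))         # initial distribution: particles from N(0, I)
for _ in range(200):                              # discrete Wasserstein gradient descent steps
    particles -= 0.05 * grad_V(particles)         # push every particle along -grad V
print(particles.mean(axis=0))                     # particles concentrate near the minimizer (2, 2)
```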
arXiv Detail & Related papers (2020-12-21T18:33:13Z)
- Debiasing Distributed Second Order Optimization with Surrogate Sketching and Scaled Regularization [101.5159744660701]
In distributed second order optimization, a standard strategy is to average many local estimates, each of which is based on a small sketch or batch of the data.
Here, we introduce a new technique for debiasing the local estimates, which leads to both theoretical and empirical improvements in the convergence rate of distributed second order methods.
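The sketch below shows the standard strategy described above, averaging local second-order (ridge/Newton) estimates computed on disjoint batches of the data; this is the biased baseline, not the paper's surrogate-sketching debiasing technique. Problem sizes, the regularization parameter, and names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, workers, lam = 4000, 20, 8, 0.1
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.1 * rng.standard_normal(n)

local_estimates = []
for Xb, yb in zip(np.array_split(X, workers), np.array_split(y, workers)):
    H = Xb.T @ Xb / len(Xb) + lam * np.eye(d)      # local regularized Hessian from one batch
    g = Xb.T @ yb / len(Xb)                        # local gradient term
    local_estimates.append(np.linalg.solve(H, g))  # local second-order (ridge) estimate
w_avg = np.mean(local_estimates, axis=0)           # averaged distributed estimate
print(np.linalg.norm(w_avg - w_true))              # error of the naive averaging baseline
```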
arXiv Detail & Related papers (2020-07-02T18:08:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.