Sinkhorn Natural Gradient for Generative Models
- URL: http://arxiv.org/abs/2011.04162v1
- Date: Mon, 9 Nov 2020 02:51:17 GMT
- Title: Sinkhorn Natural Gradient for Generative Models
- Authors: Zebang Shen and Zhenfu Wang and Alejandro Ribeiro and Hamed Hassani
- Abstract summary: We propose a novel Sinkhorn Natural Gradient (SiNG) algorithm which acts as a steepest descent method on the probability space endowed with the Sinkhorn divergence.
We show that the Sinkhorn information matrix (SIM), a key component of SiNG, has an explicit expression and can be evaluated accurately with complexity that scales logarithmically in the desired accuracy.
In our experiments, we quantitatively compare SiNG with state-of-the-art SGD-type solvers on generative tasks to demonstrate the efficiency and efficacy of our method.
- Score: 125.89871274202439
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of minimizing a functional over a parametric family
of probability measures, where the parameterization is characterized via a
push-forward structure. An important application of this problem is in training
generative adversarial networks. In this regard, we propose a novel Sinkhorn
Natural Gradient (SiNG) algorithm which acts as a steepest descent method on
the probability space endowed with the Sinkhorn divergence. We show that the
Sinkhorn information matrix (SIM), a key component of SiNG, has an explicit
expression and can be evaluated accurately in complexity that scales
logarithmically with respect to the desired accuracy. This is in sharp contrast
to existing natural gradient methods that can only be carried out
approximately. Moreover, in practical applications when only Monte-Carlo type
integration is available, we design an empirical estimator for SIM and provide
a stability analysis. In our experiments, we quantitatively compare SiNG with
state-of-the-art SGD-type solvers on generative tasks to demonstrate the
efficiency and efficacy of our method.
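To make the object SiNG descends on concrete, the sketch below computes the Sinkhorn divergence between two discrete measures using log-domain Sinkhorn iterations, whose linear convergence is what gives the logarithmic dependence on the desired accuracy. This is an illustrative sketch only, not the paper's implementation: the function names, the 1-D squared-distance cost, and the parameter choices (`eps`, `n_iters`) are assumptions for the example.

```python
import numpy as np
from scipy.special import logsumexp

def entropic_ot(a, b, C, eps=0.1, n_iters=300):
    """Entropy-regularized OT cost between discrete measures a (len n) and
    b (len m) with cost matrix C (n x m), via log-domain Sinkhorn updates.
    The iterates converge linearly, so accuracy delta needs O(log(1/delta))
    iterations."""
    loga, logb = np.log(a), np.log(b)
    f, g = np.zeros_like(a), np.zeros_like(b)
    for _ in range(n_iters):
        # Dual potential updates in the log domain (numerically stable).
        f = -eps * logsumexp((g[None, :] - C) / eps + logb[None, :], axis=1)
        g = -eps * logsumexp((f[:, None] - C) / eps + loga[:, None], axis=0)
    # At convergence the dual objective reduces to <f, a> + <g, b>.
    return f @ a + g @ b

def sinkhorn_divergence(a, b, x, y, eps=0.1):
    """Debiased Sinkhorn divergence S(a, b) = OT(a, b) - (OT(a, a) + OT(b, b)) / 2
    for weighted point clouds (a, x) and (b, y) on the real line."""
    Cxy = (x[:, None] - y[None, :]) ** 2
    Cxx = (x[:, None] - x[None, :]) ** 2
    Cyy = (y[:, None] - y[None, :]) ** 2
    return (entropic_ot(a, b, Cxy, eps)
            - 0.5 * entropic_ot(a, a, Cxx, eps)
            - 0.5 * entropic_ot(b, b, Cyy, eps))
```

The debiasing terms make the divergence vanish when the two measures coincide, which is what lets it serve as a geometry for steepest descent on the probability space.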
Related papers
- Smoothing Methods for Automatic Differentiation Across Conditional
Branches [0.0]
Smooth interpretation (SI) approximates the convolution of a program's output with a Gaussian kernel, thus smoothing its output in a principled manner.
We combine SI with automatic differentiation (AD) to efficiently compute gradients of smoothed programs.
We propose a novel Monte Carlo estimator that avoids the underlying assumptions by estimating the smoothed programs' gradients through a combination of AD and sampling.
arXiv Detail & Related papers (2023-10-05T15:08:37Z) - Simulation-based inference using surjective sequential neural likelihood
estimation [50.24983453990065]
Surjective Sequential Neural Likelihood estimation is a novel method for simulation-based inference.
By embedding the data in a low-dimensional space, SSNL solves several issues previous likelihood-based methods had when applied to high-dimensional data sets.
arXiv Detail & Related papers (2023-08-02T10:02:38Z) - Numerically Stable Sparse Gaussian Processes via Minimum Separation
using Cover Trees [57.67528738886731]
We study the numerical stability of scalable sparse approximations based on inducing points.
For low-dimensional tasks such as geospatial modeling, we propose an automated method for computing inducing points satisfying these conditions.
arXiv Detail & Related papers (2022-10-14T15:20:17Z) - Statistical Learning and Inverse Problems: An Stochastic Gradient
Approach [0.0]
Inverse problems are paramount in Science and Engineering.
In this paper, we consider the setup of Statistical Inverse Problems (SIP) and demonstrate how Stochastic Gradient Descent (SGD) algorithms can be used in the linear SIP setting.
arXiv Detail & Related papers (2022-09-29T17:42:01Z) - Experimental Design for Linear Functionals in Reproducing Kernel Hilbert
Spaces [102.08678737900541]
We provide algorithms for constructing bias-aware designs for linear functionals.
We derive non-asymptotic confidence sets for fixed and adaptive designs under sub-Gaussian noise.
arXiv Detail & Related papers (2022-05-26T20:56:25Z) - Scalable Stochastic Parametric Verification with Stochastic Variational
Smoothed Model Checking [1.5293427903448025]
Smoothed model checking (smMC) aims at inferring the satisfaction function over the entire parameter space from a limited set of observations.
In this paper, we exploit recent advances in probabilistic machine learning to push this limitation forward.
We compare the performance of smMC with that of SV-smMC in terms of scalability, computational efficiency, and accuracy of the reconstructed satisfaction function.
arXiv Detail & Related papers (2022-05-11T10:43:23Z) - Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box
Optimization Framework [100.36569795440889]
This work studies zeroth-order (ZO) optimization, which does not require first-order gradient information.
We show that with a careful design of coordinate importance sampling, the proposed ZO optimization method is efficient in terms of both iteration complexity and function query cost.
arXiv Detail & Related papers (2020-12-21T17:29:58Z) - Gaussian Process-based Min-norm Stabilizing Controller for
Control-Affine Systems with Uncertain Input Effects and Dynamics [90.81186513537777]
We propose a novel compound kernel that captures the control-affine nature of the problem.
We show that the resulting optimization problem is convex, and we call it the Gaussian Process-based Control Lyapunov Function Second-Order Cone Program (GP-CLF-SOCP).
arXiv Detail & Related papers (2020-11-14T01:27:32Z) - Isotropic SGD: a Practical Approach to Bayesian Posterior Sampling [18.64160180251004]
This work defines a unified mathematical framework to deepen our understanding of the role of stochastic gradient (SG) noise on the behavior of stochastic gradient Markov chain Monte Carlo (SGMCMC) algorithms.
Our formulation unlocks the design of a novel, practical approach to posterior sampling, which makes the SG noise isotropic using a fixed learning rate.
Our proposal is competitive with the state of the art in SGMCMC while being much more practical to use.
arXiv Detail & Related papers (2020-06-09T07:31:21Z) - Convergence and sample complexity of gradient methods for the model-free
linear quadratic regulator problem [27.09339991866556]
We study methods that search for the optimal controller of an unknown dynamical system by directly searching over the corresponding space of feedback gains.
We take a step toward demystifying the performance and efficiency of such methods by focusing on the gradient-flow dynamics over the set of stabilizing feedback gains; a similar result holds for the forward discretization of the ODE.
arXiv Detail & Related papers (2019-12-26T16:56:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.