Sinkhorn Natural Gradient for Generative Models
- URL: http://arxiv.org/abs/2011.04162v1
- Date: Mon, 9 Nov 2020 02:51:17 GMT
- Title: Sinkhorn Natural Gradient for Generative Models
- Authors: Zebang Shen and Zhenfu Wang and Alejandro Ribeiro and Hamed Hassani
- Abstract summary: We propose a novel Sinkhorn Natural Gradient (SiNG) algorithm which acts as a steepest descent method on the probability space endowed with the Sinkhorn divergence.
We show that the Sinkhorn information matrix (SIM), a key component of SiNG, has an explicit expression and can be evaluated accurately in complexity that scales logarithmically with respect to the desired accuracy.
In our experiments, we quantitatively compare SiNG with state-of-the-art SGD-type solvers on generative tasks to demonstrate the efficiency and efficacy of our method.
- Score: 125.89871274202439
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of minimizing a functional over a parametric family
of probability measures, where the parameterization is characterized via a
push-forward structure. An important application of this problem is in training
generative adversarial networks. In this regard, we propose a novel Sinkhorn
Natural Gradient (SiNG) algorithm which acts as a steepest descent method on
the probability space endowed with the Sinkhorn divergence. We show that the
Sinkhorn information matrix (SIM), a key component of SiNG, has an explicit
expression and can be evaluated accurately in complexity that scales
logarithmically with respect to the desired accuracy. This is in sharp contrast
to existing natural gradient methods that can only be carried out
approximately. Moreover, in practical applications where only Monte-Carlo type
integration is available, we design an empirical estimator for SIM and provide
a stability analysis. In our experiments, we quantitatively compare SiNG with
state-of-the-art SGD-type solvers on generative tasks to demonstrate the
efficiency and efficacy of our method.
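As a rough illustration of the scheme above, the following minimal sketch performs one SiNG-style step on a toy linear push-forward model. It is not the paper's implementation: the generator, the fixed-iteration log-domain Sinkhorn solver, the empirical SIM (taken here as the Hessian of the Sinkhorn divergence in its second argument at theta' = theta), and the damping and step size are all illustrative assumptions.

```python
import jax
import jax.numpy as jnp
from jax.scipy.special import logsumexp

def sinkhorn_cost(x, y, eps=0.1, iters=50):
    # Entropic OT between uniform empirical measures (log-domain updates).
    C = jnp.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    n, m = C.shape
    f, g = jnp.zeros(n), jnp.zeros(m)
    for _ in range(iters):
        f = -eps * logsumexp((g[None, :] - C) / eps - jnp.log(m), axis=1)
        g = -eps * logsumexp((f[:, None] - C) / eps - jnp.log(n), axis=0)
    P = jnp.exp((f[:, None] + g[None, :] - C) / eps) / (n * m)  # transport plan
    return jnp.sum(P * C)

def sinkhorn_div(x, y):
    # Debiased Sinkhorn divergence between the two empirical measures.
    return sinkhorn_cost(x, y) - 0.5 * sinkhorn_cost(x, x) - 0.5 * sinkhorn_cost(y, y)

def generator(theta, z):
    # Toy push-forward map g_theta(z) = z W + b (an assumption for this sketch).
    W, b = theta[:4].reshape(2, 2), theta[4:]
    return z @ W + b

z = jax.random.normal(jax.random.PRNGKey(0), (128, 2))            # fixed latents
data = 2.0 + jax.random.normal(jax.random.PRNGKey(1), (128, 2))   # target samples
theta = jnp.array([1.0, 0.0, 0.0, 1.0, 0.0, 0.0])

grad = jax.grad(lambda t: sinkhorn_div(generator(t, z), data))(theta)

# Empirical SIM: Hessian of t -> S(mu_theta, mu_t) evaluated at t = theta.
SIM = jax.hessian(lambda t: sinkhorn_div(generator(theta, z), generator(t, z)))(theta)

# Damped natural-gradient step: solve (SIM + lam I) d = grad, then descend.
d = jnp.linalg.solve(SIM + 1e-3 * jnp.eye(theta.size), grad)
theta = theta - 0.5 * d
```

The damped linear solve mirrors the usual natural-gradient practice of regularizing the information matrix before inverting it.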
Related papers
- Nonparametric estimation of Hawkes processes with RKHSs [1.775610745277615]
This paper addresses nonparametric estimation of nonlinear Hawkes processes, where the interaction functions are assumed to lie in a reproducing kernel Hilbert space (RKHS).
Motivated by applications in neuroscience, the model allows complex interaction functions that can express excitatory and inhibitory effects, as well as a combination of both.
Experiments show that the method achieves better performance than related nonparametric estimation techniques and suits neuronal applications.
arXiv Detail & Related papers (2024-11-01T14:26:50Z)
- Adaptive Step Sizes for Preconditioned Stochastic Gradient Descent [0.3831327965422187]
This paper proposes a novel approach to adaptive step sizes in stochastic gradient descent (SGD).
It uses quantities identified as numerically traceable during training: the Lipschitz constant of the gradients and a notion of the local variance in the search directions.
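As a rough sketch of this kind of rule (the exact update is an assumption, not the paper's formula), the loop below sets the step from a secant estimate of the local Lipschitz constant and damps it by a running variance of gradient differences:

```python
import jax
import jax.numpy as jnp

def objective(x):
    # Toy ill-conditioned quadratic standing in for a training loss.
    return 0.5 * jnp.sum(jnp.arange(1.0, x.size + 1) * x ** 2)

grad_fn = jax.grad(objective)

x_prev = jnp.ones(5)
g_prev = grad_fn(x_prev)
x = x_prev - 0.1 * g_prev          # one bootstrap step with a fixed step size
var_ema, beta = jnp.zeros(5), 0.9

for _ in range(100):
    g = grad_fn(x)
    # Secant estimate of the local Lipschitz constant of the gradient.
    L = jnp.linalg.norm(g - g_prev) / (jnp.linalg.norm(x - x_prev) + 1e-12)
    # Running variance proxy for the search directions damps noisy steps.
    var_ema = beta * var_ema + (1.0 - beta) * (g - g_prev) ** 2
    step = 1.0 / (L + jnp.sqrt(jnp.sum(var_ema)) + 1e-12)
    x_prev, g_prev = x, g
    x = x - step * g
```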
arXiv Detail & Related papers (2023-11-28T17:03:56Z)
- Smoothing Methods for Automatic Differentiation Across Conditional Branches [0.0]
Smooth interpretation (SI) approximates the convolution of a program's output with a Gaussian kernel, thus smoothing its output in a principled manner.
We combine SI with automatic differentiation (AD) to efficiently compute gradients of smoothed programs.
We propose a novel Monte Carlo estimator that avoids SI's underlying approximation assumptions by estimating the smoothed programs' gradients through a combination of AD and sampling.
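The basic AD-plus-sampling combination can be illustrated as follows. This hedged sketch uses a continuous (but non-differentiable) branching program, for which averaging per-sample AD gradients is an unbiased estimate of the smoothed program's gradient; programs whose branches introduce jumps need the paper's corrected estimator, which this sketch does not implement.

```python
import jax
import jax.numpy as jnp

def program(x):
    # Continuous but non-differentiable branch: slope 1 left of x = 1, slope 2x right.
    return jnp.where(x > 1.0, x * x, x)

def smoothed_grad(x, key, sigma=0.3, n=100_000):
    # For a continuous program, d/dx E[f(x + sigma*eps)] = E[f'(x + sigma*eps)],
    # so averaging per-sample AD gradients estimates the smoothed gradient.
    eps = jax.random.normal(key, (n,))
    return jnp.mean(jax.vmap(jax.grad(program))(x + sigma * eps))

# At the kink, one-sided AD gives 1 or 2; the smoothed gradient (~1.74 here)
# blends the branches according to the Gaussian weight on each side.
print(smoothed_grad(jnp.array(1.0), jax.random.PRNGKey(0)))
```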
arXiv Detail & Related papers (2023-10-05T15:08:37Z)
- Simulation-based inference using surjective sequential neural likelihood estimation [50.24983453990065]
Surjective Sequential Neural Likelihood (SSNL) estimation is a novel method for simulation-based inference.
By embedding the data in a low-dimensional space, SSNL solves several issues previous likelihood-based methods had when applied to high-dimensional data sets.
arXiv Detail & Related papers (2023-08-02T10:02:38Z)
- Numerically Stable Sparse Gaussian Processes via Minimum Separation using Cover Trees [57.67528738886731]
We study the numerical stability of scalable sparse approximations based on inducing points.
For low-dimensional tasks such as geospatial modeling, we propose an automated method for computing inducing points that satisfy the stability conditions we identify.
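As a simple stand-in for that construction (a greedy epsilon-net, not the paper's cover-tree algorithm), the sketch below keeps an inducing point only if it is at least r away from every point kept so far, enforcing a pairwise minimum separation:

```python
import jax
import jax.numpy as jnp

def min_separation_subset(x, r):
    # Greedily keep a point only if it is at least r from every kept point,
    # so the returned set has pairwise separation >= r.
    kept = [x[0]]
    for p in x[1:]:
        if jnp.min(jnp.linalg.norm(jnp.stack(kept) - p, axis=1)) >= r:
            kept.append(p)
    return jnp.stack(kept)

inputs = jax.random.normal(jax.random.PRNGKey(0), (500, 2))  # training inputs
inducing = min_separation_subset(inputs, r=0.5)              # candidate inducing points
print(inducing.shape)
```

For stationary kernels, a pairwise separation bound of this kind keeps the inducing-point Gram matrix well conditioned, which is the stability property at stake.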
arXiv Detail & Related papers (2022-10-14T15:20:17Z)
- Experimental Design for Linear Functionals in Reproducing Kernel Hilbert Spaces [102.08678737900541]
We provide algorithms for constructing bias-aware designs for linear functionals.
We derive non-asymptotic confidence sets for fixed and adaptive designs under sub-Gaussian noise.
arXiv Detail & Related papers (2022-05-26T20:56:25Z)
- Scalable Stochastic Parametric Verification with Stochastic Variational Smoothed Model Checking [1.5293427903448025]
Smoothed model checking (smMC) aims at inferring the satisfaction function over the entire parameter space from a limited set of observations.
In this paper, we exploit recent advances in probabilistic machine learning to overcome this limitation, yielding stochastic variational smoothed model checking (SV-smMC).
We compare the performance of smMC against that of SV-smMC in terms of scalability, computational efficiency, and the accuracy of the reconstructed satisfaction function.
arXiv Detail & Related papers (2022-05-11T10:43:23Z)
- Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework [100.36569795440889]
This work studies zeroth-order (ZO) optimization, which does not require first-order gradient information.
We show that with a careful design of coordinate importance sampling, the proposed ZO optimization method is efficient in terms of both iteration complexity and function query cost.
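The flavor of such a coordinate-importance-sampled ZO estimator can be sketched as follows; the sampling distribution, reweighting, and step size here are illustrative assumptions rather than the paper's design.

```python
import jax
import jax.numpy as jnp

def f(x):
    # Black-box objective: only function-value queries are available.
    return jnp.sum(jnp.arange(1.0, x.size + 1) * x ** 2)

def zo_grad(x, probs, key, k=3, mu=1e-4):
    # Sample k coordinates (with replacement) from probs, estimate each sampled
    # partial derivative with two queries, and reweight to keep the estimate unbiased.
    idx = jax.random.choice(key, x.size, shape=(k,), p=probs)
    g = jnp.zeros(x.size)
    for i in idx:
        e = jnp.zeros(x.size).at[i].set(1.0)
        di = (f(x + mu * e) - f(x - mu * e)) / (2.0 * mu)
        g = g.at[i].add(di / (k * probs[i]))
    return g

x = jnp.ones(5)
uniform = jnp.full(5, 0.2)
probs = uniform
for t in range(300):
    g = zo_grad(x, probs, jax.random.PRNGKey(t))
    w = jnp.abs(g) + 1e-2                      # favor large-gradient coordinates
    probs = 0.5 * uniform + 0.5 * w / jnp.sum(w)
    x = x - 0.01 * g
```

Mixing the adapted probabilities with the uniform distribution keeps every coordinate's sampling probability bounded away from zero, which keeps the importance weights bounded.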
arXiv Detail & Related papers (2020-12-21T17:29:58Z)
- Gaussian Process-based Min-norm Stabilizing Controller for Control-Affine Systems with Uncertain Input Effects and Dynamics [90.81186513537777]
We propose a novel compound kernel that captures the control-affine nature of the problem.
We show that the resulting optimization problem is convex, and we call it the Gaussian Process-based Control Lyapunov Function Second-Order Cone Program (GP-CLF-SOCP).
arXiv Detail & Related papers (2020-11-14T01:27:32Z)
- Convergence and sample complexity of gradient methods for the model-free linear quadratic regulator problem [27.09339991866556]
We consider methods that seek the optimal controller for an unknown dynamical system by directly searching over the corresponding space of feedback gains.
We take a step towards demystifying the performance and efficiency of such methods by focusing on the gradient-flow dynamics over the set of stabilizing feedback gains; a similar result holds for the forward discretization of the ODE.
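A minimal sketch of the forward-Euler discretized gradient flow on a discrete-time LQR instance follows. It assumes a known model so the cost gradient can be computed exactly via Lyapunov solves; the model-free setting studied here would replace lqr_grad with a zeroth-order estimate, and the system, gain, and step size are illustrative assumptions.

```python
import jax.numpy as jnp

A = jnp.array([[1.0, 0.1], [0.0, 1.0]])       # assumed known for this sketch
B = jnp.array([[0.0], [0.1]])
Q, R, Sigma0 = jnp.eye(2), jnp.eye(1), jnp.eye(2)

def lyap(M, C, iters=500):
    # Fixed-point solve of X = C + M X M^T (valid when M is stable).
    X = C
    for _ in range(iters):
        X = C + M @ X @ M.T
    return X

def lqr_grad(K):
    # Exact LQR cost gradient via two Lyapunov solves (model-based stand-in).
    Acl = A - B @ K
    P = lyap(Acl.T, Q + K.T @ R @ K)           # cost-to-go matrix
    S = lyap(Acl, Sigma0)                      # accumulated state covariance
    return 2.0 * ((R + B.T @ P @ B) @ K - B.T @ P @ A) @ S

K = jnp.array([[1.0, 2.0]])                    # initial stabilizing feedback gain
for _ in range(200):                           # forward-Euler step of the gradient flow
    K = K - 1e-3 * lqr_grad(K)
```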
arXiv Detail & Related papers (2019-12-26T16:56:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.