Tractable structured natural gradient descent using local
parameterizations
- URL: http://arxiv.org/abs/2102.07405v1
- Date: Mon, 15 Feb 2021 09:09:20 GMT
- Title: Tractable structured natural gradient descent using local
parameterizations
- Authors: Wu Lin, Frank Nielsen, Mohammad Emtiyaz Khan, Mark Schmidt
- Abstract summary: Natural-gradient descent on structured parameter spaces is computationally challenging due to complicated inverse Fisher-matrix computations.
We address this issue by using emphlocal- parameter coordinates.
We show results on a range of applications on deep learning, variational inference, and evolution strategies.
- Score: 43.51581051770027
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Natural-gradient descent on structured parameter spaces (e.g., low-rank
covariances) is computationally challenging due to complicated inverse
Fisher-matrix computations. We address this issue for optimization, inference,
and search problems by using \emph{local-parameter coordinates}. Our method
generalizes an existing evolutionary-strategy method, recovers Newton and
Riemannian-gradient methods as special cases, and also yields new tractable
natural-gradient algorithms for learning flexible covariance structures of
Gaussian and Wishart-based distributions. We show results on a range of
applications on deep learning, variational inference, and evolution strategies.
Our work opens a new direction for scalable structured geometric methods via
local parameterizations.
Related papers
- Gradient-free neural topology optimization [0.0]
gradient-free algorithms require many more iterations to converge when compared to gradient-based algorithms.
This has made them unviable for topology optimization due to the high computational cost per iteration and high dimensionality of these problems.
We propose a pre-trained neural reparameterization strategy that leads to at least one order of magnitude decrease in iteration count when optimizing the designs in latent space.
arXiv Detail & Related papers (2024-03-07T23:00:49Z) - FORML: A Riemannian Hessian-free Method for Meta-learning on Stiefel Manifolds [4.757859522106933]
This paper introduces a Hessian-free approach that uses a first-order approximation of derivatives on the Stiefel manifold.
Our method significantly reduces the computational load and memory footprint.
arXiv Detail & Related papers (2024-02-28T10:57:30Z) - Unnatural Algorithms in Machine Learning [0.0]
We show that optimization algorithms with this property can be viewed as discrete approximations of natural gradient descent.
We introduce a simple method of introducing this naturality more generally and examine a number of popular machine learning training algorithms.
arXiv Detail & Related papers (2023-12-07T22:43:37Z) - Stochastic Gradient Descent for Gaussian Processes Done Right [86.83678041846971]
We show that when emphdone right -- by which we mean using specific insights from optimisation and kernel communities -- gradient descent is highly effective.
We introduce a emphstochastic dual descent algorithm, explain its design in an intuitive manner and illustrate the design choices.
Our method places Gaussian process regression on par with state-of-the-art graph neural networks for molecular binding affinity prediction.
arXiv Detail & Related papers (2023-10-31T16:15:13Z) - The Dynamics of Riemannian Robbins-Monro Algorithms [101.29301565229265]
We propose a family of Riemannian algorithms generalizing and extending the seminal approximation framework of Robbins and Monro.
Compared to their Euclidean counterparts, Riemannian algorithms are much less understood due to lack of a global linear structure on the manifold.
We provide a general template of almost sure convergence results that mirrors and extends the existing theory for Euclidean Robbins-Monro schemes.
arXiv Detail & Related papers (2022-06-14T12:30:11Z) - Structured second-order methods via natural gradient descent [43.51581051770027]
We propose new structured second-order methods and structured adaptive-gradient methods.
Natural-gradient descent is an attractive approach to design algorithms in many settings.
arXiv Detail & Related papers (2021-07-22T19:03:53Z) - Fractal Structure and Generalization Properties of Stochastic
Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded on the complexity' of the fractal structure that underlies its generalization measure.
We further specialize our results to specific problems (e.g., linear/logistic regression, one hidden/layered neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z) - AI-SARAH: Adaptive and Implicit Stochastic Recursive Gradient Methods [7.486132958737807]
We present an adaptive variance reduced method with an implicit approach for adaptivity.
We provide convergence guarantees for finite-sum minimization problems and show a faster convergence than SARAH can be achieved if local geometry permits.
This algorithm implicitly computes step-size and efficiently estimates local Lipschitz smoothness of functions.
arXiv Detail & Related papers (2021-02-19T01:17:15Z) - Local optimization on pure Gaussian state manifolds [63.76263875368856]
We exploit insights into the geometry of bosonic and fermionic Gaussian states to develop an efficient local optimization algorithm.
The method is based on notions of descent gradient attuned to the local geometry.
We use the presented methods to collect numerical and analytical evidence for the conjecture that Gaussian purifications are sufficient to compute the entanglement of purification of arbitrary mixed Gaussian states.
arXiv Detail & Related papers (2020-09-24T18:00:36Z) - Stochastic Flows and Geometric Optimization on the Orthogonal Group [52.50121190744979]
We present a new class of geometrically-driven optimization algorithms on the orthogonal group $O(d)$.
We show that our methods can be applied in various fields of machine learning including deep, convolutional and recurrent neural networks, reinforcement learning, flows and metric learning.
arXiv Detail & Related papers (2020-03-30T15:37:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.