Automatic differentiation for Riemannian optimization on low-rank matrix
and tensor-train manifolds
- URL: http://arxiv.org/abs/2103.14974v1
- Date: Sat, 27 Mar 2021 19:56:00 GMT
- Title: Automatic differentiation for Riemannian optimization on low-rank matrix
and tensor-train manifolds
- Authors: Alexander Novikov, Maxim Rakhuba, Ivan Oseledets
- Abstract summary: In scientific computing and machine learning applications, matrices and more general multidimensional arrays (tensors) can often be approximated with the help of low-rank decompositions.
One popular tool for finding such low-rank approximations is Riemannian optimization.
- Score: 71.94111815357064
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In scientific computing and machine learning applications, matrices and more
general multidimensional arrays (tensors) can often be approximated with the
help of low-rank decompositions. Since matrices and tensors of fixed rank form
smooth Riemannian manifolds, one popular tool for finding low-rank
approximations is Riemannian optimization. Nevertheless, efficient
implementation of Riemannian gradients and Hessians, required in Riemannian
optimization algorithms, can be a nontrivial task in practice. Moreover, in
some cases, analytic formulas are not even available. In this paper, we build
upon automatic differentiation and propose a method that, given an
implementation of the function to be minimized, efficiently computes Riemannian
gradients and matrix-vector products of the approximate Riemannian Hessian
with a given vector.
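A minimal, illustrative sketch (not the authors' implementation) of the underlying idea on the fixed-rank matrix manifold: the Euclidean gradient produced by automatic differentiation is projected onto the tangent space to obtain the Riemannian gradient, and an approximate Riemannian Hessian-vector product is obtained by pushing a tangent vector through a jvp of the gradient and projecting back. Dense matrices and the helper names (`tangent_projection`, `riemannian_grad`, `approx_hess_vec`) are assumptions made for this example only; the paper's actual contribution is to compute such quantities efficiently within low-rank and tensor-train representations, without forming full arrays.

```python
# Illustrative sketch with JAX autodiff; curvature correction terms and
# factored (low-rank / TT) arithmetic are deliberately omitted.
import jax
import jax.numpy as jnp

def tangent_projection(X_factors, Z):
    """Project an ambient matrix Z onto the tangent space of the fixed-rank
    manifold at X = U @ diag(s) @ V.T (U, V have orthonormal columns)."""
    U, _, V = X_factors
    PU = U @ (U.T @ Z)                      # column-space component
    PV = (Z @ V) @ V.T                      # row-space component
    return PU + PV - U @ (U.T @ Z @ V) @ V.T

def riemannian_grad(f, X_factors):
    """Riemannian gradient = tangent-space projection of the Euclidean
    gradient, where the Euclidean gradient comes from autodiff."""
    U, s, V = X_factors
    X = (U * s) @ V.T
    egrad = jax.grad(f)(X)
    return tangent_projection(X_factors, egrad)

def approx_hess_vec(f, X_factors, xi):
    """Approximate Riemannian Hessian-vector product: project xi to the
    tangent space, apply the Euclidean Hessian via a jvp of the gradient,
    and project the result back (a common simplification)."""
    U, s, V = X_factors
    X = (U * s) @ V.T
    xi_t = tangent_projection(X_factors, xi)
    _, hvp = jax.jvp(jax.grad(f), (X,), (xi_t,))
    return tangent_projection(X_factors, hvp)

# Hypothetical usage: approximate a given matrix A by a rank-r matrix.
key_a, key_x = jax.random.split(jax.random.PRNGKey(0))
A = jax.random.normal(key_a, (6, 5))
r = 2
U, s, Vt = jnp.linalg.svd(jax.random.normal(key_x, (6, 5)), full_matrices=False)
X_factors = (U[:, :r], s[:r], Vt[:r, :].T)

f = lambda X: 0.5 * jnp.sum((X - A) ** 2)
g = riemannian_grad(f, X_factors)
Hg = approx_hess_vec(f, X_factors, g)
print(g.shape, Hg.shape)
```

In the setting of the paper, the projection and the gradient/Hessian products would be carried out directly on the low-rank or tensor-train factors, so the cost scales with the rank rather than with the full array size.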
Related papers
- Riemannian coordinate descent algorithms on matrix manifolds [12.05722932030768]
We provide a general framework for developing computationally efficient coordinate descent (CD) algorithms on matrix manifolds.
We propose CD algorithms for various manifolds such as the Stiefel, Grassmann, (generalized) hyperbolic, symplectic, and symmetric positive (semi)definite manifolds.
We analyze their convergence and complexity, and empirically illustrate their efficacy in several applications.
arXiv Detail & Related papers (2024-06-04T11:37:11Z) - Riemannian Bilevel Optimization [35.42472057648458]
We focus in particular on batch and gradient-based methods, with the explicit goal of avoiding second-order information.
We propose and analyze $\mathrm{RF2SA}$, a method that leverages first-order gradient information.
We provide explicit convergence rates for reaching $\epsilon$-stationary points under various setups.
arXiv Detail & Related papers (2024-05-22T20:49:01Z) - FORML: A Riemannian Hessian-free Method for Meta-learning on Stiefel Manifolds [4.757859522106933]
This paper introduces a Hessian-free approach that uses a first-order approximation of derivatives on the Stiefel manifold.
Our method significantly reduces the computational load and memory footprint.
arXiv Detail & Related papers (2024-02-28T10:57:30Z) - Riemannian stochastic optimization methods avoid strict saddle points [68.80251170757647]
We show that policies under study avoid strict saddle points / submanifolds with probability 1.
This result provides an important sanity check as it shows that, almost always, the limit state of an algorithm can only be a local minimizer.
arXiv Detail & Related papers (2023-11-04T11:12:24Z) - Low-complexity subspace-descent over symmetric positive definite
manifold [9.346050098365648]
We develop low-complexity algorithms for the minimization of functions over the symmetric positive definite (SPD) manifold.
The proposed approach utilizes carefully chosen subspaces that allow the update to be written as a product of the Cholesky factor of the iterate and a sparse matrix.
arXiv Detail & Related papers (2023-05-03T11:11:46Z) - On a class of geodesically convex optimization problems solved via
Euclidean MM methods [50.428784381385164]
We show how a class of geodesically convex problems arising in statistics and machine learning can be written as a difference of Euclidean convex functions, which allows them to be solved with Euclidean majorization-minimization (MM) methods.
This broadens the reach of standard Euclidean optimization tools.
arXiv Detail & Related papers (2022-06-22T23:57:40Z) - Geometry-aware Bayesian Optimization in Robotics using Riemannian
Matérn Kernels [64.62221198500467]
We show how to implement geometry-aware kernels for Bayesian optimization.
This technique can be used for control parameter tuning, parametric policy adaptation, and structure design in robotics.
arXiv Detail & Related papers (2021-11-02T09:47:22Z) - Optimization on manifolds: A symplectic approach [127.54402681305629]
We propose a dissipative extension of Dirac's theory of constrained Hamiltonian systems as a general framework for solving optimization problems.
Our class of (accelerated) algorithms is not only simple and efficient but also applicable to a broad range of contexts.
arXiv Detail & Related papers (2021-07-23T13:43:34Z) - Matérn Gaussian processes on Riemannian manifolds [81.15349473870816]
We show how to generalize the widely-used Matérn class of Gaussian processes.
We also extend the generalization from the Matérn to the widely-used squared exponential process.
arXiv Detail & Related papers (2020-06-17T21:05:42Z) - Riemannian Stochastic Proximal Gradient Methods for Nonsmooth
Optimization over the Stiefel Manifold [7.257751371276488]
R-ProxSGD and R-ProxSPB are generalizations of proximal SGD and proximal SpiderBoost.
The R-ProxSPB algorithm finds an $\epsilon$-stationary point with $O(\epsilon^{-3})$ IFOs in the online case, and $O(n+\sqrt{n}\epsilon^{-3})$ IFOs in the finite-sum case.
arXiv Detail & Related papers (2020-05-03T23:41:35Z)