Streamlining in the Riemannian Realm: Efficient Riemannian Optimization
with Loopless Variance Reduction
- URL: http://arxiv.org/abs/2403.06677v1
- Date: Mon, 11 Mar 2024 12:49:37 GMT
- Authors: Yury Demidovich, Grigory Malinovsky, Peter Richtárik
- Abstract summary: This study focuses on the crucial variance reduction mechanism used in both Euclidean and Riemannian settings.
Motivated by Euclidean methods, we introduce the Riemannian Loopless SVRG (R-LSVRG) and PAGE (R-PAGE) methods, which replace the outer loop with probabilistic gradient computation triggered by a coin flip.
Using R-PAGE as a framework, we demonstrate its applicability to various important settings.
- Score: 4.578425862931332
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this study, we investigate stochastic optimization on Riemannian
manifolds, focusing on the crucial variance reduction mechanism used in both
Euclidean and Riemannian settings. Riemannian variance-reduced methods usually
involve a double-loop structure, computing a full gradient at the start of each
loop. Determining the optimal inner loop length is challenging in practice, as
it depends on strong convexity or smoothness constants, which are often unknown
or hard to estimate. Motivated by Euclidean methods, we introduce the
Riemannian Loopless SVRG (R-LSVRG) and PAGE (R-PAGE) methods. These methods
replace the outer loop with probabilistic gradient computation triggered by a
coin flip in each iteration, ensuring simpler proofs, efficient hyperparameter
selection, and sharp convergence guarantees. Using R-PAGE as a framework for
non-convex Riemannian optimization, we demonstrate its applicability to various
important settings. For example, we derive Riemannian MARINA (R-MARINA) for
distributed settings with communication compression, providing the best
theoretical communication complexity guarantees for non-convex distributed
optimization over Riemannian manifolds. Experimental results support our
theoretical findings.
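To make the coin-flip mechanism concrete, below is a minimal sketch of a loopless, PAGE-style gradient estimator on the unit sphere, applied to a toy leading-eigenvector problem. The projection retraction, projection-based vector transport, and all parameter choices are illustrative assumptions for exposition, not the paper's exact R-PAGE method; the point is that a single Bernoulli draw per iteration decides between a full-gradient refresh and a cheap recursive correction, so no inner-loop length needs to be tuned.

```python
import numpy as np

# Hedged sketch: loopless (coin-flip) variance reduction on the unit sphere,
# minimizing f(x) = -(1/2n) * sum_i (a_i^T x)^2 over ||x|| = 1, whose minimizer
# is the top eigenvector of (1/n) A^T A. The retraction and transport choices
# are illustrative assumptions, not the authors' implementation.

def proj(x, v):
    """Project v onto the tangent space of the sphere at x."""
    return v - np.dot(x, v) * x

def retract(x, v):
    """Metric-projection retraction: step along v, renormalize onto the sphere."""
    y = x + v
    return y / np.linalg.norm(y)

def transport(y, v):
    """Projection-based vector transport of v into the tangent space at y."""
    return proj(y, v)

def rgrad(A, x, i=None):
    """Riemannian gradient of f (i=None) or of the i-th component function."""
    a = A if i is None else A[[i]]
    return proj(x, -(a.T @ (a @ x)) / len(a))

def r_page(A, x0, p=0.2, step=0.2, iters=1000, seed=0):
    rng = np.random.default_rng(seed)
    x, g = x0, rgrad(A, x0)                  # start from one full gradient
    for _ in range(iters):
        x_new = retract(x, -step * g)
        if rng.random() < p:                 # coin flip replaces the outer loop:
            g = rgrad(A, x_new)              # occasional full-gradient refresh
        else:                                # cheap single-sample correction; the
            i = rng.integers(len(A))         # old estimate is transported to x_new
            g = (transport(x_new, g)
                 + rgrad(A, x_new, i) - transport(x_new, rgrad(A, x, i)))
        x = x_new
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((200, 10))
x = r_page(A, np.eye(10)[0])
top = np.linalg.eigh(A.T @ A / len(A))[1][:, -1]
print(abs(x @ top))  # should be close to 1 if x aligned with the top eigenvector
```

Setting p = 1 recovers full Riemannian gradient descent, and the expected per-iteration cost interpolates between one sample and a full pass, which is the hyperparameter-selection convenience the abstract refers to.

The same coin-flip template underlies the distributed R-MARINA result: with probability p every node communicates its full local gradient, and otherwise only a compressed gradient difference. A hedged Euclidean-style sketch of one communication round follows (the Riemannian version would additionally transport vectors between tangent spaces; `rand_k` is an assumed unbiased sparsifier, not an API from the paper):

```python
import numpy as np

def rand_k(v, k, rng):
    """Unbiased rand-k compressor: keep k random coordinates, rescale by d/k."""
    out = np.zeros_like(v)
    idx = rng.choice(len(v), size=k, replace=False)
    out[idx] = (len(v) / k) * v[idx]
    return out

def marina_round(g, x_old, x_new, local_grads, p, k, rng):
    """One MARINA-style round: with probability p all nodes send full local
    gradients; otherwise each node sends only a compressed gradient difference,
    which is cheap to communicate and keeps the aggregate estimate g on track."""
    if rng.random() < p:
        return np.mean([gf(x_new) for gf in local_grads], axis=0)
    diffs = [rand_k(gf(x_new) - gf(x_old), k, rng) for gf in local_grads]
    return g + np.mean(diffs, axis=0)
```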
Related papers
- Riemannian Bilevel Optimization [35.42472057648458]
We focus in particular on batch and gradient-based methods, with the explicit goal of avoiding second-order information.
We propose and analyze $\mathrm{RF^2SA}$, a method that leverages first-order gradient information.
We provide explicit convergence rates for reaching $\epsilon$-stationary points under various setups.
arXiv Detail & Related papers (2024-05-22T20:49:01Z)
- Posterior Contraction Rates for Matérn Gaussian Processes on Riemannian Manifolds [51.68005047958965]
We show that intrinsic Gaussian processes can achieve better performance in practice.
Our work shows that finer-grained analyses are needed to distinguish between different levels of data-efficiency.
arXiv Detail & Related papers (2023-09-19T20:30:58Z)
- Curvature-Independent Last-Iterate Convergence for Games on Riemannian Manifolds [77.4346324549323]
We show that a step size agnostic to the curvature of the manifold achieves a curvature-independent and linear last-iterate convergence rate.
To the best of our knowledge, the possibility of curvature-independent rates and/or last-iterate convergence has not been considered before.
arXiv Detail & Related papers (2023-06-29T01:20:44Z)
- Faster Riemannian Newton-type Optimization by Subsampling and Cubic Regularization [3.867143522757309]
This work is on constrained large-scale non-convex optimization where the constraint set implies a manifold structure.
We propose a new second-order Riemannian optimization algorithm, aiming at improving convergence and reducing computational cost.
arXiv Detail & Related papers (2023-02-22T00:37:44Z)
- The Dynamics of Riemannian Robbins-Monro Algorithms [101.29301565229265]
We propose a family of Riemannian algorithms generalizing and extending the seminal stochastic approximation framework of Robbins and Monro.
Compared to their Euclidean counterparts, Riemannian algorithms are much less understood due to lack of a global linear structure on the manifold.
We provide a general template of almost sure convergence results that mirrors and extends the existing theory for Euclidean Robbins-Monro schemes.
arXiv Detail & Related papers (2022-06-14T12:30:11Z)
- First-Order Algorithms for Min-Max Optimization in Geodesic Metric Spaces [93.35384756718868]
Most min-max algorithms have been analyzed only in the Euclidean setting.
We prove that the Riemannian corrected extragradient (RCEG) method achieves last-iterate convergence at a linear rate.
arXiv Detail & Related papers (2022-06-04T18:53:44Z)
- Automatic differentiation for Riemannian optimization on low-rank matrix and tensor-train manifolds [71.94111815357064]
In scientific computing and machine learning applications, matrices and more general multidimensional arrays (tensors) can often be approximated with the help of low-rank decompositions.
One popular tool for finding such low-rank approximations is Riemannian optimization.
arXiv Detail & Related papers (2021-03-27T19:56:00Z)
- Kernel Distributionally Robust Optimization [17.909696462645023]
We propose kernel distributionally robust optimization (Kernel DRO) using insights from robust optimization theory and functional analysis.
Our method uses reproducing kernel Hilbert spaces (RKHS) to construct a wide range of convex ambiguity sets, which can be generalized to sets based on probability metrics and finite-order moment sets.
Using universal RKHSs, the theorem applies to a broad class of loss functions, lifting common limitations such as polynomial losses and knowledge of the Lipschitz constant.
arXiv Detail & Related papers (2020-06-12T07:46:59Z)
- Riemannian Stochastic Proximal Gradient Methods for Nonsmooth Optimization over the Stiefel Manifold [7.257751371276488]
R-ProxSGD and R-ProxSPB are generalizations of proximal SGD and proximal SpiderBoost.
The R-ProxSPB algorithm finds an $\epsilon$-stationary point with $O(\epsilon^{-3})$ IFOs in the online case, and $O(n+\sqrt{n}\epsilon^{-3})$ IFOs in the finite-sum case.
arXiv Detail & Related papers (2020-05-03T23:41:35Z)
- Distributed Averaging Methods for Randomized Second Order Optimization [54.51566432934556]
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a bottleneck.
We develop unbiased parameter averaging methods for randomized second order optimization that employ sampling and sketching of the Hessian.
We also extend the framework of second order averaging methods to introduce an unbiased distributed optimization framework for heterogeneous computing systems.
arXiv Detail & Related papers (2020-02-16T09:01:18Z)