Decentralized Riemannian Conjugate Gradient Method on the Stiefel
Manifold
- URL: http://arxiv.org/abs/2308.10547v3
- Date: Tue, 12 Mar 2024 10:58:02 GMT
- Title: Decentralized Riemannian Conjugate Gradient Method on the Stiefel
Manifold
- Authors: Jun Chen, Haishan Ye, Mengmeng Wang, Tianxin Huang, Guang Dai, Ivor
W.Tsang, Yong Liu
- Abstract summary: This paper presents a first-order conjugate gradient optimization method that converges faster than the steepest descent method.
It achieves global convergence over the Stiefel manifold.
- Score: 59.73080197971106
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The conjugate gradient method is a crucial first-order optimization method
that generally converges faster than the steepest descent method, and its
computational cost is much lower than that of second-order methods. However,
while various types of conjugate gradient methods have been studied in
Euclidean spaces and on Riemannian manifolds, there has been little study of such
methods in distributed scenarios. This paper proposes a decentralized Riemannian
conjugate gradient descent (DRCGD) method that aims at minimizing a global
function over the Stiefel manifold. The optimization problem is distributed
among a network of agents, where each agent is associated with a local
function, and the communication between agents occurs over an undirected
connected graph. Since the Stiefel manifold is a non-convex set, a global
function is represented as a finite sum of possibly non-convex (but smooth)
local functions. The proposed method is free from expensive Riemannian
geometric operations such as retractions, exponential maps, and vector
transports, thereby reducing the computational complexity required by each
agent. To the best of our knowledge, DRCGD is the first decentralized
Riemannian conjugate gradient algorithm to achieve global convergence over the
Stiefel manifold.
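As a rough illustration of the setting the abstract describes, the sketch below runs a generic decentralized Riemannian gradient loop on the Stiefel manifold: each agent mixes its neighbors' iterates through a doubly stochastic matrix, takes a local Riemannian gradient step, and maps back to the manifold. The QR-based projection and the mixing weights are illustrative stand-ins, not the authors' retraction-free DRCGD operators.

```python
import numpy as np

def stiefel_project_tangent(X, G):
    # Project a Euclidean gradient G onto the tangent space of the
    # Stiefel manifold St(n, p) at X: G - X * sym(X^T G).
    XtG = X.T @ G
    return G - X @ ((XtG + XtG.T) / 2)

def qr_retract(Y):
    # Map a full-rank matrix back onto the Stiefel manifold via a
    # QR decomposition (one common choice of retraction).
    Q, R = np.linalg.qr(Y)
    return Q * np.sign(np.diag(R))  # fix column signs for uniqueness

def decentralized_rgd(local_grads, X_init, W, steps=100, lr=0.1):
    # local_grads: one Euclidean gradient callable per agent.
    # W: doubly stochastic mixing matrix of the communication graph.
    X = [X_init.copy() for _ in local_grads]
    for _ in range(steps):
        # 1) consensus step: average neighbors' iterates.
        mixed = [sum(W[i][j] * X[j] for j in range(len(X)))
                 for i in range(len(X))]
        # 2) local Riemannian gradient step, then map back to St(n, p).
        X = [qr_retract(mixed[i] - lr * stiefel_project_tangent(X[i], g(X[i])))
             for i, g in enumerate(local_grads)]
    return X
```

Applied to a toy distributed eigenvector problem (each agent minimizing -tr(X^T A_i X)), every agent's iterate remains orthonormal by construction.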
Related papers
- FORML: A Riemannian Hessian-free Method for Meta-learning on Stiefel Manifolds [4.757859522106933]
This paper introduces a Hessian-free approach that uses a first-order approximation of derivatives on the Stiefel manifold.
Our method significantly reduces the computational load and memory footprint.
arXiv Detail & Related papers (2024-02-28T10:57:30Z)
- Riemannian stochastic optimization methods avoid strict saddle points [68.80251170757647]
We show that policies under study avoid strict saddle points / submanifolds with probability 1.
This result provides an important sanity check as it shows that, almost always, the limit state of an algorithm can only be a local minimizer.
arXiv Detail & Related papers (2023-11-04T11:12:24Z)
- Curvature-Independent Last-Iterate Convergence for Games on Riemannian Manifolds [77.4346324549323]
We show that a step size agnostic to the curvature of the manifold achieves a curvature-independent and linear last-iterate convergence rate.
To the best of our knowledge, the possibility of curvature-independent rates and/or last-iterate convergence has not been considered before.
arXiv Detail & Related papers (2023-06-29T01:20:44Z)
- Decentralized Riemannian natural gradient methods with Kronecker-product approximations [11.263837420265594]
We present an efficient decentralized natural gradient descent (DRNGD) method for solving decentralized manifold optimization problems.
By performing the communications over the Kronecker factors, a high-quality approximation of the Riemannian Fisher information matrix (RFIM) can be obtained at low cost.
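The cost saving behind communicating Kronecker factors can be illustrated with a small sketch (the shapes and factors below are illustrative, not the authors' RFIM construction): a large structured matrix is represented as kron(A, B), so agents only need to exchange the two small factors.

```python
import numpy as np

# Toy illustration of the Kronecker-factor idea: a (p*q) x (p*q)
# curvature matrix is approximated as kron(A, B), so agents only
# exchange the small factors A (p x p) and B (q x q).
p, q = 3, 4
rng = np.random.default_rng(1)
A = rng.standard_normal((p, p)); A = A @ A.T + np.eye(p)  # SPD factor
B = rng.standard_normal((q, q)); B = B @ B.T + np.eye(q)  # SPD factor

F = np.kron(A, B)               # full 12 x 12 matrix, never communicated
cost_full = F.size              # (p*q)^2 = 144 entries
cost_factors = A.size + B.size  # p^2 + q^2 = 25 entries

# Inverting kron(A, B) also reduces to inverting the small factors,
# since kron(A, B)^-1 == kron(A^-1, B^-1).
F_inv = np.kron(np.linalg.inv(A), np.linalg.inv(B))
```

The same identity is what makes natural-gradient preconditioning with Kronecker factors cheap: the expensive inverse never touches the full matrix.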
arXiv Detail & Related papers (2023-03-16T19:36:31Z)
- Decentralized Riemannian Algorithm for Nonconvex Minimax Problems [82.50374560598493]
Minimax algorithms have been developed to solve many machine learning problems.
In this paper, we propose two types of decentralized Riemannian minimax algorithms.
For the stochastic setting, we propose DRSGDA and establish its gradient complexity.
arXiv Detail & Related papers (2023-02-08T01:42:45Z)
- The Dynamics of Riemannian Robbins-Monro Algorithms [101.29301565229265]
We propose a family of Riemannian algorithms generalizing and extending the seminal approximation framework of Robbins and Monro.
Compared to their Euclidean counterparts, Riemannian algorithms are much less understood due to lack of a global linear structure on the manifold.
We provide a general template of almost sure convergence results that mirrors and extends the existing theory for Euclidean Robbins-Monro schemes.
arXiv Detail & Related papers (2022-06-14T12:30:11Z)
- Automatic differentiation for Riemannian optimization on low-rank matrix and tensor-train manifolds [71.94111815357064]
In scientific computing and machine learning applications, matrices and more general multidimensional arrays (tensors) can often be approximated with the help of low-rank decompositions.
One popular tool for finding such low-rank approximations is Riemannian optimization.
arXiv Detail & Related papers (2021-03-27T19:56:00Z)
- Decentralized Riemannian Gradient Descent on the Stiefel Manifold [39.750623187256735]
We consider a distributed non-convex optimization problem where a network of agents aims at minimizing a global function over the Stiefel manifold.
To achieve exact convergence with a constant step size, we also propose a decentralized Riemannian gradient tracking algorithm for the Stiefel manifold.
arXiv Detail & Related papers (2021-02-14T07:30:23Z)
- On the Local Linear Rate of Consensus on the Stiefel Manifold [39.750623187256735]
We study the convergence properties of Riemannian consensus on the Stiefel manifold (over an undirected connected graph).
The main challenges include (i) developing new techniques for the analysis, and (ii) identifying the conditions (e.g., a suitable step size) under which the iterates always stay in the local region.
arXiv Detail & Related papers (2021-01-22T21:52:38Z)
- Riemannian Stochastic Proximal Gradient Methods for Nonsmooth Optimization over the Stiefel Manifold [7.257751371276488]
R-ProxSGD and R-ProxSPB are generalizations of proximal SGD and proximal SpiderBoost.
The R-ProxSPB algorithm finds an $\epsilon$-stationary point with $O(\epsilon^{-3})$ IFOs in the online case, and $O(n+\sqrt{n}\epsilon^{-3})$ IFOs in the finite-sum case.
arXiv Detail & Related papers (2020-05-03T23:41:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.