Decentralized Riemannian natural gradient methods with Kronecker-product
approximations
- URL: http://arxiv.org/abs/2303.09611v1
- Date: Thu, 16 Mar 2023 19:36:31 GMT
- Title: Decentralized Riemannian natural gradient methods with Kronecker-product
approximations
- Authors: Jiang Hu, Kangkang Deng, Na Li, Quanzheng Li
- Abstract summary: We present an efficient decentralized Riemannian natural gradient descent (DRNGD) method for solving decentralized manifold optimization problems.
By performing the communications over the Kronecker factors, a high-quality approximation of the RFIM can be obtained at low cost.
- Score: 11.263837420265594
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With a computationally efficient approximation of the second-order
information, natural gradient methods have been successful in solving
large-scale structured optimization problems. We study the natural gradient
methods for the large-scale decentralized optimization problems on Riemannian
manifolds, where the local objective function defined by the local dataset is
of a log-probability type. By utilizing the structure of the Riemannian Fisher
information matrix (RFIM), we present an efficient decentralized Riemannian
natural gradient descent (DRNGD) method. To overcome the communication burden of
the high-dimensional RFIM, we consider a class of structured problems for which
the RFIM can be approximated by a Kronecker product of two low-dimensional
matrices. By performing the communications over the Kronecker factors, a
high-quality approximation of the RFIM can be obtained at low cost. We prove
that DRNGD converges to a stationary point with the best-known rate of
$\mathcal{O}(1/K)$. Numerical experiments demonstrate the efficiency of our
proposed method compared with state-of-the-art methods. To the best of our
knowledge, this is the first Riemannian second-order method for solving
decentralized manifold optimization problems.
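To make the Kronecker-factor idea concrete, below is a minimal NumPy sketch under stated assumptions: each agent forms KFAC-style factors A = E[aa^T] and G = E[gg^T] for a layer, agents average the small factors through a mixing matrix instead of exchanging the full RFIM, and the update exploits the identity (A ⊗ G)^{-1} vec(grad) = vec(G^{-1} grad A^{-1}). The function names, mixing protocol, and damping value are illustrative, not the paper's exact algorithm, and the Euclidean update shown omits the tangent-space projection and retraction the paper performs.

```python
import numpy as np

def kfac_factors(activations, grad_outputs):
    """KFAC-style Kronecker factors of the Fisher block for one layer.

    activations:  (batch, d_in)  layer inputs a
    grad_outputs: (batch, d_out) backpropagated gradients g
    The Fisher block is approximated by A ⊗ G with
    A = E[a a^T] and G = E[g g^T].
    """
    n = activations.shape[0]
    A = activations.T @ activations / n
    G = grad_outputs.T @ grad_outputs / n
    return A, G

def average_factors(local_factors, W):
    """One gossip round: mix each agent's (A, G) with its neighbors'.

    W is a doubly stochastic mixing matrix; communicating the
    (d_in x d_in) and (d_out x d_out) factors is far cheaper than
    communicating the full (d_in*d_out)^2 Fisher matrix.
    """
    As = np.stack([A for A, _ in local_factors])
    Gs = np.stack([G for _, G in local_factors])
    A_mix = np.einsum('ij,jkl->ikl', W, As)
    G_mix = np.einsum('ij,jkl->ikl', W, Gs)
    return list(zip(A_mix, G_mix))

def natural_gradient_step(W_param, grad, A, G, lr=0.1, damping=1e-3):
    """(A ⊗ G)^{-1} vec(grad) == vec(G^{-1} grad A^{-1}), so the update
    needs two small solves instead of one huge one."""
    d_out, d_in = grad.shape
    A_d = A + damping * np.eye(d_in)
    G_d = G + damping * np.eye(d_out)
    direction = np.linalg.solve(G_d, grad) @ np.linalg.inv(A_d)
    return W_param - lr * direction
```

The key saving is visible in `average_factors`: the network traffic scales with d_in^2 + d_out^2 rather than (d_in * d_out)^2.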
Related papers
- FORML: A Riemannian Hessian-free Method for Meta-learning on Stiefel Manifolds [4.757859522106933]
This paper introduces a Hessian-free approach that uses a first-order approximation of derivatives on the Stiefel manifold.
Our method significantly reduces the computational load and memory footprint.
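For context, a first-order step on the Stiefel manifold needs only a tangent-space projection and a retraction, which is what makes Hessian-free schemes cheap there. A minimal sketch with a QR retraction and a stand-in Euclidean gradient (not FORML's specific meta-learning update):

```python
import numpy as np

def stiefel_rgrad(X, egrad):
    """Project the Euclidean gradient onto the tangent space of the
    Stiefel manifold St(n, p) at X (where X^T X = I_p)."""
    sym = (X.T @ egrad + egrad.T @ X) / 2
    return egrad - X @ sym

def qr_retract(X, V):
    """Retract the tangent vector V back onto the manifold via QR."""
    Q, R = np.linalg.qr(X + V)
    return Q * np.sign(np.diag(R))  # fix column signs for uniqueness

X = np.linalg.qr(np.random.randn(8, 3))[0]   # random point on St(8, 3)
egrad = np.random.randn(8, 3)                 # stand-in Euclidean gradient
X_next = qr_retract(X, -0.1 * stiefel_rgrad(X, egrad))
```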
arXiv Detail & Related papers (2024-02-28T10:57:30Z)
- Decentralized Riemannian Conjugate Gradient Method on the Stiefel Manifold [59.73080197971106]
This paper presents a first-order conjugate optimization method that converges faster than the steepest descent method.
It aims to achieve global convergence over the Stiefel manifold.
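A sketch of the conjugate-direction recursion such methods use, with a Fletcher-Reeves coefficient and projection-based vector transport; the paper's decentralized consensus steps are omitted and all names are illustrative:

```python
import numpy as np

def project_tangent(X, V):
    """Project V onto the tangent space of the Stiefel manifold at X."""
    sym = (X.T @ V + V.T @ X) / 2
    return V - X @ sym

def cg_direction(rgrad_new, rgrad_old, d_old, X_new):
    """Fletcher-Reeves conjugate direction: combine the new Riemannian
    gradient with the previous direction transported to the new point."""
    beta = np.sum(rgrad_new**2) / max(np.sum(rgrad_old**2), 1e-12)
    return -rgrad_new + beta * project_tangent(X_new, d_old)
```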
arXiv Detail & Related papers (2023-08-21T08:02:16Z)
- Curvature-Independent Last-Iterate Convergence for Games on Riemannian Manifolds [77.4346324549323]
We show that a step size agnostic to the curvature of the manifold achieves a curvature-independent and linear last-iterate convergence rate.
To the best of our knowledge, the possibility of curvature-independent rates and/or last-iterate convergence has not been considered before.
arXiv Detail & Related papers (2023-06-29T01:20:44Z)
- DRSOM: A Dimension Reduced Second-Order Method [13.778619250890406]
Under a trust-region-like framework, our method preserves the convergence of the second-order method while using only information in a few directions.
Theoretically, we show that the method has local convergence and a global convergence rate of $O(\epsilon^{-3/2})$ for satisfying the first-order and second-order conditions.
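The dimension-reduction idea can be sketched as follows: restrict the quadratic model to the span of the current gradient and the previous step, so each iteration only needs a 2x2 reduced Hessian built from two Hessian-vector products. The finite-difference products and fixed regularizer below stand in for the paper's trust-region-like radius control:

```python
import numpy as np

def hvp(grad_fn, x, v, eps=1e-6):
    """Finite-difference Hessian-vector product (illustrative stand-in)."""
    return (grad_fn(x + eps * v) - grad_fn(x - eps * v)) / (2 * eps)

def drsom_step(grad_fn, x, d_prev, reg=1e-4):
    """Minimize the quadratic model over span{-gradient, previous step}."""
    g = grad_fn(x)
    D = np.stack([-g, d_prev], axis=1)                 # (n, 2) basis
    HD = np.stack([hvp(grad_fn, x, D[:, i]) for i in range(2)], axis=1)
    Q = D.T @ HD                                       # reduced Hessian
    Q = (Q + Q.T) / 2 + reg * np.eye(2)                # symmetrize + damp
    alpha = np.linalg.solve(Q, -(D.T @ g))             # 2x2 solve
    step = D @ alpha
    return x + step, step
```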
arXiv Detail & Related papers (2022-07-30T13:05:01Z)
- First-Order Algorithms for Min-Max Optimization in Geodesic Metric Spaces [93.35384756718868]
Min-max algorithms have been analyzed in the Euclidean setting.
We prove that the Riemannian corrected extragradient (RCEG) method achieves last-iterate convergence at a linear rate.
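For reference, the Euclidean extragradient template that the Riemannian corrected extragradient adapts; on a manifold these linear updates become exponential maps with a correction term, and the oracle names here are assumptions:

```python
def extragradient_step(x, y, grad_x, grad_y, eta):
    """One extragradient step for min_x max_y f(x, y): probe at a
    half-step, then update from the original point using the
    gradients evaluated at the probe."""
    x_half = x - eta * grad_x(x, y)
    y_half = y + eta * grad_y(x, y)
    x_new = x - eta * grad_x(x_half, y_half)
    y_new = y + eta * grad_y(x_half, y_half)
    return x_new, y_new
```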
arXiv Detail & Related papers (2022-06-04T18:53:44Z)
- Averaging on the Bures-Wasserstein manifold: dimension-free convergence of gradient descent [15.136397170510834]
We prove new geodesic convexity results which provide stronger control of the iterates, yielding a dimension-free convergence rate.
Our techniques also enable the analysis of two related notions of averaging, the entropically-regularized barycenter and the geometric median.
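A minimal sketch of averaging on the Bures-Wasserstein manifold of SPD matrices: the classical fixed-point update for the barycenter, which coincides with a unit-step Riemannian gradient step; the inputs and weights are illustrative:

```python
import numpy as np
from scipy.linalg import sqrtm

def bw_barycenter_step(S, sigmas, weights):
    """One fixed-point update for the Bures-Wasserstein barycenter:
    S <- S^{-1/2} (sum_i w_i (S^{1/2} Sigma_i S^{1/2})^{1/2})^2 S^{-1/2}."""
    S_half = np.real(sqrtm(S))
    S_half_inv = np.linalg.inv(S_half)
    T = sum(w * np.real(sqrtm(S_half @ Sig @ S_half))
            for w, Sig in zip(weights, sigmas))
    return S_half_inv @ T @ T @ S_half_inv
```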
arXiv Detail & Related papers (2021-06-16T01:05:19Z)
- An Online Method for A Class of Distributionally Robust Optimization with Non-Convex Objectives [54.29001037565384]
We propose a practical online method for solving a class of online distributionally robust optimization (DRO) problems.
Our studies demonstrate important applications in machine learning for improving the robustness of networks.
arXiv Detail & Related papers (2020-06-17T20:19:25Z)
- Effective Dimension Adaptive Sketching Methods for Faster Regularized Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching.
We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT).
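A sketch-and-solve version of the idea with a Gaussian embedding; the paper's contribution is choosing the sketch size adaptively via the effective dimension (an SRHT would replace the Gaussian matrix with a fast transform), so the fixed sketch size m below is an assumption:

```python
import numpy as np

def sketched_ridge(A, b, lam, m, seed=0):
    """Approximate min_x ||Ax - b||^2 + lam ||x||^2 by solving the
    sketched normal equations with a Gaussian embedding of m rows."""
    n, d = A.shape
    S = np.random.default_rng(seed).standard_normal((m, n)) / np.sqrt(m)
    SA, Sb = S @ A, S @ b
    return np.linalg.solve(SA.T @ SA + lam * np.eye(d), SA.T @ Sb)
```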
arXiv Detail & Related papers (2020-06-10T15:00:09Z)
- Riemannian Stochastic Proximal Gradient Methods for Nonsmooth Optimization over the Stiefel Manifold [7.257751371276488]
R-ProxSGD and R-ProxSPB are generalizations of proximal SGD and proximal SpiderBoost.
The R-ProxSPB algorithm finds an $\epsilon$-stationary point with $O(\epsilon^{-3})$ IFOs in the online case, and $O(n+\sqrt{n}\epsilon^{-3})$ IFOs in the finite-sum case.
arXiv Detail & Related papers (2020-05-03T23:41:35Z)
- Distributed Averaging Methods for Randomized Second Order Optimization [54.51566432934556]
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a bottleneck.
We develop unbiased parameter averaging methods for randomized second order optimization that employ sampling and sketching of the Hessian.
We also extend the framework of second order averaging methods to introduce an unbiased distributed optimization framework for heterogeneous computing systems.
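A stripped-down sketch of the averaging pattern: each worker builds a cheap local Hessian estimate (here, from a row-subsampled data block) and its own Newton direction, and the driver averages the directions. The debiasing weight that is the paper's main contribution is omitted, and all names are illustrative:

```python
import numpy as np

def local_newton_direction(A_i, grad, lam=1e-3):
    """Worker i's Newton direction from its local Hessian estimate
    H_i = A_i^T A_i (a row-subsampled / sketched Hessian)."""
    d = A_i.shape[1]
    H_i = A_i.T @ A_i + lam * np.eye(d)
    return np.linalg.solve(H_i, grad)

def averaged_newton_step(x, A_blocks, grad, lr=1.0):
    """Average the workers' directions; the paper's unbiasedness
    correction is omitted here for brevity."""
    dirs = [local_newton_direction(A_i, grad) for A_i in A_blocks]
    return x - lr * np.mean(dirs, axis=0)
```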
arXiv Detail & Related papers (2020-02-16T09:01:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.