Averaging on the Bures-Wasserstein manifold: dimension-free convergence
of gradient descent
- URL: http://arxiv.org/abs/2106.08502v3
- Date: Tue, 31 Oct 2023 01:57:07 GMT
- Title: Averaging on the Bures-Wasserstein manifold: dimension-free convergence
of gradient descent
- Authors: Jason M. Altschuler, Sinho Chewi, Patrik Gerber, Austin J. Stromme
- Abstract summary: We prove new geodesic convexity results which provide stronger control of the iterates, a free convergence.
Our techniques also enable the analysis of two related notions of averaging, the entropically-regularized barycenter and the geometric median.
- Score: 15.136397170510834
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study first-order optimization algorithms for computing the barycenter of
Gaussian distributions with respect to the optimal transport metric. Although
the objective is geodesically non-convex, Riemannian GD empirically converges
rapidly, in fact faster than off-the-shelf methods such as Euclidean GD and SDP
solvers. This stands in stark contrast to the best-known theoretical results
for Riemannian GD, which depend exponentially on the dimension. In this work,
we prove new geodesic convexity results which provide stronger control of the
iterates, yielding a dimension-free convergence rate. Our techniques also
enable the analysis of two related notions of averaging, the
entropically-regularized barycenter and the geometric median, providing the
first convergence guarantees for Riemannian GD for these problems.
Related papers
- On Barycenter Computation: Semi-Unbalanced Optimal Transport-based Method on Gaussians [24.473522267391072]
We develop algorithms on Bures-Wasserstein manifold, named the Exact Geodesic Gradient Descent and Hybrid Gradient Descent algorithms.
We establish theoretical convergence guarantees for both methods and demonstrate that the Exact Geodesic Gradient Descent algorithm attains a dimension-free convergence rate.
arXiv Detail & Related papers (2024-10-10T17:01:57Z) - Decentralized Riemannian Conjugate Gradient Method on the Stiefel
Manifold [59.73080197971106]
This paper presents a first-order conjugate optimization method that converges faster than the steepest descent method.
It aims to achieve global convergence over the Stiefel manifold.
arXiv Detail & Related papers (2023-08-21T08:02:16Z) - Curvature-Independent Last-Iterate Convergence for Games on Riemannian
Manifolds [77.4346324549323]
We show that a step size agnostic to the curvature of the manifold achieves a curvature-independent and linear last-iterate convergence rate.
To the best of our knowledge, the possibility of curvature-independent rates and/or last-iterate convergence has not been considered before.
arXiv Detail & Related papers (2023-06-29T01:20:44Z) - Decentralized Riemannian natural gradient methods with Kronecker-product
approximations [11.263837420265594]
We present an efficient decentralized natural gradient descent (DRNGD) method for solving decentralized manifold optimization problems.
By performing the communications over the Kronecker factors, a high-quality approximation of the RFIM can be obtained in a low cost.
arXiv Detail & Related papers (2023-03-16T19:36:31Z) - First-Order Algorithms for Min-Max Optimization in Geodesic Metric
Spaces [93.35384756718868]
min-max algorithms have been analyzed in the Euclidean setting.
We prove that the extraiteient (RCEG) method corrected lastrate convergence at a linear rate.
arXiv Detail & Related papers (2022-06-04T18:53:44Z) - Mean-Square Analysis with An Application to Optimal Dimension Dependence
of Langevin Monte Carlo [60.785586069299356]
This work provides a general framework for the non-asymotic analysis of sampling error in 2-Wasserstein distance.
Our theoretical analysis is further validated by numerical experiments.
arXiv Detail & Related papers (2021-09-08T18:00:05Z) - On the Local Linear Rate of Consensus on the Stiefel Manifold [39.750623187256735]
We restrict the convergence of properties of the Stfelian gradient over the Stiefel problem (for an un connected problem)
The main challenges include (i) developing a technical for analysis, and (ii) to identify the conditions (e.g., suitable step-size and under which the always stays in the local region) under which the always stays in the local region.
arXiv Detail & Related papers (2021-01-22T21:52:38Z) - Distributed Averaging Methods for Randomized Second Order Optimization [54.51566432934556]
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a bottleneck.
We develop unbiased parameter averaging methods for randomized second order optimization that employ sampling and sketching of the Hessian.
We also extend the framework of second order averaging methods to introduce an unbiased distributed optimization framework for heterogeneous computing systems.
arXiv Detail & Related papers (2020-02-16T09:01:18Z) - GradientDICE: Rethinking Generalized Offline Estimation of Stationary
Values [75.17074235764757]
We present GradientDICE for estimating the density ratio between the state distribution of the target policy and the sampling distribution.
GenDICE is the state-of-the-art for estimating such density ratios.
arXiv Detail & Related papers (2020-01-29T22:10:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.