Riemannian Federated Learning via Averaging Gradient Streams
- URL: http://arxiv.org/abs/2409.07223v2
- Date: Sun, 19 Oct 2025 09:41:08 GMT
- Title: Riemannian Federated Learning via Averaging Gradient Streams
- Authors: Zhenwei Huang, Wen Huang, Pratik Jawanpuria, Bamdev Mishra
- Abstract summary: Federated learning (FL) as a distributed learning paradigm has a significant advantage in addressing large-scale machine learning tasks. This paper presents and analyzes a new efficient server aggregation scheme, averaging gradient streams, in an algorithm called RFedAGS. We show that the proposed RFedAGS achieves global convergence with a sublinear convergence rate in the decaying step-size case.
- Score: 10.533809913888591
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning (FL) as a distributed learning paradigm has a significant advantage in addressing large-scale machine learning tasks. In the Euclidean setting, FL algorithms have been extensively studied with both theoretical and empirical success. However, few works investigate federated learning algorithms in the Riemannian setting. In particular, critical challenges such as partial participation and data heterogeneity among agents have not been explored in the Riemannian federated setting. This paper presents and analyzes a Riemannian FL algorithm, called RFedAGS, based on a new efficient server aggregation scheme, averaging gradient streams, which can simultaneously handle partial participation and data heterogeneity. We theoretically show that the proposed RFedAGS enjoys global convergence with a sublinear convergence rate in the decaying step-size case, and converges sublinearly/linearly to a neighborhood of a stationary point/solution in the fixed step-size case. These analyses rest on a vital and non-trivial assumption induced by partial participation, which is shown to hold with high probability. Extensive experiments conducted on synthetic and real-world data demonstrate the good performance of RFedAGS.
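To fix ideas, here is a minimal sketch of the kind of server aggregation the abstract describes: average the participating agents' gradients in the tangent space at the current iterate, then retract back to the manifold. The sphere, the function names, and the toy gradients are our assumptions for illustration, not the authors' RFedAGS code.

```python
# Minimal sketch of tangent-space gradient averaging on the unit sphere
# (illustrative; not the authors' RFedAGS implementation).
import numpy as np

def proj_tangent(x, v):
    """Project v onto the tangent space of the unit sphere at x."""
    return v - np.dot(x, v) * x

def retract(x, v):
    """Map a tangent vector back onto the sphere (metric projection)."""
    y = x + v
    return y / np.linalg.norm(y)

def server_round(x, agent_grads, step_size):
    """One server round: average tangent gradients, take a retraction step."""
    avg = sum(proj_tangent(x, g) for g in agent_grads) / len(agent_grads)
    return retract(x, -step_size * avg)

rng = np.random.default_rng(0)
x = np.array([1.0, 0.0, 0.0])
grads = [rng.normal(size=3) for _ in range(5)]  # stand-ins for local gradients
x_next = server_round(x, grads, step_size=0.1)
print(x_next, np.linalg.norm(x_next))           # iterate stays on the sphere
```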
Related papers
- Iso-Riemannian Optimization on Learned Data Manifolds [6.345340156849189]
We introduce a principled framework for optimization on learned data manifolds using iso-Riemannian geometry. We show that our approach yields interpretable barycentres, improved clustering, and provably efficient solutions to inverse problems. These results establish that optimization under iso-Riemannian geometry can overcome distortions inherent to learned manifold mappings.
arXiv Detail & Related papers (2025-10-23T22:34:55Z) - Decentralized Nonconvex Composite Federated Learning with Gradient Tracking and Momentum [78.27945336558987]
Decentralized federated learning (DFL) eliminates the reliance on a client-server architecture. Non-smooth regularization is often incorporated into machine learning tasks. We propose a novel DNCFL algorithm to solve these problems.
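To make the gradient-tracking ingredient concrete, here is a minimal Euclidean sketch on a ring of agents; it omits the composite (non-smooth) term, momentum, and all federated details, and every constant in it is illustrative rather than taken from the paper.

```python
# Gradient tracking on a ring of agents (illustrative, Euclidean, smooth).
import numpy as np

n, d, alpha, T = 4, 3, 0.1, 200
rng = np.random.default_rng(1)
targets = rng.normal(size=(n, d))            # agent i minimizes ||x - b_i||^2 / 2
grad = lambda i, x: x - targets[i]

W = np.zeros((n, n))
for i in range(n):                           # doubly stochastic ring weights
    W[i, i] = 0.5
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.25

x = np.zeros((n, d))
y = np.array([grad(i, x[i]) for i in range(n)])  # trackers start at local grads
for _ in range(T):
    x_new = W @ x - alpha * y                # consensus step + tracked descent
    y = W @ y + np.array([grad(i, x_new[i]) - grad(i, x[i]) for i in range(n)])
    x = x_new
print(x.mean(axis=0), targets.mean(axis=0))  # agents agree on the average target
```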
arXiv Detail & Related papers (2025-04-17T08:32:25Z) - A Sample Efficient Alternating Minimization-based Algorithm For Robust Phase Retrieval [56.67706781191521]
In this work, we study a robust phase retrieval problem where the task is to recover an unknown signal from measurements corrupted by outliers.
Our proposed oracle avoids the need for computationally expensive spectral methods, using a simple gradient step that is robust to outliers.
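As a hedged illustration of the kind of simple gradient step the summary refers to, here is a plain subgradient method on an l1-robust phase-retrieval loss f(x) = mean_i |(a_i^T x)^2 - y_i|; the loss choice, warm start, and step schedule are our assumptions, not the paper's oracle.

```python
# Plain subgradient steps on an l1-robust phase-retrieval loss (illustrative).
import numpy as np

rng = np.random.default_rng(2)
d, m = 5, 200
x_true = rng.normal(size=d)
A = rng.normal(size=(m, d))
y = (A @ x_true) ** 2
y[: m // 20] += 50.0                      # corrupt a few measurements (outliers)

def subgrad(x):
    """A subgradient of mean_i |(a_i^T x)^2 - y_i|."""
    r = (A @ x) ** 2 - y
    return (2.0 / m) * A.T @ (np.sign(r) * (A @ x))

x = x_true + 0.1 * rng.normal(size=d)     # warm start near the signal
for t in range(1, 500):
    x -= (0.5 / t) * subgrad(x)           # decaying step sizes
# distance to the signal, up to the inherent global sign ambiguity
print(min(np.linalg.norm(x - x_true), np.linalg.norm(x + x_true)))
```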
arXiv Detail & Related papers (2024-09-07T06:37:23Z) - Rethinking Clustered Federated Learning in NOMA Enhanced Wireless Networks [60.09912912343705]
This study explores the benefits of integrating the novel clustered federated learning (CFL) approach with non-independent and identically distributed (non-IID) datasets.
A detailed theoretical analysis of the generalization gap, which measures the degree of non-IID-ness in the data distribution, is presented.
Solutions to the challenges posed by non-IID conditions are proposed based on an analysis of their properties.
arXiv Detail & Related papers (2024-03-05T17:49:09Z) - A Hierarchical Federated Learning Approach for the Internet of Things [3.28418927821443]
We present QHetFed, a novel federated learning solution suitable for large-scale Internet of Things deployments. We show that QHetFed consistently achieves high learning accuracy and significantly outperforms other hierarchical algorithms.
arXiv Detail & Related papers (2024-03-03T15:40:24Z) - Riemannian stochastic optimization methods avoid strict saddle points [68.80251170757647]
We show that the policies under study avoid strict saddle points / submanifolds with probability 1.
This result provides an important sanity check as it shows that, almost always, the limit state of an algorithm can only be a local minimizer.
arXiv Detail & Related papers (2023-11-04T11:12:24Z) - Every Parameter Matters: Ensuring the Convergence of Federated Learning with Dynamic Heterogeneous Models Reduction [22.567754688492414]
Cross-device Federated Learning (FL) faces significant challenges where low-end clients that could potentially make unique contributions are excluded from training large models due to their resource bottlenecks.
Recent research efforts have focused on model-heterogeneous FL, by extracting reduced-size models from the global model and applying them to local clients accordingly.
This paper presents a unifying framework for heterogeneous FL algorithms with online model extraction and provides a general convergence analysis for the first time.
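A rough sketch of the model-extraction idea (the masking rule, capacities, and update rule below are invented for illustration and are not the paper's framework): each client trains a slice of the global parameter vector sized to its capacity, and the server averages updates coordinate-wise over the clients that trained them.

```python
# Toy online model extraction: capacity-sized parameter slices per client.
import numpy as np

rng = np.random.default_rng(3)
d = 10
global_w = rng.normal(size=d)
capacities = [1.0, 0.6, 0.3]              # fraction of parameters per client

masks = [np.arange(d) < int(c * d) for c in capacities]
updates, counts = np.zeros(d), np.zeros(d)
for mask in masks:
    local = global_w.copy()
    local[mask] -= 0.1 * local[mask]      # stand-in for a round of local training
    updates[mask] += local[mask] - global_w[mask]
    counts[mask] += 1

# average each coordinate over the clients that actually trained it
global_w += np.divide(updates, counts, out=np.zeros(d), where=counts > 0)
print(global_w)
```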
arXiv Detail & Related papers (2023-10-12T19:07:58Z) - Implicit regularization in AI meets generalized hardness of approximation in optimization -- Sharp results for diagonal linear networks [0.0]
We show sharp results for the implicit regularization imposed by the gradient flow of Diagonal Linear Networks.
We link this to the phenomenon of phase transitions in generalized hardness of approximation.
Non-sharpness of our results would imply that the GHA phenomenon would not occur for the basis pursuit optimization problem.
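The setting is easy to reproduce in a toy experiment. Below, gradient descent on a diagonal linear network w = u * v with small initialization approximately recovers a sparse solution of an underdetermined least-squares problem; the sizes and constants are our choices, purely to illustrate the implicit bias being quantified.

```python
# Gradient descent on a diagonal linear network w = u * v (illustrative).
import numpy as np

rng = np.random.default_rng(4)
n, d = 20, 40
w_star = np.zeros(d); w_star[:3] = [2.0, -1.5, 1.0]   # sparse ground truth
X = rng.normal(size=(n, d))
y = X @ w_star

u, v = np.full(d, 1e-3), np.full(d, 1e-3)             # small init drives sparsity
for _ in range(5000):
    g = X.T @ (X @ (u * v) - y) / n                   # gradient w.r.t. w = u * v
    u, v = u - 0.05 * g * v, v - 0.05 * g * u         # chain rule through u, v
print(np.round((u * v)[:6], 3))                       # close to the sparse w_star
```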
arXiv Detail & Related papers (2023-07-13T13:27:51Z) - Tackling Computational Heterogeneity in FL: A Few Theoretical Insights [68.8204255655161]
We introduce and analyse a novel aggregation framework that allows for formalizing and tackling computational heterogeneity.
The proposed aggregation algorithms are extensively analyzed from a theoretical and an experimental perspective.
arXiv Detail & Related papers (2023-07-12T16:28:21Z) - Curvature-Independent Last-Iterate Convergence for Games on Riemannian Manifolds [77.4346324549323]
We show that a step size agnostic to the curvature of the manifold achieves a curvature-independent and linear last-iterate convergence rate.
To the best of our knowledge, the possibility of curvature-independent rates and/or last-iterate convergence has not been considered before.
arXiv Detail & Related papers (2023-06-29T01:20:44Z) - Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z) - Faster Adaptive Federated Learning [84.38913517122619]
Federated learning has attracted increasing attention with the emergence of distributed data.
In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on a momentum-based variance reduction technique in cross-silo FL.
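For context, the momentum-based variance-reduction recursion this line of work builds on (a STORM-style estimator, shown here on a toy stochastic quadratic with constants of our choosing, not FAFED itself) is d_t = g(x_t) + (1 - a)(d_{t-1} - g(x_{t-1})), with both gradients evaluated on the same sample.

```python
# STORM-style momentum-based variance-reduced gradient estimator (illustrative).
import numpy as np

rng = np.random.default_rng(5)
dim, a, eta = 4, 0.3, 0.1
target = rng.normal(size=dim)
g = lambda x, noise: (x - target) + noise   # stochastic gradient oracle

x = np.zeros(dim)
d = g(x, rng.normal(scale=0.5, size=dim))
for _ in range(300):
    x_new = x - eta * d
    noise = rng.normal(scale=0.5, size=dim) # one shared sample at x_new and x
    d = g(x_new, noise) + (1 - a) * (d - g(x, noise))
    x = x_new
print(np.linalg.norm(x - target))           # hovers in a small noise ball
```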
arXiv Detail & Related papers (2022-12-02T05:07:50Z) - Pairwise Learning via Stagewise Training in Proximal Setting [0.0]
We combine adaptive sample size and importance sampling techniques for pairwise learning, with convergence guarantees for nonsmooth convex pairwise loss functions.
We demonstrate that sampling opposite instances at each iteration reduces the variance of the gradient, hence accelerating convergence.
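A toy version of opposite-instance sampling (our construction, not the paper's stagewise proximal scheme): each step draws one positive and one negative example and takes a subgradient step on the pairwise hinge loss.

```python
# Pairwise hinge loss with opposite-instance sampling (illustrative).
import numpy as np

rng = np.random.default_rng(6)
d, n = 5, 100
w_true = rng.normal(size=d)
Xp = rng.normal(size=(n, d)) + w_true       # positives shifted along w_true
Xn = rng.normal(size=(n, d)) - w_true       # negatives shifted the other way

w = np.zeros(d)
for t in range(2000):
    xp, xn = Xp[rng.integers(n)], Xn[rng.integers(n)]
    if 1.0 - w @ (xp - xn) > 0:             # active pairwise hinge: take a step
        w += 0.01 * (xp - xn)
print(np.mean((Xp @ w)[:, None] > (Xn @ w)[None, :]))  # AUC-like pair accuracy
```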
arXiv Detail & Related papers (2022-08-08T11:51:01Z) - Riemannian Natural Gradient Methods [21.14740680011498]
We introduce the notion of the Fisher information matrix in the manifold setting, leading to a Riemannian natural gradient method that can be viewed as a natural extension of the natural gradient method.
We establish the almost-sure global convergence of our proposed method under standard assumptions.
Numerical experiments on applications arising from machine learning demonstrate the advantages of the proposed method over state-of-the-art ones.
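Schematically, a natural-gradient step on a manifold preconditions the gradient by a (damped) Fisher matrix before retracting. The sketch below uses the unit sphere, a Gauss-Newton stand-in for the Fisher matrix, and an ambient-space solve followed by tangent projection; all of these are our simplifications, not the paper's method.

```python
# Damped natural-gradient-style step on the unit sphere (illustrative).
import numpy as np

rng = np.random.default_rng(7)
d, m = 4, 50
A = rng.normal(size=(m, d))
target = A @ np.array([0.0, 0.0, 0.0, 1.0])
x = np.array([1.0, 0.0, 0.0, 0.0])

for _ in range(100):
    egrad = A.T @ (A @ x - target) / m          # Euclidean gradient
    F = A.T @ A / m                             # Gauss-Newton/Fisher stand-in
    nat = np.linalg.solve(F + 0.1 * np.eye(d), egrad)
    v = nat - np.dot(x, nat) * x                # project onto tangent space at x
    x = x - 0.5 * v
    x /= np.linalg.norm(x)                      # retract back to the sphere
print(np.linalg.norm(A @ x - target))           # residual shrinks toward 0
```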
arXiv Detail & Related papers (2022-07-15T04:33:10Z) - First-Order Algorithms for Min-Max Optimization in Geodesic Metric Spaces [93.35384756718868]
Min-max algorithms have mostly been analyzed in the Euclidean setting.
We prove that the Riemannian corrected extragradient (RCEG) method achieves last-iterate convergence at a linear rate.
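A flat-space caricature of the extragradient idea (Euclidean, bilinear, our constants; not the Riemannian method itself): the extrapolation step is what turns the spiraling of plain gradient descent-ascent into last-iterate convergence.

```python
# Euclidean extragradient on the bilinear game min_x max_y x^T B y (illustrative).
import numpy as np

rng = np.random.default_rng(8)
B = rng.normal(size=(3, 3))
x, y, eta = np.ones(3), np.ones(3), 0.1

for _ in range(500):
    x_half = x - eta * (B @ y)                 # extrapolation ("look-ahead") step
    y_half = y + eta * (B.T @ x)
    x, y = x - eta * (B @ y_half), y + eta * (B.T @ x_half)
print(np.linalg.norm(x), np.linalg.norm(y))    # last iterate shrinks toward 0
```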
arXiv Detail & Related papers (2022-06-04T18:53:44Z) - On the Convergence of Stochastic Extragradient for Bilinear Games with Restarted Iteration Averaging [96.13485146617322]
We present an analysis of the stochastic ExtraGradient (SEG) method with constant step size, and present variations of the method that yield favorable convergence.
We prove that when augmented with averaging, SEG provably converges to the Nash equilibrium, and such a rate is provably accelerated by incorporating a scheduled restarting procedure.
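The mechanism is easy to demonstrate on a toy noisy bilinear game (our setup and constants): run SEG for an epoch, keep a running average of the iterates, and restart the next epoch from that average.

```python
# Stochastic extragradient with restarted iterate averaging (illustrative).
import numpy as np

rng = np.random.default_rng(9)

def field(z):                                  # vector field of min_x max_y x^T y
    x, y = z[:2], z[2:]
    return np.concatenate([y, -x])

z, eta = np.array([2.0, -2.0, 1.0, 1.0]), 0.2
for _ in range(5):                             # scheduled restarts
    avg = np.zeros(4)
    for t in range(1, 201):
        z_half = z - eta * (field(z) + rng.normal(scale=0.1, size=4))
        z = z - eta * (field(z_half) + rng.normal(scale=0.1, size=4))
        avg += (z - avg) / t                   # running average of the iterates
    z = avg                                    # restart from the average
print(np.linalg.norm(z))                       # contracts toward the equilibrium 0
```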
arXiv Detail & Related papers (2021-06-30T17:51:36Z) - Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are central in preventing overfitting empirically.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
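A minimal instance of the setting (sizes, step size, and the tail-averaging window are our choices): constant-stepsize SGD on an overparameterized linear regression drives the training error to (near) zero despite label noise.

```python
# Constant-stepsize SGD with tail averaging for linear regression (illustrative).
import numpy as np

rng = np.random.default_rng(10)
n, d = 50, 200                              # more parameters than samples
w_star = np.zeros(d); w_star[:5] = 1.0
X = rng.normal(size=(n, d))
y = X @ w_star + 0.1 * rng.normal(size=n)   # noisy labels

w, tail, T = np.zeros(d), np.zeros(d), 6000
for t in range(T):
    i = rng.integers(n)
    w -= 0.002 * (X[i] @ w - y[i]) * X[i]   # one-sample constant-stepsize step
    if t >= T // 2:
        tail += w / (T - T // 2)            # average the last half of the run
print(np.mean((X @ tail - y) ** 2))         # training error near zero (overfit)
```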
arXiv Detail & Related papers (2021-03-23T17:15:53Z) - On Distributed Non-convex Optimization: Projected Subgradient Method For Weakly Convex Problems in Networks [13.385373310554327]
The Moreau subgradient method converges linearly for sharp problems in machine learning.
A distributed implementation of the subgradient method with a theoretical guarantee is proposed.
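The scheme itself is compact; a hedged Euclidean sketch (a convex surrogate objective and a complete mixing graph, chosen by us for brevity) is: mix neighbor iterates with doubly stochastic weights, take a local subgradient step, then project onto the constraint set.

```python
# Distributed projected subgradient method over a network (illustrative).
import numpy as np

rng = np.random.default_rng(11)
n, d = 4, 3
targets = rng.normal(size=(n, d))           # agent i holds loss ||x - b_i||_1
W = np.full((n, n), 0.25)                   # doubly stochastic mixing weights

def proj_ball(x):
    """Project onto the unit Euclidean ball (the constraint set)."""
    nrm = np.linalg.norm(x)
    return x if nrm <= 1.0 else x / nrm

x = np.zeros((n, d))
for t in range(1, 500):
    mixed = W @ x                           # consensus/mixing step
    for i in range(n):
        g = np.sign(mixed[i] - targets[i])  # local subgradient
        x[i] = proj_ball(mixed[i] - (0.5 / t) * g)
print(x)                                    # rows nearly agree (consensus)
```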
arXiv Detail & Related papers (2020-04-28T01:01:49Z) - Self-Concordant Analysis of Frank-Wolfe Algorithms [3.3598755777055374]
In a number of applications, e.g. Poisson inverse problems or quantum state tomography, the loss is given by a self-concordant (SC) function having unbounded curvature.
We use the theory of SC functions to provide a new adaptive step size for FW methods and prove a global convergence rate of O(1/k) after k iterations.
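For comparison, here is a plain Frank-Wolfe loop on the probability simplex for a strongly convex quadratic; the exact line search below replaces the paper's self-concordant adaptive step size, so it only illustrates the projection-free template, not the SC analysis.

```python
# Frank-Wolfe on the probability simplex for a quadratic (illustrative).
import numpy as np

rng = np.random.default_rng(12)
d = 10
Q = rng.normal(size=(d, d)); Q = Q @ Q.T + np.eye(d)  # positive definite
b = rng.normal(size=d)
grad = lambda x: Q @ x - b                  # gradient of 0.5 x^T Q x - b^T x

x = np.full(d, 1.0 / d)                     # start at the simplex center
for k in range(200):
    g = grad(x)
    s = np.zeros(d); s[np.argmin(g)] = 1.0  # linear minimization oracle (vertex)
    dvec = s - x
    denom = dvec @ Q @ dvec
    gamma = 0.0 if denom == 0.0 else np.clip(-(g @ dvec) / denom, 0.0, 1.0)
    x += gamma * dvec                       # stays in the simplex by convexity
g = grad(x)
print(x.sum(), g @ (x - np.eye(d)[np.argmin(g)]))  # simplex check, small FW gap
```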
arXiv Detail & Related papers (2020-02-11T11:30:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site. This site does not guarantee the quality of the listed information and is not responsible for any consequences.