Communication-Efficient Federated Group Distributionally Robust Optimization
- URL: http://arxiv.org/abs/2410.06369v1
- Date: Tue, 8 Oct 2024 21:07:53 GMT
- Title: Communication-Efficient Federated Group Distributionally Robust Optimization
- Authors: Zhishuai Guo, Tianbao Yang
- Abstract summary: Federated learning faces challenges due to the heterogeneity in data volumes and distributions at different clients.
Existing approaches to address this issue based on group distributionally robust optimization (GDRO) often lead to high communication and sample complexity.
This work introduces algorithms tailored for communication-efficient Federated Group Distributionally Robust Optimization.
- Score: 49.14751984342068
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated learning faces challenges due to the heterogeneity in data volumes and distributions at different clients, which can compromise model generalization ability to various distributions. Existing approaches to address this issue based on group distributionally robust optimization (GDRO) often lead to high communication and sample complexity. To this end, this work introduces algorithms tailored for communication-efficient Federated Group Distributionally Robust Optimization (FGDRO). Our contributions are threefold: Firstly, we introduce the FGDRO-CVaR algorithm, which optimizes the average top-K losses while reducing communication complexity to $O(1/\epsilon^4)$, where $\epsilon$ denotes the desired precision level. Secondly, our FGDRO-KL algorithm is crafted to optimize KL regularized FGDRO, cutting communication complexity to $O(1/\epsilon^3)$. Lastly, we propose FGDRO-KL-Adam to utilize Adam-type local updates in FGDRO-KL, which not only maintains a communication cost of $O(1/\epsilon^3)$ but also shows potential to surpass SGD-type local steps in practical applications. The effectiveness of our algorithms has been demonstrated on a variety of real-world tasks, including natural language processing and computer vision.
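The two robust objectives described in the abstract can be made concrete with a short sketch. The Python snippet below is a minimal, single-machine illustration, not the authors' federated algorithm: `cvar_top_k_loss` computes the average of the K largest per-group losses (the quantity FGDRO-CVaR targets), and `kl_regularized_loss` computes a KL-regularized soft maximum of group losses (the quantity FGDRO-KL targets). The function names, the toy loss values, and the regularization weight `lam` are illustrative assumptions.

```python
import numpy as np

def cvar_top_k_loss(group_losses, k):
    """CVaR-style objective: average of the k largest per-group losses.

    Illustrative sketch of the quantity FGDRO-CVaR optimizes; the paper's
    federated algorithm drives its communication cost down to O(1/eps^4).
    """
    losses = np.asarray(group_losses, dtype=float)
    top_k = np.sort(losses)[-k:]          # k worst (largest) group losses
    return top_k.mean()

def kl_regularized_loss(group_losses, lam):
    """KL-regularized robust objective: lam * log(mean(exp(loss / lam))).

    A soft version of the worst-group loss: as lam -> 0 it approaches the
    maximum group loss, as lam -> infinity it approaches the plain average.
    Used here only to illustrate the kind of objective FGDRO-KL targets.
    """
    losses = np.asarray(group_losses, dtype=float)
    # log-sum-exp trick for numerical stability
    m = losses.max() / lam
    return lam * (np.log(np.mean(np.exp(losses / lam - m))) + m)

if __name__ == "__main__":
    losses = [0.2, 0.4, 1.5, 2.0, 0.3]     # hypothetical per-group losses
    print(cvar_top_k_loss(losses, k=2))    # average of the 2 worst groups -> 1.75
    print(kl_regularized_loss(losses, lam=0.5))
```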
Related papers
- FedDRO: Federated Compositional Optimization for Distributionally Robust Learning [11.70892315284039]
Large-scale and distributed availability of data demands the development of efficient federated learning (FL) algorithms.
We propose efficient FedAvg-type algorithms for solving nonlinear compositional problems in the FL setting; a generic sketch of this compositional structure follows this entry.
A key novelty of our work is to develop solution accuracy-independent algorithms that do not require large batch evaluations.
arXiv Detail & Related papers (2023-11-21T14:53:39Z)
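FedDRO concerns compositional objectives of the form F(x) = f(E[g(x; xi)]), where a plain stochastic gradient is biased because the inner expectation sits inside a nonlinear outer function. The sketch below is a generic single-machine illustration of that structure and of the standard fix of tracking the inner expectation with a running average, so no large inner batches are needed; it is not FedDRO's federated algorithm, and the functions f and g and the constants lr and beta are assumed for the toy example.

```python
import numpy as np

# Sketch of a stochastic compositional objective F(x) = f(E[g(x; xi)]).
# Single-machine illustration only, not FedDRO's federated algorithm.

rng = np.random.default_rng(0)

def g(x, xi):             # inner stochastic map, E_xi[g(x, xi)] = x + 1
    return x + 1.0 + xi   # xi is zero-mean noise

def grad_g(x, xi):        # d g / d x
    return 1.0

def f(y):                 # outer (deterministic) function
    return y ** 2

def grad_f(y):            # d f / d y
    return 2.0 * y

x, u = 5.0, 0.0           # u tracks a running estimate of E[g(x, xi)]
lr, beta = 0.05, 0.2      # step size and moving-average weight (assumed values)

for step in range(2000):
    xi = rng.normal(scale=1.0)
    u = (1.0 - beta) * u + beta * g(x, xi)   # moving average of the inner value
    grad = grad_g(x, xi) * grad_f(u)         # chain rule using the tracked inner value
    x -= lr * grad

print(x)  # approaches the minimizer of (x + 1)^2, i.e. x ~ -1
```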
- Scalable Distributional Robustness in a Class of Non Convex Optimization with Guarantees [7.541571634887807]
Distributionally robust optimization (DRO) has shown robustness in learning as well as sample-based problems.
We propose a mixed-integer second-order cone program (MISOCP) formulation, which does not scale well enough to solve problems with real-world datasets.
arXiv Detail & Related papers (2022-05-31T09:07:01Z)
- DRFLM: Distributionally Robust Federated Learning with Inter-client Noise via Local Mixup [58.894901088797376]
Federated learning has emerged as a promising approach for training a global model using data from multiple organizations without leaking their raw data.
We propose a general framework to solve the above two challenges simultaneously.
We provide comprehensive theoretical analysis covering robustness, convergence, and generalization (a generic local-mixup sketch follows this entry).
arXiv Detail & Related papers (2022-04-16T08:08:29Z)
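DRFLM's title refers to local mixup. As a generic illustration (not DRFLM's exact procedure), mixup replaces a batch with convex combinations of shuffled example pairs; applying it locally means each client mixes only its own data before its local updates. The Beta parameter `alpha` and the toy batch shapes below are assumptions.

```python
import numpy as np

def local_mixup(x, y, alpha=0.2, rng=None):
    """Generic mixup on a single client's batch (a sketch, not DRFLM's recipe).

    x: (batch, features) inputs, y: (batch, num_classes) one-hot labels.
    Returns convex combinations of shuffled pairs with lambda ~ Beta(alpha, alpha).
    """
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(x))
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y + (1.0 - lam) * y[perm]
    return x_mix, y_mix

# Example: mix a toy local batch before a client's local update step.
rng = np.random.default_rng(0)
x = rng.random((8, 4))                          # hypothetical client batch
y = np.eye(3)[rng.integers(0, 3, 8)]            # one-hot labels
x_mix, y_mix = local_mixup(x, y, alpha=0.2, rng=rng)
```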
- Heterogeneous Federated Learning via Grouped Sequential-to-Parallel Training [60.892342868936865]
Federated learning (FL) is a rapidly growing privacy-preserving collaborative machine learning paradigm.
We propose a data-heterogeneity-robust FL approach, FedGSP, to address this challenge.
We show that FedGSP improves the accuracy by 3.7% on average compared with seven state-of-the-art approaches.
arXiv Detail & Related papers (2022-01-31T03:15:28Z)
- On the Convergence of Heterogeneous Federated Learning with Arbitrary Adaptive Online Model Pruning [15.300983585090794]
We present a unifying framework for heterogeneous FL algorithms with arbitrary adaptive online model pruning.
In particular, we prove that under certain sufficient conditions, these algorithms converge to a stationary point of standard FL for general smooth cost functions.
We illuminate two key factors impacting convergence: pruning-induced noise and minimum coverage index.
arXiv Detail & Related papers (2022-01-27T20:43:38Z)
- Federated Distributionally Robust Optimization for Phase Configuration of RISs [106.4688072667105]
We study the problem of robust reconfigurable intelligent surface (RIS)-aided downlink communication over heterogeneous RIS types in a supervised learning setting.
By modeling downlink communication over heterogeneous RIS designs as different workers that learn how to optimize phase configurations in a distributed manner, we solve this distributed learning problem.
Our proposed algorithm requires fewer communication rounds to achieve the same worst-case distribution test accuracy compared to competitive baselines.
arXiv Detail & Related papers (2021-08-20T07:07:45Z)
- A Stochastic Alternating Balance $k$-Means Algorithm for Fair Clustering [0.0]
In the application of data clustering to human-centric decision-making systems, such as loan applications and advertisement, the clustering outcome might discriminate against people across different demographic groups.
We propose a novel stochastic alternating balance mini-batch $k$-means (SAKM) algorithm, which alternates between $k$-means updates and group swap updates, as sketched after this entry.
arXiv Detail & Related papers (2021-05-29T01:47:15Z)
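The entry above says the algorithm alternates $k$-means updates with group swap updates. The sketch below illustrates one plausible reading of that alternation on toy data: a standard assignment/recenter step, followed by greedy swaps of cluster assignments between points from different demographic groups whenever a swap reduces group imbalance across clusters. The swap criterion, data, and parameters are assumptions made for illustration; this is not the paper's SAKM algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans_step(X, centers):
    """Standard k-means step: assign points to the nearest center, then recenter."""
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    assign = d.argmin(axis=1)
    for c in range(len(centers)):
        if (assign == c).any():
            centers[c] = X[assign == c].mean(axis=0)
    return assign, centers

def group_imbalance(assign, groups, k):
    """Total deviation of each cluster's group counts from perfect balance."""
    imbalance = 0.0
    for c in range(k):
        counts = np.bincount(groups[assign == c], minlength=groups.max() + 1)
        imbalance += np.abs(counts - counts.mean()).sum()
    return imbalance

def swap_step(assign, groups, k, n_trials=50):
    """Greedy 'group swap' sketch: try random pairwise swaps and keep those
    that reduce group imbalance across clusters (assumed semantics)."""
    for _ in range(n_trials):
        i, j = rng.integers(len(assign), size=2)
        if assign[i] == assign[j] or groups[i] == groups[j]:
            continue
        before = group_imbalance(assign, groups, k)
        assign[i], assign[j] = assign[j], assign[i]
        if group_imbalance(assign, groups, k) >= before:
            assign[i], assign[j] = assign[j], assign[i]  # undo if not improved
    return assign

# Toy data: 2-D points with a binary demographic group label.
X = rng.normal(size=(200, 2))
groups = rng.integers(0, 2, size=200)
k = 3
centers = X[rng.choice(len(X), k, replace=False)].copy()

for _ in range(10):                      # alternate the two kinds of updates
    assign, centers = kmeans_step(X, centers)
    assign = swap_step(assign, groups, k)
```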
- Faster Non-Convex Federated Learning via Global and Local Momentum [57.52663209739171]
FedGLOMO is the first (first-order) FL algorithm that applies momentum in both the global (server-side) and local (client-side) updates.
Our algorithm is provably optimal even with compressed communication between the clients and the server.
arXiv Detail & Related papers (2020-12-07T21:05:31Z)
- FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity to Non-IID Data [59.50904660420082]
Federated Learning (FL) has become a popular paradigm for learning from distributed data.
To effectively utilize data on different devices without moving it to the cloud, algorithms such as Federated Averaging (FedAvg) have adopted a "computation then aggregation" (CTA) model, sketched below.
arXiv Detail & Related papers (2020-05-22T23:07:42Z)
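The "computation then aggregation" (CTA) pattern mentioned in the FedPD entry is easy to see in a minimal FedAvg-style loop: each client runs several local gradient steps (computation), then the server averages the resulting models (aggregation). The snippet below simulates that pattern on a toy least-squares problem; it is not FedPD's algorithm, and the data sizes, step size, and number of local steps are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: each client holds its own least-squares problem (non-IID data).
num_clients, dim, local_steps, rounds, lr = 4, 5, 10, 50, 0.05
A = [rng.normal(size=(20, dim)) for _ in range(num_clients)]
b = [Ai @ rng.normal(size=dim) + 0.1 * rng.normal(size=20) for Ai in A]

w_global = np.zeros(dim)
for _ in range(rounds):
    local_models = []
    for Ai, bi in zip(A, b):
        w = w_global.copy()
        for _ in range(local_steps):             # "computation": local gradient steps
            grad = Ai.T @ (Ai @ w - bi) / len(bi)
            w -= lr * grad
        local_models.append(w)
    w_global = np.mean(local_models, axis=0)     # "then aggregation": server average

print(np.round(w_global, 3))
```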