Communication-Efficient Federated Group Distributionally Robust Optimization
- URL: http://arxiv.org/abs/2410.06369v2
- Date: Mon, 11 Nov 2024 21:42:53 GMT
- Title: Communication-Efficient Federated Group Distributionally Robust Optimization
- Authors: Zhishuai Guo, Tianbao Yang
- Abstract summary: Federated learning faces challenges due to the heterogeneity in data volumes and distributions at different clients.
Existing approaches to address this issue based on group distributionally robust optimization (GDRO) often lead to high communication and sample complexity.
This work introduces algorithms tailored for communication-efficient Federated Group Distributionally Robust Optimization.
- Score: 49.14751984342068
- License:
- Abstract: Federated learning faces challenges due to the heterogeneity in data volumes and distributions at different clients, which can compromise model generalization ability to various distributions. Existing approaches to address this issue based on group distributionally robust optimization (GDRO) often lead to high communication and sample complexity. To this end, this work introduces algorithms tailored for communication-efficient Federated Group Distributionally Robust Optimization (FGDRO). Our contributions are threefold: Firstly, we introduce the FGDRO-CVaR algorithm, which optimizes the average top-K losses while reducing communication complexity to $O(1/\epsilon^4)$, where $\epsilon$ denotes the desired precision level. Secondly, our FGDRO-KL algorithm is crafted to optimize KL regularized FGDRO, cutting communication complexity to $O(1/\epsilon^3)$. Lastly, we propose FGDRO-KL-Adam to utilize Adam-type local updates in FGDRO-KL, which not only maintains a communication cost of $O(1/\epsilon^3)$ but also shows potential to surpass SGD-type local steps in practical applications. The effectiveness of our algorithms has been demonstrated on a variety of real-world tasks, including natural language processing and computer vision.
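The objectives named in the abstract can be made concrete with a short sketch. The Python snippet below is illustrative only, not the authors' implementation: the per-group loss values, the choice of K, and the regularization weight lam are hypothetical stand-ins. It computes the average top-K group loss that FGDRO-CVaR targets and the KL-regularized (log-sum-exp) robust loss that FGDRO-KL targets.

```python
# Illustrative sketch (not the authors' implementation) of the two robust
# objectives the abstract names. In the federated setting the per-group losses
# would be per-client empirical risks; the values below are hypothetical.
import numpy as np

def cvar_top_k_loss(group_losses: np.ndarray, k: int) -> float:
    """Average of the K largest per-group losses (the FGDRO-CVaR objective)."""
    return float(np.sort(group_losses)[-k:].mean())

def kl_regularized_loss(group_losses: np.ndarray, lam: float) -> float:
    """KL-regularized robust loss: lam * log(mean(exp(loss / lam)))."""
    shift = group_losses.max()                    # subtract the max for numerical stability
    return float(lam * np.log(np.mean(np.exp((group_losses - shift) / lam))) + shift)

losses = np.array([0.9, 0.4, 1.3, 0.7, 2.1])      # hypothetical per-client losses
print(cvar_top_k_loss(losses, k=2))               # average of the 2 worst groups
print(kl_regularized_loss(losses, lam=0.5))       # smooth soft-max over group losses
```

As lam decreases, the KL-regularized loss approaches the worst-group loss, and as it grows it approaches the plain average, which is one way to read the role of the KL regularization in FGDRO-KL.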
Related papers
- FedDRO: Federated Compositional Optimization for Distributionally Robust
Learning [11.70892315284039]
Large-scale and distributed availability of data demands the development of efficient federated learning gradient algorithms.
We propose efficient FedAvg-type algorithms for solving non-linear compositional optimization problems in the FL setting.
A key novelty of our work is to develop solution accuracy-independent algorithms that do not require large batch evaluations.
arXiv Detail & Related papers (2023-11-21T14:53:39Z) - Stochastic Approximation Approaches to Group Distributionally Robust Optimization and Beyond [89.72693227960274]
This paper investigates group distributionally robust optimization (GDRO) with the goal of learning a model that performs well over $m$ different distributions.
To reduce the number of samples needed in each round from $m$ to 1, we cast GDRO as a two-player game, where one player conducts stochastic mirror descent and the other executes an online algorithm for non-oblivious multi-armed bandits (a minimal sketch of this reduction follows below).
In the second scenario, we propose to optimize the average top-$k$ risk instead of the maximum risk, thereby reducing sensitivity to the single worst-case distribution.
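A minimal sketch of that two-player reduction, under illustrative assumptions (the synthetic sample_group_loss, the step size eta, and the exploration rate gamma are hypothetical, and the EXP3-style uniform mixing is added here for numerical safety rather than taken from the paper):

```python
# Illustrative sketch (not the paper's algorithm): GDRO as a two-player game.
# The weight player runs multiplicative-weights / exponentiated-gradient updates
# over the m groups; only one group is sampled per round, and its loss is
# importance-weighted as in non-oblivious multi-armed bandits.
import numpy as np

rng = np.random.default_rng(0)
m, eta, gamma, rounds = 5, 0.1, 0.1, 1000
w = np.ones(m) / m                                # distribution over the m groups

def sample_group_loss(i: int) -> float:
    """Hypothetical stand-in for the model's stochastic loss on group i."""
    return rng.normal(loc=0.5 + 0.1 * i, scale=0.05)

for _ in range(rounds):
    p = (1 - gamma) * w + gamma / m               # mix in uniform exploration (EXP3-style)
    i = rng.choice(m, p=p)                        # bandit feedback: one group per round
    est = np.zeros(m)
    est[i] = sample_group_loss(i) / p[i]          # importance-weighted loss estimate
    w = w * np.exp(eta * est)                     # exponentiated-gradient ascent on weights
    w /= w.sum()
    # (the model player would simultaneously take a descent step on the sampled group)

print(w)                                          # mass concentrates on the hardest groups
```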
arXiv Detail & Related papers (2023-02-18T09:24:15Z) - Scalable Distributional Robustness in a Class of Non Convex Optimization
with Guarantees [7.541571634887807]
Distributionally robust optimization (DRO) has shown robustness in learning as well as sample based problems.
We show that the problem can be formulated as a mixed-integer second-order cone program (MISOCP), which, however, does not scale well enough to solve problems with real-world data sets.
arXiv Detail & Related papers (2022-05-31T09:07:01Z) - Heterogeneous Federated Learning via Grouped Sequential-to-Parallel
Training [60.892342868936865]
Federated learning (FL) is a rapidly growing privacy-preserving collaborative machine learning paradigm.
We propose a data heterogeneous-robust FL approach, FedGSP, to address this challenge.
We show that FedGSP improves the accuracy by 3.7% on average compared with seven state-of-the-art approaches.
arXiv Detail & Related papers (2022-01-31T03:15:28Z) - On the Convergence of Heterogeneous Federated Learning with Arbitrary
Adaptive Online Model Pruning [15.300983585090794]
We present a unifying framework for heterogeneous FL algorithms with arbitrary adaptive online model pruning.
In particular, we prove that under certain sufficient conditions, these algorithms converge to a stationary point of standard FL for general smooth cost functions.
We illuminate two key factors impacting convergence: pruning-induced noise and minimum coverage index.
arXiv Detail & Related papers (2022-01-27T20:43:38Z) - Federated Distributionally Robust Optimization for Phase Configuration
of RISs [106.4688072667105]
We study the problem of robust reconfigurable intelligent surface (RIS)-aided downlink communication over heterogeneous RIS types in a supervised learning setting.
By modeling downlink communication over heterogeneous RIS designs as different workers that learn how to optimize phase configurations in a distributed manner, we solve this distributed learning problem.
Our proposed algorithm requires fewer communication rounds to achieve the same worst-case distribution test accuracy compared to competitive baselines.
arXiv Detail & Related papers (2021-08-20T07:07:45Z) - A Stochastic Alternating Balance $k$-Means Algorithm for Fair Clustering [0.0]
In the application of data clustering to human-centric decision-making systems, such as loan applications and advertisement, the clustering outcome might discriminate against people across different demographic groups.
We propose a novel stochastic alternating balance mini-batch $k$-means (SAKM) algorithm, which consists of $k$-means updates and group swap updates.
arXiv Detail & Related papers (2021-05-29T01:47:15Z) - Faster Non-Convex Federated Learning via Global and Local Momentum [57.52663209739171]
FedGLOMO is the first (first-order) FL algorithm that applies momentum both globally (at the server) and locally (at the clients).
Our algorithm is provably optimal even with compressed communication between the clients and the server.
arXiv Detail & Related papers (2020-12-07T21:05:31Z) - FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity
to Non-IID Data [59.50904660420082]
Federated Learning (FL) has become a popular paradigm for learning from distributed data.
To effectively utilize data at different devices without moving them to the cloud, algorithms such as Federated Averaging (FedAvg) have adopted a "computation then aggregation" (CTA) model (a minimal sketch of this pattern appears after this list).
arXiv Detail & Related papers (2020-05-22T23:07:42Z)
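As a concrete illustration of the CTA pattern mentioned in the FedPD entry above, the following sketch (not taken from any of the listed papers; the synthetic least-squares clients and all hyperparameters are placeholders) runs several local gradient steps per client and then averages the client models at the server:

```python
# Minimal sketch of the "computation then aggregation" (CTA) pattern behind FedAvg:
# each client runs a few local gradient steps on its own data, then the server
# averages the resulting models. The least-squares clients below are synthetic.
import numpy as np

rng = np.random.default_rng(0)
num_clients, local_steps, lr, rounds, d = 4, 5, 0.1, 20, 3
global_w = np.zeros(d)
client_data = [(rng.normal(size=(20, d)), rng.normal(size=20)) for _ in range(num_clients)]

for _ in range(rounds):
    local_models = []
    for X, y in client_data:                      # computation: local updates on-device
        w = global_w.copy()
        for _ in range(local_steps):
            grad = X.T @ (X @ w - y) / len(y)     # least-squares gradient on local data
            w -= lr * grad
        local_models.append(w)
    global_w = np.mean(local_models, axis=0)      # aggregation: server averages the models

print(global_w)
```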