Federated Minimax Optimization: Improved Convergence Analyses and
Algorithms
- URL: http://arxiv.org/abs/2203.04850v1
- Date: Wed, 9 Mar 2022 16:21:31 GMT
- Title: Federated Minimax Optimization: Improved Convergence Analyses and
Algorithms
- Authors: Pranay Sharma, Rohan Panda, Gauri Joshi and Pramod K. Varshney
- Abstract summary: We consider nonconvex minimax optimization, which is gaining prominence in many modern machine learning applications such as GANs.
We provide a novel and tighter analysis, which improves the convergence and communication guarantees in the existing literature.
- Score: 32.062312674333775
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we consider nonconvex minimax optimization, which is gaining
prominence in many modern machine learning applications such as GANs.
Large-scale edge-based collection of training data in these applications calls
for communication-efficient distributed optimization algorithms, such as those
used in federated learning, to process the data. In this paper, we analyze
Local stochastic gradient descent ascent (SGDA), the local-update version of
the SGDA algorithm. SGDA is the core algorithm used in minimax optimization,
but it is not well-understood in a distributed setting. We prove that Local
SGDA has \textit{order-optimal} sample complexity for several classes of
nonconvex-concave and nonconvex-nonconcave minimax problems, and also enjoys
\textit{linear speedup} with respect to the number of clients. We provide a
novel and tighter analysis, which improves the convergence and communication
guarantees in the existing literature. For nonconvex-PL and
nonconvex-one-point-concave functions, we improve the existing complexity
results for centralized minimax problems. Furthermore, we propose a
momentum-based local-update algorithm, which has the same convergence
guarantees, but outperforms Local SGDA as demonstrated in our experiments.
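To make the local-update structure concrete, below is a minimal sketch of the Local SGDA template described in the abstract: each client runs several local stochastic gradient descent-ascent steps on its own objective, and the server periodically averages the client iterates. The gradient oracles, step sizes, and synchronization period are illustrative assumptions, not the authors' code; the momentum-based variant would additionally carry momentum buffers through the local updates.

```python
import numpy as np

def local_sgda(clients_grad_x, clients_grad_y, x0, y0,
               rounds=100, tau=10, eta_x=0.01, eta_y=0.01):
    """Sketch of Local SGDA (hypothetical toy interface).

    clients_grad_x[i](x, y) and clients_grad_y[i](x, y) return stochastic
    gradients of client i's local objective w.r.t. x and y, respectively.
    """
    n = len(clients_grad_x)
    x, y = x0.copy(), y0.copy()
    for _ in range(rounds):                  # communication rounds
        xs = [x.copy() for _ in range(n)]    # broadcast current iterates
        ys = [y.copy() for _ in range(n)]
        for i in range(n):                   # clients update in parallel
            for _ in range(tau):             # tau local steps, no communication
                gx = clients_grad_x[i](xs[i], ys[i])
                gy = clients_grad_y[i](xs[i], ys[i])
                xs[i] -= eta_x * gx          # descent on the min variable
                ys[i] += eta_y * gy          # ascent on the max variable
        x = sum(xs) / n                      # server averages client iterates
        y = sum(ys) / n
    return x, y
```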
Related papers
- Fast Decentralized Gradient Tracking for Federated Minimax Optimization with Local Updates [5.269633789700637]
K-GT-Minimax's ability to handle data heterogeneity underscores its significance in advancing federated learning research and applications.
arXiv Detail & Related papers (2024-05-07T17:25:56Z) - Can Decentralized Stochastic Minimax Optimization Algorithms Converge
Linearly for Finite-Sum Nonconvex-Nonconcave Problems? [56.62372517641597]
Decentralized minimax optimization has been actively studied in the past few years due to its applications in a wide range of machine learning tasks.
This paper develops two novel decentralized minimax optimization algorithms for the finite-sum nonconvex-nonconcave problem.
arXiv Detail & Related papers (2023-04-24T02:19:39Z) - Near-Optimal Decentralized Momentum Method for Nonconvex-PL Minimax
Problems [39.197569803430646]
Minimax optimization plays an important role in many machine learning tasks such as generative adversarial networks (GANs) and adversarial training.
Although recently a wide variety of optimization methods have been proposed to solve the minimax problems, most of them ignore the distributed setting.
arXiv Detail & Related papers (2023-04-21T11:38:41Z) - Decentralized Riemannian Algorithm for Nonconvex Minimax Problems [82.50374560598493]
Minimax algorithms on Riemannian manifolds have been developed to solve many problems arising in training neural networks.
In this paper, we propose two types of decentralized Riemannian minimax algorithms.
For the stochastic setting, we propose DRSGDA and prove a gradient-complexity guarantee for our method.
arXiv Detail & Related papers (2023-02-08T01:42:45Z) - Adaptive Federated Minimax Optimization with Lower Complexities [82.51223883622552]
We propose an efficient adaptive federated minimax optimization algorithm (i.e., AdaFGDA) to solve these minimax problems.
It builds on momentum-based variance-reduced and local-SGD techniques, and it flexibly incorporates various adaptive learning rates.
arXiv Detail & Related papers (2022-11-14T12:32:18Z) - A Communication-efficient Algorithm with Linear Convergence for
Federated Minimax Learning [1.713291434132985]
We study a large-scale multi-agent minimax optimization problem, which models Generative Adversarial Networks (GANs).
The overall objective is a sum of agents' private local objective functions.
We show that FedGDA-GT converges linearly with a constant stepsize to a global $\epsilon$-GDA solution.
arXiv Detail & Related papers (2022-06-02T16:31:16Z) - Escaping Saddle Points with Bias-Variance Reduced Local Perturbed SGD
for Communication Efficient Nonconvex Distributed Learning [58.79085525115987]
Local methods are one of the promising approaches to reduce communication time.
We show that the communication complexity of local methods is better than that of non-local methods when the heterogeneity of the local datasets is smaller than the smoothness of the local loss.
arXiv Detail & Related papers (2022-02-12T15:12:17Z) - The Minimax Complexity of Distributed Optimization [0.0]
I present the "graph oracle model", an extension of the classic oracle framework that can be applied to distributed optimization.
I focus on the specific case of the "intermittent communication setting".
I analyze the theoretical properties of the popular Local Stochastic Gradient Descent (Local SGD) algorithm in the convex setting.
arXiv Detail & Related papers (2021-09-01T15:18:33Z) - Local AdaGrad-Type Algorithm for Stochastic Convex-Concave Minimax
Problems [80.46370778277186]
Large scale convex-concave minimax problems arise in numerous applications, including game theory, robust training, and training of generative adversarial networks.
We develop a communication-efficient distributed extragradient algorithm, LocalAdaSEG, with an adaptive learning rate suitable for solving convex-concave minimax problems in the Parameter-Server model.
We demonstrate its efficacy through several experiments in both the homogeneous and heterogeneous settings.
arXiv Detail & Related papers (2021-06-18T09:42:05Z) - Second-Order Guarantees in Centralized, Federated and Decentralized
Nonconvex Optimization [64.26238893241322]
Simple algorithms have been shown to lead to good empirical results in many contexts.
Several works have pursued rigorous analytical justification for this success in nonconvex optimization problems.
A key insight in these analyses is that perturbations play a critical role in allowing local descent algorithms to efficiently escape saddle points.
arXiv Detail & Related papers (2020-03-31T16:54:22Z)