Solving a Class of Non-Convex Minimax Optimization in Federated Learning
- URL: http://arxiv.org/abs/2310.03613v1
- Date: Thu, 5 Oct 2023 15:48:41 GMT
- Title: Solving a Class of Non-Convex Minimax Optimization in Federated Learning
- Authors: Xidong Wu, Jianhui Sun, Zhengmian Hu, Aidong Zhang, Heng Huang
- Abstract summary: The minimax problems arise throughout machine learning applications, ranging from adversarial training and policy evaluation in reinforcement learning to AUROC maximization.
We propose FL algorithms (FedSGDA+ and FedSGDA-M) for a class of federated nonconvex minimax problems; for nonconvex-concave problems, FedSGDA+ reduces the communication complexity to $O(\varepsilon^{-6})$.
We prove that FedSGDA-M has the best-known sample complexity of $O(\kappa^{3} N^{-1}\varepsilon^{-3})$ and the best-known communication complexity of $O(\kappa^{2}\varepsilon^{-2})$.
- Score: 84.98927714326908
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The minimax problems arise throughout machine learning applications, ranging
from adversarial training and policy evaluation in reinforcement learning to
AUROC maximization. To address the large-scale data challenges across multiple
clients with communication-efficient distributed training, federated learning
(FL) is gaining popularity. Many optimization algorithms for minimax problems
have been developed in the centralized setting (\emph{i.e.} single-machine).
Nonetheless, the algorithm for minimax problems under FL is still
underexplored. In this paper, we study a class of federated nonconvex minimax
optimization problems. We propose FL algorithms (FedSGDA+ and FedSGDA-M) and
reduce existing complexity results for the most common minimax problems. For
nonconvex-concave problems, we propose FedSGDA+ and reduce the communication
complexity to $O(\varepsilon^{-6})$. Under nonconvex-strongly-concave and
nonconvex-PL minimax settings, we prove that FedSGDA-M has the best-known
sample complexity of $O(\kappa^{3} N^{-1}\varepsilon^{-3})$ and the best-known
communication complexity of $O(\kappa^{2}\varepsilon^{-2})$. FedSGDA-M is the
first algorithm to match the best sample complexity $O(\varepsilon^{-3})$
achieved by the single-machine method under the nonconvex-strongly-concave
setting. Extensive experimental results on fair classification and AUROC
maximization show the efficiency of our algorithms.
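As a rough illustration of the federated stochastic gradient descent ascent (SGDA) template that FedSGDA+ and FedSGDA-M build on, the sketch below runs local descent steps on the primal variable and ascent steps on the dual variable at each client, then averages at the server. It is a minimal toy under stated assumptions, not the authors' implementation: the quadratic client objectives, step sizes, client count, and number of local steps are all placeholders chosen for illustration.

```python
# Minimal sketch of one federated SGDA scheme (FedSGDA+-style local updates +
# server averaging) on toy client objectives
#   f_i(x, y) = 0.5*||x - a_i||^2 - 0.5*||y - b_i||^2 + <x, y>.
# Objectives, step sizes, and counts are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(0)
N, d, local_steps = 4, 5, 3          # clients, dimension, local updates (assumed)
eta_x, eta_y = 0.05, 0.05            # primal / dual step sizes (assumed)

A = rng.normal(size=(N, d))          # per-client data defining f_i
B = rng.normal(size=(N, d))

def grad_x(i, x, y):                 # gradient of f_i in x (minibatch noise omitted)
    return (x - A[i]) + y

def grad_y(i, x, y):                 # gradient of f_i in y
    return -(y - B[i]) + x

x, y = np.zeros(d), np.zeros(d)      # server primal and dual iterates
for rnd in range(50):                # communication rounds
    xs, ys = [], []
    for i in range(N):               # each client starts from the server point
        xi, yi = x.copy(), y.copy()
        for _ in range(local_steps): # local descent on x, ascent on y
            xi = xi - eta_x * grad_x(i, xi, yi)
            yi = yi + eta_y * grad_y(i, xi, yi)
        xs.append(xi)
        ys.append(yi)
    x, y = np.mean(xs, axis=0), np.mean(ys, axis=0)   # server averaging

print("final primal point:", np.round(x, 3))
```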
Related papers
- Efficiently Learning One-Hidden-Layer ReLU Networks via Schur
Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon>0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z) - Variance-reduced accelerated methods for decentralized stochastic
double-regularized nonconvex strongly-concave minimax problems [7.5573375809946395]
We consider a network of $m$ computing agents that collaborate via peer-to-peer communications.
Our algorithmic framework introduces a Lagrangian multiplier to eliminate the consensus constraint on the dual variable.
To the best of our knowledge, this is the first work which provides convergence guarantees for NCSC minimax problems with general nonsmooth regularizers applied to both the primal and dual variables.
arXiv Detail & Related papers (2023-07-14T01:32:16Z) - PRECISION: Decentralized Constrained Min-Max Learning with Low
Communication and Sample Complexities [25.153506493249854]
We present an adaptive multi-agent learning technique for decentralized constrained min-max optimization problems.
We also propose an algorithm called PRECISION that enjoys a reduced number of iterations.
arXiv Detail & Related papers (2023-03-05T00:26:10Z) - Decentralized Riemannian Algorithm for Nonconvex Minimax Problems [82.50374560598493]
Minimax algorithms have been developed to solve many machine learning problems, including the training of neural networks.
In this paper, we propose two types of decentralized Riemannian minimax algorithms.
For the stochastic setting, we propose DRSGDA and prove that our method achieves a gradient complexity guarantee.
arXiv Detail & Related papers (2023-02-08T01:42:45Z) - Decentralized Stochastic Gradient Descent Ascent for Finite-Sum Minimax Problems [26.676582181833584]
Minimax problems have attracted significant attention in recent years due to their widespread application in numerous machine learning models.
We develop a novel decentralized stochastic gradient descent ascent method for finite-sum minimax problems.
Our work is the first to achieve such theoretical complexities for this kind of minimax problem.
arXiv Detail & Related papers (2022-12-06T03:25:44Z) - Adaptive Federated Minimax Optimization with Lower Complexities [82.51223883622552]
We propose an efficient adaptive federated minimax optimization algorithm (i.e., AdaFGDA) to solve these distributed minimax problems.
It builds on momentum-based variance-reduced and local-SGD techniques, and it flexibly incorporates various adaptive learning rates.
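For intuition on the momentum-based variance-reduced estimator mentioned above (the same ingredient FedSGDA-M builds on), here is a minimal STORM-style recursion on a toy problem. The objective, noise model, and hyperparameters are assumptions chosen for illustration, not taken from either paper.

```python
# Sketch of a momentum-based variance-reduced (STORM-style) gradient estimator:
#   d_{t+1} = g(x_{t+1}; xi) + (1 - beta) * (d_t - g(x_t; xi)),
# where the SAME stochastic sample xi is evaluated at both points.
# Toy objective f(x) = 0.5*||x||^2 with additive sample noise (assumed).
import numpy as np

rng = np.random.default_rng(1)

def stoch_grad(x, xi):
    # noisy gradient of f(x) = 0.5*||x||^2; xi plays the role of the sample
    return x + 0.5 * xi

x = np.ones(10)
eta, beta = 0.1, 0.9                                  # step size, momentum (assumed)
d = stoch_grad(x, rng.normal(size=x.shape))           # initialize with a plain gradient
for t in range(200):
    x_new = x - eta * d
    xi = rng.normal(size=x.shape)                     # one sample shared by both evaluations
    d = stoch_grad(x_new, xi) + (1 - beta) * (d - stoch_grad(x, xi))
    x = x_new

print("||x|| after 200 steps:", round(float(np.linalg.norm(x)), 4))
```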
arXiv Detail & Related papers (2022-11-14T12:32:18Z) - Gradient Free Minimax Optimization: Variance Reduction and Faster
Convergence [120.9336529957224]
In this paper, we consider the gradient-free (zeroth-order) setting for nonconvex-strongly-concave minimax optimization problems.
We show that a novel zeroth-order variance-reduced descent ascent algorithm achieves the best-known query complexity.
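To make the zeroth-order terminology concrete, the sketch below implements a standard two-point gradient estimator that queries only function values. The test function, smoothing radius, and number of random directions are illustrative assumptions rather than the paper's settings.

```python
# Two-point (zeroth-order) gradient estimate: average of
#   (f(x + mu*u) - f(x - mu*u)) / (2*mu) * u  over random Gaussian directions u.
# Only function evaluations are used; no gradients are queried.
import numpy as np

rng = np.random.default_rng(2)

def f(x):
    return 0.5 * np.dot(x, x)                # toy objective with known gradient x

def zo_grad(f, x, mu=1e-4, num_dirs=20):
    g = np.zeros_like(x)
    for _ in range(num_dirs):
        u = rng.normal(size=x.shape)
        g += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return g / num_dirs

x = rng.normal(size=5)
print("true gradient   :", np.round(x, 3))
print("zeroth-order est:", np.round(zo_grad(f, x), 3))
```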
arXiv Detail & Related papers (2020-06-16T17:55:46Z) - Nonconvex Zeroth-Order Stochastic ADMM Methods with Lower Function Query
Complexity [109.54166127479093]
Zeroth-order (a.k.a. derivative-free) methods are a class of effective optimization methods for solving machine learning problems.
In this paper, we propose a class of faster zeroth-order stochastic alternating direction method of multipliers (ADMM) methods to solve nonconvex finite-sum problems.
We show that the proposed methods achieve a lower function query complexity for finding an $\epsilon$-stationary point.
At the same time, we propose a class of faster zeroth-order online ADMM methods.
arXiv Detail & Related papers (2019-07-30T02:21:43Z)