Decentralized Riemannian Algorithm for Nonconvex Minimax Problems
- URL: http://arxiv.org/abs/2302.03825v2
- Date: Thu, 9 Feb 2023 03:08:23 GMT
- Title: Decentralized Riemannian Algorithm for Nonconvex Minimax Problems
- Authors: Xidong Wu, Zhengmian Hu and Heng Huang
- Abstract summary: Minimax optimization over Riemannian manifolds has been applied to problems such as robust dimensionality reduction and training neural networks with orthogonal weights.
In this paper, we propose both deterministic and stochastic decentralized Riemannian minimax algorithms over the Stiefel manifold.
For the deterministic setting we propose DRGDA with gradient complexity $O(\epsilon^{-2})$; for the stochastic setting we propose DRSGDA with gradient complexity $O(\epsilon^{-4})$.
- Score: 82.50374560598493
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The minimax optimization over Riemannian manifolds (possibly nonconvex
constraints) has been actively applied to solve many problems, such as robust
dimensionality reduction and deep neural networks with orthogonal weights
(Stiefel manifold). Although many optimization algorithms for minimax problems
have been developed in the Euclidean setting, it is difficult to convert them
into Riemannian cases, and algorithms for nonconvex minimax problems with
nonconvex constraints are even rare. On the other hand, to address the big data
challenges, decentralized (serverless) training techniques have recently been
emerging since they can reduce communications overhead and avoid the bottleneck
problem on the server node. Nonetheless, the algorithm for decentralized
Riemannian minimax problems has not been studied. In this paper, we study the
distributed nonconvex-strongly-concave minimax optimization problem over the
Stiefel manifold and propose both deterministic and stochastic minimax methods.
The local objective is nonconvex-strongly-concave, and the Stiefel manifold is a
non-convex set. The global function is represented as the finite sum of local
functions. For the deterministic setting, we propose DRGDA and prove that our
deterministic method achieves a gradient complexity of $O( \epsilon^{-2})$
under mild conditions. For the stochastic setting, we propose DRSGDA and prove
that our stochastic method achieves a gradient complexity of
$O(\epsilon^{-4})$. DRGDA and DRSGDA are the first algorithms for
distributed minimax optimization with nonconvex constraints that attain exact
convergence. Extensive experimental results on deep neural network (DNN)
training over the Stiefel manifold demonstrate the efficiency of our
algorithms.
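As a reading aid, the problem described in the abstract can be written as the following template; the dimensions $d$, $r$, $m$ and the symbol $f_i$ for the local losses are notational assumptions, since the abstract does not fix them.

```latex
% Problem template implied by the abstract: n agents jointly solve a
% nonconvex-strongly-concave minimax problem whose primal variable is
% constrained to the Stiefel manifold (the notation below is assumed).
\min_{X \in \mathrm{St}(d,r)} \; \max_{y \in \mathbb{R}^m} \;
  f(X, y) \;=\; \frac{1}{n} \sum_{i=1}^{n} f_i(X, y),
\qquad
\mathrm{St}(d,r) \;=\; \{\, X \in \mathbb{R}^{d \times r} : X^{\top} X = I_r \,\}.
```

A minimal sketch of one round of a decentralized Riemannian gradient descent ascent scheme in this spirit is shown below. It is not the authors' DRGDA/DRSGDA implementation; the helper names (`qr_retraction`, `tangent_projection`, `drgda_round`), the mixing matrix `W`, and the default step sizes are illustrative assumptions.

```python
# Hedged sketch of one decentralized Riemannian gradient descent ascent round,
# in the spirit of the DRGDA/DRSGDA methods described above (not the authors'
# code; all names and defaults here are assumptions).
import numpy as np

def qr_retraction(X):
    """Retract X onto the Stiefel manifold St(d, r) via thin QR (signs fixed)."""
    Q, R = np.linalg.qr(X)
    signs = np.sign(np.sign(np.diag(R)) + 0.5)  # map zero diagonal entries to +1
    return Q * signs

def tangent_projection(X, G):
    """Project a Euclidean gradient G onto the tangent space of St(d, r) at X."""
    XtG = X.T @ G
    return G - X @ (XtG + XtG.T) / 2.0

def drgda_round(X_list, y_list, local_grads, W, eta_x=1e-2, eta_y=1e-1):
    """One communication round at every agent: gossip averaging, a Riemannian
    descent step in X, and a Euclidean ascent step in the concave variable y."""
    n = len(X_list)
    new_X, new_y = [], []
    for i in range(n):
        # Consensus (gossip) step with a doubly stochastic mixing matrix W.
        X_mix = qr_retraction(sum(W[i, j] * X_list[j] for j in range(n)))
        y_mix = sum(W[i, j] * y_list[j] for j in range(n))
        # Local (possibly stochastic) Euclidean gradients of f_i at the mixed point.
        Gx, Gy = local_grads[i](X_mix, y_mix)
        # Descent in X: project to the tangent space, step, then retract.
        xi = tangent_projection(X_mix, Gx)
        new_X.append(qr_retraction(X_mix - eta_x * xi))
        # Ascent in the strongly concave variable y.
        new_y.append(y_mix + eta_y * Gy)
    return new_X, new_y
```

In such a scheme `W` would typically encode the communication graph (e.g., Metropolis weights), the QR factorization acts as the retraction back onto $\mathrm{St}(d,r)$, and the separate step sizes for $X$ and $y$ reflect the descent-ascent structure of the nonconvex-strongly-concave objective.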
Related papers
- Two-Timescale Gradient Descent Ascent Algorithms for Nonconvex Minimax Optimization [77.3396841985172]
We provide a unified analysis of two-timescale gradient descent ascent (TTGDA) for solving structured nonconvex minimax optimization problems.
Our contribution is to design TTGDA algorithms that are effective beyond the convex-concave setting.
arXiv Detail & Related papers (2024-08-21T20:14:54Z) - Solving a Class of Non-Convex Minimax Optimization in Federated Learning [84.98927714326908]
Minimax problems arise throughout machine learning applications, ranging from model training to large-scale learning.
We propose a class of algorithms for non-convex minimax problems that reduce the complexity to $O(\varepsilon^{-6})$.
We prove that FedSGDA-M has the best-known sample complexity of $O(\kappa^{2}\varepsilon^{-3})$ and the best-known communication complexity of $O(\kappa^{2}\varepsilon^{-3})$.
arXiv Detail & Related papers (2023-10-05T15:48:41Z) - Can Decentralized Stochastic Minimax Optimization Algorithms Converge
Linearly for Finite-Sum Nonconvex-Nonconcave Problems? [56.62372517641597]
Decentralized minimax optimization has been actively studied in the past few years due to its applications in a wide range of machine learning tasks.
This paper develops two novel decentralized minimax optimization algorithms for the finite-sum nonconvex-nonconcave problem.
arXiv Detail & Related papers (2023-04-24T02:19:39Z) - Near-Optimal Decentralized Momentum Method for Nonconvex-PL Minimax
Problems [39.197569803430646]
Minimax optimization plays an important role in many machine learning tasks such as generative adversarial networks (GANs) and adversarial training.
Although recently a wide variety of optimization methods have been proposed to solve the minimax problems, most of them ignore the distributed setting.
arXiv Detail & Related papers (2023-04-21T11:38:41Z) - Decentralized Weakly Convex Optimization Over the Stiefel Manifold [28.427697270742947]
We focus on weakly convex optimization over the Stiefel manifold in the decentralized setting, where a connected network of agents cooperatively minimizes the sum of their local loss functions.
We propose a method called the decentralized Riemannian subgradient method (DRSM) for forcing a natural stationarity measure below a prescribed tolerance.
arXiv Detail & Related papers (2023-03-31T02:56:23Z) - Adaptive Federated Minimax Optimization with Lower Complexities [82.51223883622552]
We propose an efficient adaptive minimax optimization algorithm (i.e., AdaFGDA) to solve these minimax problems.
It builds on momentum-based variance-reduced and local-SGD techniques, and it flexibly incorporates various adaptive learning rates.
arXiv Detail & Related papers (2022-11-14T12:32:18Z) - On Riemannian Gradient-Based Methods for Minimax Problems [24.199289678553896]
We propose a class of Riemannian gradient-based methods to solve minimax problems.
We prove that our RGDA has a sample complexity of $\tilde{O}(\kappa^4\epsilon^{-3})$.
We also show that our Acc-RSGDA achieves a lower sample complexity of $\tilde{O}(\kappa^4\epsilon^{-3})$.
arXiv Detail & Related papers (2020-10-13T00:54:00Z) - Gradient Free Minimax Optimization: Variance Reduction and Faster
Convergence [120.9336529957224]
In this paper, we consider the nonconvex-strongly-concave setting of the gradient-free (zeroth-order) minimax optimization problem.
We show that a novel zeroth-order variance reduced descent algorithm achieves the best known query complexity.
arXiv Detail & Related papers (2020-06-16T17:55:46Z)