On Riemannian Gradient-Based Methods for Minimax Problems
- URL: http://arxiv.org/abs/2010.06097v4
- Date: Mon, 25 Apr 2022 21:33:03 GMT
- Title: On Riemannian Gradient-Based Methods for Minimax Problems
- Authors: Feihu Huang, Shangqian Gao
- Abstract summary: We propose a class of Riemannian gradient-based methods to solve minimax problems.
We prove that our RGDA has a sample complexity of $O(\kappa^2\epsilon^{-2})$.
We also show that our Acc-RSGDA achieves a lower sample complexity of $\tilde{O}(\kappa^{4}\epsilon^{-3})$.
- Score: 24.199289678553896
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In the paper, we study a class of useful minimax optimization problems on
Riemannian manifolds and propose a class of Riemannian gradient-based methods to
solve these minimax problems. Specifically, we propose a Riemannian gradient
descent ascent (RGDA) algorithm for the deterministic minimax optimization.
Moreover, we prove that our RGDA has a sample complexity of
$O(\kappa^2\epsilon^{-2})$ for finding an $\epsilon$-stationary point of the
nonconvex strongly-concave minimax problems, where $\kappa$ denotes the
condition number. At the same time, we introduce a Riemannian stochastic
gradient descent ascent (RSGDA) algorithm for the stochastic minimax
optimization. In the theoretical analysis, we prove that our RSGDA can achieve
a sample complexity of $O(\kappa^4\epsilon^{-4})$. To further reduce the sample
complexity, we propose an accelerated Riemannian stochastic gradient descent
ascent (Acc-RSGDA) algorithm based on the variance-reduced technique. We prove
that our Acc-RSGDA algorithm achieves a lower sample complexity of
$\tilde{O}(\kappa^{4}\epsilon^{-3})$. Extensive experimental results on the
robust distributional optimization and Deep Neural Networks (DNNs) training
over the Stiefel manifold demonstrate the efficiency of our algorithms.
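As a rough illustration of the updates described above, the sketch below implements one RGDA-style step on the Stiefel manifold: project the Euclidean gradient onto the tangent space, take a descent step, retract back to the manifold via QR, and take a plain gradient ascent step in the concave variable. The toy objective, step sizes, and the QR retraction are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of one Riemannian gradient descent
# ascent (RGDA) step for min_{X on Stiefel} max_{y in R^n} f(X, y).
import numpy as np

def stiefel_proj(X, G):
    """Riemannian gradient on the Stiefel manifold {X : X^T X = I}:
    project the Euclidean gradient G onto the tangent space at X."""
    sym = 0.5 * (X.T @ G + G.T @ X)
    return G - X @ sym

def qr_retraction(Z):
    """Retract an updated point Z back onto the Stiefel manifold via a
    sign-corrected QR decomposition."""
    Q, R = np.linalg.qr(Z)
    return Q * np.sign(np.sign(np.diag(R)) + 0.5)

def rgda_step(X, y, grad_x, grad_y, eta_x=1e-2, eta_y=1e-1):
    """One deterministic RGDA step: Riemannian descent in X, Euclidean
    ascent in y (the objective is assumed strongly concave in y)."""
    gx = stiefel_proj(X, grad_x(X, y))      # Riemannian gradient w.r.t. X
    X_new = qr_retraction(X - eta_x * gx)   # stay on the manifold
    y_new = y + eta_y * grad_y(X, y)        # gradient ascent in y
    return X_new, y_new

# Toy problem: f(X, y) = y^T A x_1 - 0.5 * ||y||^2, x_1 = first column of X.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 8, 3
    A = rng.standard_normal((n, n))
    X = np.linalg.qr(rng.standard_normal((n, p)))[0]
    y = np.zeros(n)
    grad_x = lambda X, y: np.outer(A.T @ y, np.eye(p)[0])   # d f / d X
    grad_y = lambda X, y: A @ X[:, 0] - y                   # d f / d y
    for _ in range(200):
        X, y = rgda_step(X, y, grad_x, grad_y)
    print(np.allclose(X.T @ X, np.eye(p)))   # True: iterates stay on the manifold
```

In the stochastic variants (RSGDA and Acc-RSGDA), the exact gradients above would be replaced by mini-batch or variance-reduced stochastic estimates.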
Related papers
- Solving a Class of Non-Convex Minimax Optimization in Federated Learning [84.98927714326908]
Minimax problems arise throughout machine learning applications, ranging from model training to large-scale learning.
We propose a class of algorithms for nonconvex minimax problems that reduces the sample complexity to $O(\varepsilon^{-6})$.
We prove that FedSGDA-M has the best-known sample complexity of $O(\kappa^{2}\epsilon^{-3})$ and the best-known communication complexity of $O(\kappa^{2}\epsilon^{-3})$.
arXiv Detail & Related papers (2023-10-05T15:48:41Z) - Decentralized Riemannian Algorithm for Nonconvex Minimax Problems [82.50374560598493]
Minimax algorithms have been developed to solve many machine learning problems, such as training neural networks.
In this paper, we propose two types of decentralized Riemannian minimax algorithms.
For the stochastic setting, we propose DRSGDA and establish its gradient complexity.
arXiv Detail & Related papers (2023-02-08T01:42:45Z) - Adaptive Federated Minimax Optimization with Lower Complexities [82.51223883622552]
We propose an efficient adaptive minimax optimization algorithm (i.e., AdaFGDA) to solve these minimax problems.
It builds on momentum-based variance-reduced and local-SGD techniques, and it flexibly incorporates various adaptive learning rates.
arXiv Detail & Related papers (2022-11-14T12:32:18Z) - Accelerated Proximal Alternating Gradient-Descent-Ascent for Nonconvex
Minimax Machine Learning [12.069630105460766]
Alternating gradient descent ascent (AltGDA) is an optimization algorithm that has been widely used for model training in various machine learning applications.
In this paper, we develop a single-loop fast proximal alternating gradient descent ascent algorithm to solve nonconvex minimax optimization problems.
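For readers unfamiliar with the term, the sketch below shows plain alternating GDA: the ascent step reuses the freshly updated descent variable, which is what distinguishes it from simultaneous GDA. The proximal and acceleration terms of the cited paper are omitted; the toy problem and step sizes are illustrative assumptions.

```python
# Illustrative sketch of plain alternating GDA (AltGDA); the accelerated
# proximal variant in the cited paper adds terms not reproduced here.
import numpy as np

def altgda(grad_x, grad_y, x, y, eta_x=0.05, eta_y=0.05, iters=200):
    """Plain alternating GDA: the ascent step sees the freshly updated x."""
    for _ in range(iters):
        x = x - eta_x * grad_x(x, y)   # descent step on x
        y = y + eta_y * grad_y(x, y)   # ascent step uses the new x
    return x, y

# Toy saddle problem f(x, y) = x^T y - 0.5 * ||y||^2 (strongly concave in y);
# the unique saddle point is x = y = 0.
if __name__ == "__main__":
    d = 5
    gx = lambda x, y: y        # d f / d x
    gy = lambda x, y: x - y    # d f / d y
    x, y = altgda(gx, gy, np.ones(d), np.zeros(d))
    print(np.linalg.norm(x), np.linalg.norm(y))   # both shrink toward 0
```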
arXiv Detail & Related papers (2021-12-22T04:33:27Z) - AdaGDA: Faster Adaptive Gradient Descent Ascent Methods for Minimax
Optimization [104.96004056928474]
We propose a class of faster adaptive gradient descent ascent methods for nonconvex strongly-concave minimax problems.
We show that our method reaches a lower sample complexity of $O(\kappa^{2.5}\epsilon^{-3})$ with a mini-batch size of $O(\kappa)$.
arXiv Detail & Related papers (2021-06-30T14:47:09Z) - Gradient Free Minimax Optimization: Variance Reduction and Faster
Convergence [120.9336529957224]
In this paper, we study the nonconvex strongly-concave setting of gradient-free minimax optimization problems.
We show that a novel zeroth-order variance-reduced gradient descent ascent algorithm achieves the best-known query complexity.
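The core primitive behind such gradient-free methods is a gradient estimator built from function queries only; a minimal two-point (Gaussian smoothing) version is sketched below. The cited paper combines such estimates with variance reduction, which this sketch does not attempt to reproduce, and the parameter choices here are illustrative assumptions.

```python
# Generic two-point zeroth-order gradient estimator (Gaussian smoothing).
import numpy as np

def zo_gradient(f, x, mu=1e-3, num_dirs=20, rng=None):
    """Estimate grad f(x) from function values only by averaging
    (f(x + mu*u) - f(x)) / mu * u over random Gaussian directions u."""
    rng = rng or np.random.default_rng()
    g = np.zeros_like(x)
    for _ in range(num_dirs):
        u = rng.standard_normal(x.shape[0])
        g += (f(x + mu * u) - f(x)) / mu * u
    return g / num_dirs

if __name__ == "__main__":
    f = lambda x: 0.5 * np.sum(x ** 2)   # true gradient is x itself
    x = np.array([1.0, -2.0, 3.0])
    est = zo_gradient(f, x, num_dirs=5000, rng=np.random.default_rng(0))
    print(est)   # close to [1, -2, 3] up to smoothing and sampling error
```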
arXiv Detail & Related papers (2020-06-16T17:55:46Z) - Towards Better Understanding of Adaptive Gradient Algorithms in
Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper, we analyze a variant of the Optimistic Adagrad (OAdagrad) algorithm for nonconvex-nonconcave minimax problems.
Our experiments show that the advantage of adaptive gradient algorithms over non-adaptive ones in GAN training can be observed empirically.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)
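As a reminder of what "gradient updates using the history of gradients" means in practice, the sketch below shows a bare Adagrad-style accumulator; the optimistic variant analyzed in the paper above adds a prediction step that is not reproduced here, and the learning rate and toy objective are illustrative assumptions.

```python
# Bare Adagrad-style update: steps are scaled by accumulated squared gradients.
import numpy as np

class Adagrad:
    def __init__(self, dim, lr=0.5, eps=1e-8):
        self.lr, self.eps = lr, eps
        self.hist = np.zeros(dim)              # running sum of squared gradients

    def step(self, x, grad):
        self.hist += grad ** 2
        return x - self.lr * grad / (np.sqrt(self.hist) + self.eps)

if __name__ == "__main__":
    opt = Adagrad(dim=2)
    x = np.array([3.0, -4.0])
    for _ in range(1000):
        x = opt.step(x, 2 * x)                 # gradient of ||x||^2
    print(x)                                   # moves toward the minimizer at 0
```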
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.