Zeroth-Order Algorithms for Nonconvex Minimax Problems with Improved
Complexities
- URL: http://arxiv.org/abs/2001.07819v2
- Date: Mon, 4 Apr 2022 23:36:08 GMT
- Title: Zeroth-Order Algorithms for Nonconvex Minimax Problems with Improved
Complexities
- Authors: Zhongruo Wang, Krishnakumar Balasubramanian, Shiqian Ma, Meisam
Razaviyayn
- Abstract summary: We present deterministic and stochastic zeroth-order algorithms for minimax problems that are nonconvex in one variable and strongly-concave in the other.
We show that under the SGC assumption, the complexities of the stochastic algorithms match those of the deterministic algorithms.
Results are presented in terms of oracle complexity for \texttt{ZO-GDA} and \texttt{ZO-GDMSA}, and numerical experiments are presented to support the theoretical results.
- Score: 21.13934071954103
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we study zeroth-order algorithms for minimax optimization
problems that are nonconvex in one variable and strongly-concave in the other
variable. Such minimax optimization problems have attracted significant
attention lately due to their applications in modern machine learning tasks. We
first consider a deterministic version of the problem. We design and analyze
the Zeroth-Order Gradient Descent Ascent (\texttt{ZO-GDA}) algorithm, and
provide improved results compared to existing works, in terms of oracle
complexity. We also propose the Zeroth-Order Gradient Descent Multi-Step Ascent
(\texttt{ZO-GDMSA}) algorithm that significantly improves the oracle complexity
of \texttt{ZO-GDA}. We then consider stochastic versions of \texttt{ZO-GDA} and
\texttt{ZO-GDMSA}, to handle stochastic nonconvex minimax problems. For this
case, we provide oracle complexity results under two assumptions on the
stochastic gradient: (i) the uniformly bounded variance assumption, which is
common in traditional stochastic optimization, and (ii) the Strong Growth
Condition (SGC), which has been known to be satisfied by modern
over-parametrized machine learning models. We establish that under the SGC
assumption, the complexities of the stochastic algorithms match that of
deterministic algorithms. Numerical experiments are presented to support our
theoretical results.
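The core idea behind \texttt{ZO-GDA} can be illustrated with a short sketch: estimate gradients from function evaluations only (here via a Gaussian-smoothing two-point estimator), then alternate a descent step in $x$ with an ascent step in $y$. This is a minimal illustrative sketch, not the paper's exact algorithm; the function names (`zo_gradient`, `zo_gda`), step sizes, sample counts, and the toy objective are all assumptions chosen for demonstration.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-4, n_samples=20, rng=None):
    """Gaussian-smoothing zeroth-order gradient estimate:
    g ~ (1/n) * sum_i [(f(x + mu*u_i) - f(x)) / mu] * u_i,  u_i ~ N(0, I)."""
    rng = np.random.default_rng() if rng is None else rng
    fx = f(x)
    g = np.zeros_like(x)
    for _ in range(n_samples):
        u = rng.standard_normal(x.shape)
        g += (f(x + mu * u) - fx) / mu * u
    return g / n_samples

def zo_gda(f, x0, y0, eta_x=0.05, eta_y=0.1, iters=300, rng=None):
    """ZO-GDA sketch: one zeroth-order descent step in x and one ascent
    step in y per iteration, using only evaluations of f(x, y)."""
    rng = np.random.default_rng(0) if rng is None else rng
    x, y = x0.astype(float).copy(), y0.astype(float).copy()
    for _ in range(iters):
        gx = zo_gradient(lambda xv: f(xv, y), x, rng=rng)
        gy = zo_gradient(lambda yv: f(x, yv), y, rng=rng)
        x = x - eta_x * gx   # descent on the minimization variable
        y = y + eta_y * gy   # ascent on the (strongly-concave) maximization variable
    return x, y

# Toy saddle problem, strongly concave in y: f(x, y) = 0.5||x||^2 + x.y - 0.5||y||^2,
# whose unique saddle point is the origin.
f = lambda x, y: 0.5 * x @ x + x @ y - 0.5 * y @ y
x_star, y_star = zo_gda(f, np.array([1.0, -1.0]), np.array([-1.0, 1.0]))
```

On this toy problem the iterates approach the saddle point at the origin; \texttt{ZO-GDMSA} would instead run several ascent steps in $y$ per descent step in $x$, which is what improves the oracle complexity in the paper.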
Related papers
- Decentralized Riemannian Algorithm for Nonconvex Minimax Problems [82.50374560598493]
Minimax algorithms for neural networks have been developed to solve many problems.
In this paper, we propose two types of decentralized minimax algorithms.
For the stochastic setting, we propose DRSGDA and prove a gradient-complexity bound for our method.
arXiv Detail & Related papers (2023-02-08T01:42:45Z) - Optimal Algorithms for Stochastic Complementary Composite Minimization [55.26935605535377]
Inspired by regularization techniques in statistics and machine learning, we study complementary composite minimization.
We provide novel excess risk bounds, both in expectation and with high probability.
Our algorithms are nearly optimal, which we prove via novel lower complexity bounds for this class of problems.
arXiv Detail & Related papers (2022-11-03T12:40:24Z) - Accelerated Proximal Alternating Gradient-Descent-Ascent for Nonconvex
Minimax Machine Learning [12.069630105460766]
Alternating gradient descent ascent (AltGDA) is an optimization algorithm that has been widely used for model training in various machine learning applications.
In this paper, we develop a single-loop fast proximal alternating gradient-descent-ascent algorithm to solve nonconvex minimax optimization problems.
arXiv Detail & Related papers (2021-12-22T04:33:27Z) - Randomized Stochastic Variance-Reduced Methods for Stochastic Bilevel
Optimization [62.87181271021217]
We consider nonconvex stochastic bilevel optimization (SBO) problems that have many applications in machine learning.
This paper proposes fast randomized variance-reduced algorithms for nonconvex SBO problems.
arXiv Detail & Related papers (2021-05-05T18:28:42Z) - Distributed Stochastic Consensus Optimization with Momentum for
Nonconvex Nonsmooth Problems [45.88640334610308]
This paper presents a new distributed optimization algorithm for nonconvex nonsmooth problems.
We show that the proposed algorithm can achieve low communication overhead.
Experiments are presented to illustrate the effectiveness of the proposed algorithms.
arXiv Detail & Related papers (2020-11-10T13:12:21Z) - Stochastic Multi-level Composition Optimization Algorithms with
Level-Independent Convergence Rates [12.783783498844022]
We study smooth multi-level composition optimization problems, where the objective function is a nested composition of $T$ functions.
We show that the first algorithm, which is a generalization of [GhaRuswan20] to the $T$-level case, can achieve a sample complexity of $\mathcal{O}(1/\epsilon^6)$.
This is the first time that an online algorithm designed for the (un)constrained multi-level setting obtains the same sample complexity under standard assumptions.
arXiv Detail & Related papers (2020-08-24T15:57:50Z) - Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth
Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point each step.
Our results are expressed in a form of simultaneous primal and dual side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z) - Hybrid Variance-Reduced SGD Algorithms For Nonconvex-Concave Minimax
Problems [26.24895953952318]
We develop an algorithm to solve a class of nonconvex-concave minimax problems.
It can also work with either single-sample or mini-batch gradient estimators.
arXiv Detail & Related papers (2020-06-27T03:05:18Z) - Gradient Free Minimax Optimization: Variance Reduction and Faster
Convergence [120.9336529957224]
In this paper, we study gradient-free minimax optimization problems in the nonconvex-strongly-concave setting.
We show that a novel zeroth-order variance reduced descent algorithm achieves the best known query complexity.
arXiv Detail & Related papers (2020-06-16T17:55:46Z) - Exploiting Higher Order Smoothness in Derivative-free Optimization and
Continuous Bandits [99.70167985955352]
We study the problem of zero-order optimization of a strongly convex function.
We consider a randomized approximation of the projected gradient descent algorithm.
Our results imply that the zero-order algorithm is nearly optimal in terms of sample complexity and the problem parameters.
arXiv Detail & Related papers (2020-06-14T10:42:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.