Accelerated Proximal Alternating Gradient-Descent-Ascent for Nonconvex
Minimax Machine Learning
- URL: http://arxiv.org/abs/2112.11663v1
- Date: Wed, 22 Dec 2021 04:33:27 GMT
- Title: Accelerated Proximal Alternating Gradient-Descent-Ascent for Nonconvex
Minimax Machine Learning
- Authors: Ziyi Chen, Shaocong Ma, Yi Zhou
- Abstract summary: Alternating gradient-descent-ascent (AltGDA) is an optimization algorithm that has been widely used for model training in various machine learning applications.
In this paper, we develop a single-loop and fast AltGDA-type algorithm to solve nonconvex minimax optimization problems.
- Score: 12.069630105460766
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Alternating gradient-descent-ascent (AltGDA) is an optimization algorithm
that has been widely used for model training in various machine learning
applications, which aim to solve a nonconvex minimax optimization problem.
However, the existing studies show that it suffers from a high computation
complexity in nonconvex minimax optimization. In this paper, we develop a
single-loop and fast AltGDA-type algorithm that leverages proximal gradient
updates and momentum acceleration to solve regularized nonconvex minimax
optimization problems. By identifying the intrinsic Lyapunov function of this
algorithm, we prove that it converges to a critical point of the nonconvex
minimax optimization problem and achieves a computation complexity
$\mathcal{O}(\kappa^{1.5}\epsilon^{-2})$, where $\epsilon$ is the desired level
of accuracy and $\kappa$ is the problem's condition number. Such a computation
complexity improves the state-of-the-art complexities of single-loop GDA and
AltGDA algorithms (see the summary of comparison in Table 1). We demonstrate
the effectiveness of our algorithm via an experiment on adversarial deep
learning.
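For intuition about the algorithmic template described in the abstract, the following is a minimal NumPy sketch of a single-loop proximal alternating GDA iteration with momentum acceleration on the minimization variable, applied to a regularized problem of the form $\min_x \max_y f(x,y) + \lambda\|x\|_1$. The step sizes, momentum parameter, $\ell_1$ regularizer, and the toy objective in the usage example are illustrative assumptions, not the paper's exact updates or tuning.

```python
import numpy as np

def prox_l1(v, thresh):
    # Soft-thresholding: proximal operator of thresh * ||.||_1 (illustrative regularizer).
    return np.sign(v) * np.maximum(np.abs(v) - thresh, 0.0)

def proximal_alt_gda(grad_x, grad_y, x0, y0, eta_x=0.01, eta_y=0.5,
                     beta=0.5, lam=1e-3, iters=2000):
    """Single-loop proximal AltGDA sketch with momentum on x.

    Approximately solves min_x max_y f(x, y) + lam * ||x||_1, where
    grad_x and grad_y return the partial gradients of f.
    Hyperparameters are illustrative placeholders.
    """
    x, y = x0.copy(), y0.copy()
    x_prev = x.copy()
    for _ in range(iters):
        # Momentum extrapolation on the min variable.
        x_tilde = x + beta * (x - x_prev)
        x_prev = x.copy()
        # Proximal gradient-descent step on x, using the current y (alternating order).
        x = prox_l1(x_tilde - eta_x * grad_x(x_tilde, y), eta_x * lam)
        # Gradient-ascent step on y, using the freshly updated x.
        y = y + eta_y * grad_y(x, y)
    return x, y

# Toy usage: f(x, y) = x^T A y - 0.5 * ||y||^2 is strongly concave in y,
# so the regularized outer problem is minimized at x = 0.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 5))
    gx = lambda x, y: A @ y            # partial gradient of f in x
    gy = lambda x, y: A.T @ x - y      # partial gradient of f in y
    x_star, y_star = proximal_alt_gda(gx, gy, rng.standard_normal(5), rng.standard_normal(5))
    print(np.linalg.norm(x_star), np.linalg.norm(y_star))  # the norm of x should shrink toward 0
```

Note that the ascent step size is taken larger than the descent step size in this sketch, mirroring the usual timescale separation governed by the condition number $\kappa$ in nonconvex-strongly-concave analyses.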
Related papers
- Obtaining Lower Query Complexities through Lightweight Zeroth-Order Proximal Gradient Algorithms [65.42376001308064]
We propose two variance-reduced zeroth-order (ZO) gradient estimators for nonconvex problems (a generic two-point ZO estimator is sketched after the related-papers list below).
We improve the state-of-the-art function complexities from $\mathcal{O}\left(\min\left\{\frac{d n^{1/2}}{\epsilon^{2}}, \frac{d}{\epsilon^{3}}\right\}\right)$ to $\tilde{\mathcal{O}}\left(\frac{d}{\epsilon^{2}}\right)$.
arXiv Detail & Related papers (2024-10-03T15:04:01Z) - Accelerated Gradient Algorithms with Adaptive Subspace Search for
Instance-Faster Optimization [6.896308995955336]
Gradient-based minimax optimal algorithms have promoted the development of continuous optimization and machine learning.
In this paper, we open up a new way to design and analyze gradient-based algorithms with direct applications in machine learning.
arXiv Detail & Related papers (2023-12-06T01:16:10Z) - Stochastic Optimization for Non-convex Problem with Inexact Hessian
Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization using cubics (ARC) have proven to have some very appealing theoretical properties.
We show that TR and ARC methods can simultaneously accommodate inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z) - Adaptive Federated Minimax Optimization with Lower Complexities [82.51223883622552]
We propose an efficient adaptive minimax optimization algorithm (i.e., AdaFGDA) to solve these minimax problems.
It builds on our momentum-based variance-reduced and local-SGD techniques, and it flexibly incorporates various adaptive learning rates.
arXiv Detail & Related papers (2022-11-14T12:32:18Z) - Train simultaneously, generalize better: Stability of gradient-based
minimax learners [12.691047660244331]
We show that the optimization algorithm plays a key role in the generalization performance of the trained minimax model under both convex-concave and nonconvex-nonconcave minimax settings.
We discuss several numerical results indicating the role of optimization algorithms in the generalization of learned minimax models.
arXiv Detail & Related papers (2020-10-23T17:44:43Z) - Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth
Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point each step.
Our results are expressed in the form of simultaneous primal- and dual-side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z) - Gradient Free Minimax Optimization: Variance Reduction and Faster
Convergence [120.9336529957224]
In this paper, we consider the nonconvex strongly-concave setting of gradient-free minimax optimization problems.
We show that a novel zeroth-order variance reduced descent algorithm achieves the best known query complexity.
arXiv Detail & Related papers (2020-06-16T17:55:46Z) - Zeroth-Order Algorithms for Nonconvex Minimax Problems with Improved
Complexities [21.13934071954103]
We present a deterministic algorithm for problems that are nonconvex in one variable and strongly-concave in the other.
We show that under the SGC assumption, the complexities of the algorithms match those of existing algorithms.
Results are presented in terms of oracle complexity for ZO-GDMSA, and numerical experiments are presented to support the theoretical results.
arXiv Detail & Related papers (2020-01-22T00:05:14Z) - Towards Better Understanding of Adaptive Gradient Algorithms in
Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper we analyze a variant of the Optimistic Adagrad algorithm for nonconvex-nonconcave minimax problems.
Our experiments empirically show that adaptive gradient algorithms can outperform non-adaptive gradient algorithms in GAN training.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)
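Several of the entries above (the lightweight zeroth-order proximal gradient work, the gradient-free minimax paper, and ZO-GDMSA) build on zeroth-order gradient estimation. As generic background rather than any single paper's estimator, here is a minimal NumPy sketch of the standard two-point random-direction estimator; the smoothing radius and the number of sampled directions are illustrative assumptions.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-4, num_dirs=10, rng=None):
    """Two-point zeroth-order gradient estimator (generic sketch).

    Approximates grad f(x) from function values only:
        g ~ (1/m) * sum_i  d * (f(x + mu*u_i) - f(x - mu*u_i)) / (2*mu) * u_i,
    where the u_i are uniform on the unit sphere in R^d.
    mu (smoothing radius) and num_dirs are illustrative hyperparameters.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = x.size
    g = np.zeros_like(x)
    for _ in range(num_dirs):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)          # uniform direction on the unit sphere
        g += d * (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u
    return g / num_dirs

# Quick check on f(z) = ||z||^2, whose true gradient is 2*z.
if __name__ == "__main__":
    x = np.array([1.0, -2.0, 0.5])
    est = zo_gradient(lambda z: float(z @ z), x, num_dirs=2000,
                      rng=np.random.default_rng(1))
    print(est, 2 * x)  # the estimate should be close to the true gradient
```

Variance-reduction schemes such as those referenced above reuse estimates across iterations or sample points to cut the number of function queries; their details differ per paper and are not reproduced here.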