Newton-type Methods for Minimax Optimization
- URL: http://arxiv.org/abs/2006.14592v2
- Date: Thu, 11 Feb 2021 01:54:34 GMT
- Title: Newton-type Methods for Minimax Optimization
- Authors: Guojun Zhang, Kaiwen Wu, Pascal Poupart and Yaoliang Yu
- Abstract summary: We propose two novel Newton-type algorithms for nonconvex-nonconcave minimax optimization.
We prove their local convergence at strict local minimax points, which are surrogates of global solutions.
- Score: 37.58722381375258
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Differential games, in particular two-player sequential zero-sum games
(a.k.a. minimax optimization), have been an important modeling tool in applied
science and received renewed interest in machine learning due to many recent
applications, such as adversarial training, generative models and reinforcement
learning. However, existing theory mostly focuses on convex-concave functions
with few exceptions. In this work, we propose two novel Newton-type algorithms
for nonconvex-nonconcave minimax optimization. We prove their local convergence
at strict local minimax points, which are surrogates of global solutions. We
argue that our Newton-type algorithms nicely complement existing ones in that
(a) they converge faster to strict local minimax points; (b) they are much more
effective when the problem is ill-conditioned; (c) their computational
complexity remains similar. We verify the effectiveness of our Newton-type
algorithms through experiments on training GANs which are intrinsically
nonconvex and ill-conditioned.
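The abstract argues that Newton-type updates handle ill-conditioning better than first-order gradient descent-ascent. The toy comparison below is a minimal, hypothetical sketch of that idea on a convex-concave quadratic game, contrasting plain gradient descent-ascent with a "gradient step on x, Newton step on y" update; the objective, step sizes, and iteration counts are assumptions for illustration and this is not a reproduction of the authors' exact algorithms.

```python
import numpy as np

# Hypothetical convex-concave quadratic test problem (an assumption for illustration):
#   f(x, y) = 0.5 * x^T A x + x^T B y - 0.5 * y^T C y,
# whose unique minimax point is (0, 0). The inner maximization over y is
# deliberately ill-conditioned (via C) to mimic the setting discussed above.
A = np.eye(2)                              # well-conditioned outer problem
C = np.diag([1.0, 1000.0])                 # ill-conditioned inner problem
B = np.array([[0.5, -0.3], [0.2, 0.4]])    # coupling between the two players

def grad_x(x, y):
    return A @ x + B @ y

def grad_y(x, y):
    return B.T @ x - C @ y

def gda(x, y, lr=1e-3, iters=2000):
    """Simultaneous gradient descent-ascent; the step size is capped by the ill-conditioned y-block."""
    for _ in range(iters):
        x, y = x - lr * grad_x(x, y), y + lr * grad_y(x, y)
    return x, y

def newton_type(x, y, lr=0.5, iters=100):
    """Gradient step on x, Newton step on y.

    The inner Hessian d2f/dy2 = -C is negative definite, so the Newton step
    y - H_yy^{-1} grad_y jumps (here, exactly) to the inner maximizer, and the
    outer step size no longer depends on the conditioning of C.
    """
    H_yy = -C                              # constant for this quadratic toy problem
    for _ in range(iters):
        x = x - lr * grad_x(x, y)
        y = y - np.linalg.solve(H_yy, grad_y(x, y))
    return x, y

x0 = y0 = np.ones(2)
print("GDA         distance to (0, 0):", np.linalg.norm(np.concatenate(gda(x0, y0))))
print("Newton-type distance to (0, 0):", np.linalg.norm(np.concatenate(newton_type(x0, y0))))
```

On this toy problem the Newton step solves the inner maximization exactly, so the combined iteration contracts at a rate set by the well-conditioned outer problem, while gradient descent-ascent is throttled by the large curvature in y.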
Related papers
- Shuffling Gradient-Based Methods for Nonconvex-Concave Minimax Optimization [20.093236438944718]
We develop novel gradient-based methods for tackling nonconvex-concave minimax problems.
We show that the new methods achieve results comparable to those of other methods.
arXiv Detail & Related papers (2024-10-29T17:47:22Z)
- Unified Convergence Theory of Stochastic and Variance-Reduced Cubic Newton Methods [37.1630298053787]
We propose a new framework, which we call the helper framework.
It provides a unified view of stochastic and variance-reduced second-order algorithms equipped with global complexity guarantees.
arXiv Detail & Related papers (2023-02-23T12:18:28Z)
- Adaptive Federated Minimax Optimization with Lower Complexities [82.51223883622552]
We propose an efficient adaptive minimax optimization algorithm (i.e., AdaFGDA) to solve these minimax problems.
It builds on momentum-based variance-reduced and local-SGD techniques, and it flexibly incorporates various adaptive learning rates.
arXiv Detail & Related papers (2022-11-14T12:32:18Z)
- Minimax Optimization: The Case of Convex-Submodular [50.03984152441271]
Minimax problems extend beyond the continuous domain to mixed continuous-discrete domains or even fully discrete domains.
We introduce the class of convex-submodular minimax problems, where the objective is convex with respect to the continuous variable and submodular with respect to the discrete variable.
Our proposed algorithms are iterative and combine tools from both discrete and continuous optimization.
arXiv Detail & Related papers (2021-11-01T21:06:35Z)
- Newton-LESS: Sparsification without Trade-offs for the Sketched Newton Update [88.73437209862891]
In second-order optimization, a potential bottleneck can be computing the Hessian matrix of the optimized function at every iteration.
We show that the Gaussian sketching matrix can be drastically sparsified, significantly reducing the computational cost of sketching.
We prove that Newton-LESS enjoys nearly the same problem-independent local convergence rate as Gaussian embeddings (a toy sketched-Newton illustration appears after this list).
arXiv Detail & Related papers (2021-07-15T17:33:05Z)
- Minimax Optimization with Smooth Algorithmic Adversaries [59.47122537182611]
We propose a new algorithm for the min-player against smooth algorithms deployed by an adversary.
Our algorithm is guaranteed to make monotonic progress (having no limit cycles) and to converge after an appropriate number of gradient ascent steps.
arXiv Detail & Related papers (2021-06-02T22:03:36Z)
- Why Do Local Methods Solve Nonconvex Problems? [54.284687261929115]
Nonconvex optimization is ubiquitous in modern machine learning.
We rigorously formalize it for instances of machine learning problems.
We hypothesize a unified explanation for this phenomenon.
arXiv Detail & Related papers (2021-03-24T19:34:11Z)
- Hybrid Variance-Reduced SGD Algorithms For Nonconvex-Concave Minimax Problems [26.24895953952318]
We develop an algorithm to solve a class of nonconvex-concave minimax problems.
It can also work with either single-sample or mini-batch derivative estimates.
arXiv Detail & Related papers (2020-06-27T03:05:18Z)
- On Newton Screening [14.040371216692645]
We develop a new screening method called Newton screening (NS) which is a generalized Newton method with a built-in screening mechanism.
We show that NS possesses an optimal convergence property in the sense that it achieves one-step local convergence.
arXiv Detail & Related papers (2020-01-27T11:25:33Z)
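Regarding the Newton-LESS entry above: the core idea of a sketched Newton step is to build the Hessian from randomly compressed data instead of the full data matrix. The snippet below is a minimal, hypothetical illustration on a ridge-regression problem, comparing a dense Gaussian sketch with a crude sparse sketch; the problem, sketch size, and sparsification scheme are assumptions for illustration and are not the paper's leverage-score-sparsified (LESS) embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ridge-regression problem (an assumption for illustration):
#   f(w) = 0.5 * ||X w - b||^2 + 0.5 * lam * ||w||^2
# The exact Newton step needs the full Hessian X^T X + lam * I; a sketched step
# builds the Hessian from the compressed data S @ X, where S is m x n with m << n.
n, d, lam = 5000, 50, 1e-2
X = rng.standard_normal((n, d)) * rng.uniform(0.1, 10.0, size=d)  # columns on very different scales
b = X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

def gradient(w):
    # Exact gradient of the ridge objective (only the Hessian is sketched).
    return X.T @ (X @ w - b) + lam * w

def sketched_newton_step(w, S):
    # One Newton-sketch step: exact gradient, approximate Hessian from S @ X.
    SX = S @ X
    H_sketch = SX.T @ SX + lam * np.eye(d)
    return w - np.linalg.solve(H_sketch, gradient(w))

m = 8 * d  # sketch size, an assumed choice

# Dense Gaussian sketch (the classical baseline).
S_gauss = rng.standard_normal((m, n)) / np.sqrt(m)

# A crude sparse sketch: each row touches only a few data points.
# (A generic stand-in for sparsified embeddings, not the paper's LESS construction.)
nnz = 8
S_sparse = np.zeros((m, n))
for i in range(m):
    cols = rng.choice(n, size=nnz, replace=False)
    S_sparse[i, cols] = rng.choice([-1.0, 1.0], size=nnz) * np.sqrt(n / (m * nnz))

print("gradient norm at w = 0:", np.linalg.norm(gradient(np.zeros(d))))
for name, S in [("Gaussian sketch", S_gauss), ("sparse sketch  ", S_sparse)]:
    w_new = sketched_newton_step(np.zeros(d), S)
    print(name, "- gradient norm after one sketched Newton step:", np.linalg.norm(gradient(w_new)))
```

With a sketch size of a few multiples of d, both sketched Hessians typically capture enough curvature for a single step to remove most of the gradient; the entry above summarizes the point that sparsifying the sketch can make it much cheaper to apply without sacrificing the local convergence rate.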