Explicit Second-Order Min-Max Optimization Methods with Optimal Convergence Guarantee
- URL: http://arxiv.org/abs/2210.12860v4
- Date: Tue, 23 Apr 2024 15:40:29 GMT
- Title: Explicit Second-Order Min-Max Optimization Methods with Optimal Convergence Guarantee
- Authors: Tianyi Lin, Panayotis Mertikopoulos, Michael I. Jordan
- Abstract summary: We propose and analyze inexact regularized Newton-type methods for finding a global saddle point of \emph{convex-concave} unconstrained min-max optimization problems.
We show that the proposed methods generate iterates that remain within a bounded set and that the averaged iterates converge to an $\epsilon$-saddle point within $O(\epsilon^{-2/3})$ iterations in terms of a restricted gap function.
- Score: 86.05440220344755
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose and analyze several inexact regularized Newton-type methods for finding a global saddle point of \emph{convex-concave} unconstrained min-max optimization problems. Compared to first-order methods, our understanding of second-order methods for min-max optimization is relatively limited, as obtaining global rates of convergence with second-order information is much more involved. In this paper, we examine how second-order information can be used to speed up extra-gradient methods, even under inexactness. Specifically, we show that the proposed methods generate iterates that remain within a bounded set and that the averaged iterates converge to an $\epsilon$-saddle point within $O(\epsilon^{-2/3})$ iterations in terms of a restricted gap function. This matches the theoretically established lower bound in this context. We also provide a simple routine for solving the subproblem at each iteration, requiring a single Schur decomposition and $O(\log\log(1/\epsilon))$ calls to a linear system solver for a quasi-upper-triangular system. Thus, our method improves on existing line-search-based second-order min-max optimization methods by shaving off an $O(\log\log(1/\epsilon))$ factor in the required number of Schur decompositions. Finally, we present numerical experiments on synthetic and real data that demonstrate the efficiency of the proposed methods.
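To make the two-step structure concrete, here is a minimal sketch of a Newton-regularized extra-gradient loop on a toy convex-concave quadratic saddle problem. The fixed regularizer `lam`, the randomly generated problem, and the dense linear solve are illustrative assumptions; the paper's adaptive regularization and Schur-decomposition subroutine are not reproduced.

```python
import numpy as np

def make_problem(n=5, seed=0):
    """Toy convex-concave quadratic saddle problem
    L(x, y) = 0.5 x'Ax + x'By - 0.5 y'Cy with A, C positive definite."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, n)); A = A @ A.T + np.eye(n)   # convex in x
    C = rng.standard_normal((n, n)); C = C @ C.T + np.eye(n)   # concave in y
    B = rng.standard_normal((n, n))
    return A, B, C

def saddle_operator(z, A, B, C):
    """Monotone saddle operator F(z) = [grad_x L; -grad_y L] at z = (x, y)."""
    n = A.shape[0]
    x, y = z[:n], z[n:]
    return np.concatenate([A @ x + B @ y, -B.T @ x + C @ y])

def newton_extragradient(A, B, C, lam=1.0, iters=50):
    """Newton-type leading step plus an extra-gradient correction.
    Illustrative only: lam is a fixed regularizer and the subproblem is a
    dense linear solve rather than the paper's Schur-based routine."""
    n = A.shape[0]
    J = np.block([[A, B], [-B.T, C]])      # Jacobian of the (affine) operator
    z = np.ones(2 * n)
    for _ in range(iters):
        Fz = saddle_operator(z, A, B, C)
        d = np.linalg.solve(J + lam * np.eye(2 * n), -Fz)   # regularized Newton system
        z_half = z + d                                      # leading (half) step
        z = z - saddle_operator(z_half, A, B, C) / lam      # extra-gradient correction
    return z

A, B, C = make_problem()
z = newton_extragradient(A, B, C)
print(np.linalg.norm(saddle_operator(z, A, B, C)))  # operator residual near the saddle
```

The leading step solves a regularized Newton system at the current point, and the correction re-evaluates the operator at the half-point, mirroring the extra-gradient template discussed in the abstract.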
Related papers
- A Fully First-Order Method for Stochastic Bilevel Optimization [8.663726907303303]
We consider unconstrained bilevel optimization problems when only first-order gradient oracles are available.
We propose a Fully First-order Approximation (F2SA) method, and study its non-asymptotic convergence properties.
We demonstrate the superior practical performance of the proposed method over existing second-order-based approaches on MNIST data-hypercleaning experiments.
arXiv Detail & Related papers (2023-01-26T05:34:21Z) - A Newton-CG based barrier-augmented Lagrangian method for general nonconvex conic optimization [53.044526424637866]
In this paper we consider finding an approximate second-order stationary point (SOSP) of general nonconvex conic optimization, i.e., minimizing a twice differentiable function subject to conic constraints.
In particular, we propose a Newton-CG based barrier-augmented Lagrangian method for finding an approximate SOSP.
arXiv Detail & Related papers (2023-01-10T20:43:29Z) - DRSOM: A Dimension Reduced Second-Order Method [13.778619250890406]
Under a trust-region-like framework, our method preserves the convergence of second-order methods while using curvature information from only a few directions.
Theoretically, we show that the method has local convergence guarantees and a global convergence rate of $O(\epsilon^{-3/2})$ for satisfying the first-order and second-order conditions.
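As a rough illustration of the "few directions" idea, the sketch below takes a second-order step restricted to the span of the negative gradient and the previous displacement; the helper signatures `grad`/`hess_vec`, the fixed regularizer `reg`, and the quadratic test function are assumptions for illustration, not the paper's exact rules.

```python
import numpy as np

def reduced_second_order_step(grad, hess_vec, x, x_prev, reg=1e-6):
    """One dimension-reduced second-order step: minimize a local quadratic
    model of f restricted to span{-gradient, previous displacement}.
    `grad(x)` returns the gradient, `hess_vec(x, v)` a Hessian-vector product."""
    g = grad(x)
    V = np.stack([-g, x - x_prev], axis=1)                   # n x 2 subspace basis
    g_r = V.T @ g                                            # reduced gradient
    H_r = V.T @ np.column_stack([hess_vec(x, V[:, 0]),
                                 hess_vec(x, V[:, 1])])      # reduced 2x2 Hessian
    alpha = np.linalg.solve(H_r + reg * np.eye(2), -g_r)     # regularized subspace Newton step
    return x + V @ alpha

# Example on a convex quadratic f(x) = 0.5 x'Qx (gradient Qx, Hessian Q).
Q = np.diag([1.0, 10.0, 100.0])
grad = lambda x: Q @ x
hess_vec = lambda x, v: Q @ v
x_prev, x = np.ones(3), 0.9 * np.ones(3)
for _ in range(20):
    x_prev, x = x, reduced_second_order_step(grad, hess_vec, x, x_prev)
print(np.linalg.norm(grad(x)))   # gradient norm after the subspace steps (small on this quadratic)
```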
arXiv Detail & Related papers (2022-07-30T13:05:01Z) - A Newton-CG based barrier method for finding a second-order stationary
point of nonconvex conic optimization with complexity guarantees [0.5922488908114022]
We consider finding an approximate second-order stationary point (SOSP) of a problem that minimizes a twice differentiable function over the intersection of a closed convex cone and an affine subspace.
Our method is not only iteration-efficient but also achieves an operation complexity that matches the best known bound.
arXiv Detail & Related papers (2022-07-12T17:21:45Z) - Generalized Optimistic Methods for Convex-Concave Saddle Point Problems [24.5327016306566]
Optimistic methods have seen increasing popularity for solving convex-concave saddle point problems.
We develop a backtracking line search scheme to select the step sizes without knowledge of the smoothness coefficients.
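For intuition, here is a hedged sketch pairing an optimistic (past-gradient) update with a simple halving backtracking test based on a local Lipschitz estimate; the specific test and constants are illustrative stand-ins rather than the paper's line-search condition.

```python
import numpy as np

def optimistic_gradient(F, z0, iters=2000, eta=1.0, shrink=0.5, c=0.4):
    """Optimistic (past-gradient) updates with a simple backtracking rule:
    eta is halved until eta * ||F(z) - F(z_prev)|| <= c * ||z - z_prev||,
    a crude local Lipschitz test standing in for the paper's line search."""
    z_prev, F_prev = z0.copy(), F(z0)
    z = z0 - eta * F_prev                       # ordinary gradient step to initialize
    for _ in range(iters):
        Fz = F(z)
        while eta * np.linalg.norm(Fz - F_prev) > c * np.linalg.norm(z - z_prev) + 1e-12:
            eta *= shrink                       # backtrack: step size was too large
        z_next = z - eta * (2.0 * Fz - F_prev)  # optimistic update reuses the past operator value
        z_prev, F_prev, z = z, Fz, z_next
    return z

# Example: bilinear saddle L(x, y) = x' M y, with operator F(z) = [M y; -M' x].
M = np.diag([1.0, 1.5, 2.0])                    # simple coupling with known spectrum
F = lambda z: np.concatenate([M @ z[3:], -M.T @ z[:3]])
print(np.linalg.norm(F(optimistic_gradient(F, np.ones(6)))))  # residual shrinks toward zero here
```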
arXiv Detail & Related papers (2022-02-19T20:31:05Z) - Adaptive extra-gradient methods for min-max optimization and games [35.02879452114223]
We present a new family of min-max optimization algorithms that automatically exploit the geometry of the gradient data observed at earlier iterations.
Thanks to this adaptation mechanism, the proposed method automatically detects whether the problem is smooth or not.
It converges to an $\varepsilon$-optimal solution within $\mathcal{O}(1/\varepsilon)$ iterations in smooth problems, and within $\mathcal{O}(1/\varepsilon^2)$ iterations in non-smooth ones.
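A minimal sketch of an adaptive extra-gradient loop, assuming an AdaGrad-style step size that accumulates squared differences of the operator values; the exact adaptation rule and averaging scheme of the paper are not reproduced.

```python
import numpy as np

def adaptive_extragradient(F, z0, iters=5000):
    """Extra-gradient loop with an adaptive step size: gamma is
    1/sqrt(1 + accumulated squared operator differences), so it shrinks only
    as much as the observed gradient data requires."""
    z = z0.copy()
    accum = 0.0
    for _ in range(iters):
        gamma = 1.0 / np.sqrt(1.0 + accum)
        Fz = F(z)
        z_half = z - gamma * Fz            # leading (exploration) step
        F_half = F(z_half)
        z = z - gamma * F_half             # extra-gradient update at the half-point
        accum += np.linalg.norm(F_half - Fz) ** 2
    # Returning the last iterate for simplicity (analyses of such methods often use averaged iterates).
    return z

# Example: bilinear saddle L(x, y) = x' M y, operator F(z) = [M y; -M' x].
M = np.diag([0.6, 0.8, 1.0, 1.2])
F = lambda z: np.concatenate([M @ z[4:], -M.T @ z[:4]])
print(np.linalg.norm(F(adaptive_extragradient(F, np.ones(8)))))  # operator residual at the returned point
```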
arXiv Detail & Related papers (2020-10-22T22:54:54Z) - Second-Order Information in Non-Convex Stochastic Optimization: Power
and Limitations [54.42518331209581]
We give an algorithm that finds an $\epsilon$-approximate stationary point (with $\|\nabla F(x)\|\le \epsilon$), as well as guarantees for finding $(\epsilon,\gamma)$-approximate second-order stationary points.
Our lower bounds here are novel even in the noiseless case.
arXiv Detail & Related papers (2020-06-24T04:41:43Z) - Gradient Free Minimax Optimization: Variance Reduction and Faster
Convergence [120.9336529957224]
In this paper, we study gradient-free (zeroth-order) minimax optimization in the nonconvex-strongly-concave setting.
We show that a novel zeroth-order variance-reduced descent algorithm achieves the best known query complexity.
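To illustrate what a zeroth-order (gradient-free) query looks like, below is a sketch of the basic two-point Gaussian-smoothing gradient estimator that such methods build on; the sample count and smoothing radius are arbitrary choices, and the paper's variance-reduction scheme is not shown.

```python
import numpy as np

def zeroth_order_grad(f, x, mu=1e-4, num_dirs=20, rng=None):
    """Two-point Gaussian-smoothing gradient estimator: averages finite
    differences of f along random directions, using only function values."""
    if rng is None:
        rng = np.random.default_rng()
    g = np.zeros_like(x)
    for _ in range(num_dirs):
        u = rng.standard_normal(x.shape)
        g += (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u
    return g / num_dirs

# Sanity check against the analytic gradient of f(x) = ||x||^2 (gradient 2x).
x0 = np.array([1.0, -2.0, 0.5])
print(zeroth_order_grad(lambda v: v @ v, x0, num_dirs=2000), 2 * x0)
```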
arXiv Detail & Related papers (2020-06-16T17:55:46Z) - Second-order Conditional Gradient Sliding [79.66739383117232]
We present the \emph{Second-Order Conditional Gradient Sliding} (SOCGS) algorithm.
The SOCGS algorithm converges quadratically in primal gap after a finite number of linearly convergent iterations.
It is useful when the feasible region can only be accessed efficiently through a linear optimization oracle.
arXiv Detail & Related papers (2020-02-20T17:52:18Z) - Nonconvex Zeroth-Order Stochastic ADMM Methods with Lower Function Query
Complexity [109.54166127479093]
Zeroth-order (a.k.a. derivative-free) methods are a class of effective optimization methods for solving machine learning problems.
In this paper, we propose a class of faster zeroth-order stochastic alternating direction method of multipliers (ADMM) methods to solve nonconvex finite-sum problems.
We show that the proposed methods achieve a lower function query complexity for finding an $\epsilon$-stationary point.
At the same time, we propose a class of faster zeroth-order online ADMM methods.
arXiv Detail & Related papers (2019-07-30T02:21:43Z)