Generalized Optimistic Methods for Convex-Concave Saddle Point Problems
- URL: http://arxiv.org/abs/2202.09674v2
- Date: Wed, 10 Jan 2024 05:05:20 GMT
- Title: Generalized Optimistic Methods for Convex-Concave Saddle Point Problems
- Authors: Ruichen Jiang, Aryan Mokhtari
- Abstract summary: The optimistic gradient method has seen increasing popularity for solving convex-concave saddle point problems.
We develop a backtracking line search scheme to select the step sizes without knowledge of the smoothness coefficients.
- Score: 24.5327016306566
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The optimistic gradient method has seen increasing popularity for solving
convex-concave saddle point problems. To analyze its iteration complexity, a
recent work [arXiv:1906.01115] proposed an interesting perspective that
interprets this method as an approximation to the proximal point method. In
this paper, we follow this approach and distill the underlying idea of optimism
to propose a generalized optimistic method, which includes the optimistic
gradient method as a special case. Our general framework can handle constrained
saddle point problems with composite objective functions and can work with
arbitrary norms using Bregman distances. Moreover, we develop a backtracking
line search scheme to select the step sizes without knowledge of the smoothness
coefficients. We instantiate our method with first-, second- and higher-order
oracles and give best-known global iteration complexity bounds. For our
first-order method, we show that the averaged iterates converge at a rate of
$O(1/N)$ when the objective function is convex-concave, and it achieves linear
convergence when the objective is strongly-convex-strongly-concave. For our
second- and higher-order methods, under the additional assumption that the
distance-generating function has Lipschitz gradient, we prove a complexity
bound of $O(1/\epsilon^\frac{2}{p+1})$ in the convex-concave setting and a
complexity bound of
$O((L_pD^\frac{p-1}{2}/\mu)^\frac{2}{p+1}+\log\log\frac{1}{\epsilon})$ in the
strongly-convex-strongly-concave setting, where $L_p$ ($p\geq 2$) is the
Lipschitz constant of the $p$-th-order derivative, $\mu$ is the strong
convexity parameter, and $D$ is the initial Bregman distance to the saddle
point. Moreover, our line search scheme provably only requires a constant
number of calls to a subproblem solver per iteration on average, making our
first- and second-order methods particularly amenable to implementation.
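For concreteness, the following is a minimal sketch of the optimistic gradient special case for a smooth, unconstrained convex-concave problem with Euclidean distances; a fixed step size eta stands in for the paper's backtracking line search, and the function names and bilinear example are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def optimistic_gradient(grad_x, grad_y, x0, y0, eta=0.1, num_iters=2000):
    """Optimistic gradient descent-ascent for min_x max_y f(x, y):
    z_{k+1} = z_k - eta * (2 F(z_k) - F(z_{k-1})), with F = (grad_x f, -grad_y f).
    """
    x, y = np.asarray(x0, dtype=float), np.asarray(y0, dtype=float)
    gx_prev, gy_prev = grad_x(x, y), grad_y(x, y)   # convention: F(z_{-1}) = F(z_0)
    x_sum, y_sum = np.zeros_like(x), np.zeros_like(y)
    for _ in range(num_iters):
        gx, gy = grad_x(x, y), grad_y(x, y)
        x = x - eta * (2.0 * gx - gx_prev)          # descent step in x
        y = y + eta * (2.0 * gy - gy_prev)          # ascent step in y
        gx_prev, gy_prev = gx, gy
        x_sum += x
        y_sum += y
    # return averaged iterates, matching the O(1/N) ergodic rate stated in the abstract
    return x_sum / num_iters, y_sum / num_iters

# Bilinear example f(x, y) = x^T A y, whose unique saddle point is the origin.
A = np.array([[1.0, 2.0], [-1.0, 1.0]])
grad_x = lambda x, y: A @ y       # gradient of f in x
grad_y = lambda x, y: A.T @ x     # gradient of f in y
x_bar, y_bar = optimistic_gradient(grad_x, grad_y, np.ones(2), np.ones(2), eta=0.1)
print(x_bar, y_bar)               # both averages approach zero
```

Plain gradient descent-ascent cycles or diverges on this bilinear example; the extra $-\eta(F(z_k) - F(z_{k-1}))$ correction is what lets the optimistic update track the proximal point trajectory.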
Related papers
- Revisiting Subgradient Method: Complexity and Convergence Beyond Lipschitz Continuity [24.45688490844496]
The subgradient method is one of the most fundamental algorithmic schemes for nonsmooth optimization.
In this work, we first extend the typical iteration complexity results for the subgradient method to cover non-Lipschitz convex and weakly convex minimization.
arXiv Detail & Related papers (2023-05-23T15:26:36Z) - A Newton-CG based barrier-augmented Lagrangian method for general nonconvex conic optimization [53.044526424637866]
In this paper, we consider finding an approximate second-order stationary point (SOSP) of general nonconvex conic optimization, which minimizes a twice differentiable function subject to a convex conic constraint.
In particular, we propose a Newton-CG based barrier-augmented Lagrangian method for finding an approximate SOSP.
arXiv Detail & Related papers (2023-01-10T20:43:29Z) - Min-Max Optimization Made Simple: Approximating the Proximal Point
Method via Contraction Maps [77.8999425439444]
We present a first-order method that admits near-optimal convergence rates for convex/concave min-max problems.
Our work is based on the fact that the update rule of the Proximal Point method can be approximated up to arbitrary accuracy.
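As a rough illustration of this idea (a sketch under assumed parameters, not the cited paper's exact scheme), each implicit proximal point step $z_{k+1} = z_k - \eta F(z_{k+1})$ can be approximated by a few iterations of the contraction map $w \mapsto z_k - \eta F(w)$, which converges whenever $\eta$ times the Lipschitz constant of $F$ is below one:

```python
import numpy as np

def approx_proximal_point(F, z0, eta=0.1, num_outer=500, num_inner=3):
    """Approximate proximal point iterations z_{k+1} = z_k - eta * F(z_{k+1})
    by a few fixed-point steps of the contraction map w -> z_k - eta * F(w).
    F is the (Lipschitz) saddle-point operator; eta and the iteration counts
    are placeholder choices for this sketch.
    """
    z = np.asarray(z0, dtype=float)
    for _ in range(num_outer):
        w = z.copy()
        for _ in range(num_inner):   # contraction-map approximation of the prox step
            w = z - eta * F(w)
        z = w                        # accept the approximate proximal point update
    return z

# Bilinear saddle point f(x, y) = x^T A y: F(x, y) = (A y, -A^T x), stacked as one vector.
A = np.array([[1.0, 2.0], [-1.0, 1.0]])
def F(z):
    x, y = z[:2], z[2:]
    return np.concatenate([A @ y, -A.T @ x])

print(approx_proximal_point(F, np.ones(4)))   # approaches the saddle point at the origin
```

With a single inner step this reduces to plain gradient descent-ascent, and with two it recovers the extra-gradient update; more inner steps track the exact proximal point trajectory more closely.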
arXiv Detail & Related papers (2023-01-10T12:18:47Z) - Explicit Second-Order Min-Max Optimization Methods with Optimal Convergence Guarantee [86.05440220344755]
We propose and analyze inexact regularized Newton-type methods for finding a global saddle point of convex-concave unconstrained min-max optimization problems.
We show that the proposed methods generate iterates that remain within a bounded set and converge to an $\epsilon$-saddle point within $O(\epsilon^{-2/3})$ iterations in terms of a restricted gap function.
arXiv Detail & Related papers (2022-10-23T21:24:37Z) - Accelerated first-order methods for convex optimization with locally
Lipschitz continuous gradient [0.0]
We first consider unconstrained convex optimization with locally Lipschitz continuous gradient (LLCG) and propose accelerated proximal gradient (APG) methods for solving it.
The proposed APG methods are equipped with a verifiable termination criterion and enjoy an operation complexity of $\mathcal{O}(\varepsilon^{-1/2}\log \varepsilon^{-1})$.
Preliminary numerical results are presented to demonstrate the performance of our proposed methods.
arXiv Detail & Related papers (2022-06-02T10:34:26Z) - A first-order primal-dual method with adaptivity to local smoothness [64.62056765216386]
We consider the problem of finding a saddle point for the convex-concave objective $\min_x \max_y f(x) + \langle Ax, y\rangle - g^*(y)$, where $f$ is a convex function with locally Lipschitz gradient and $g$ is convex and possibly non-smooth.
We propose an adaptive version of the Condat-Vu algorithm, which alternates between primal gradient steps and dual steps.
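As a sketch only, one common fixed-step form of the Condat-Vu iteration for this template alternates a primal gradient step with a dual proximal step; the adaptive step-size rule that is the cited paper's contribution is not reproduced here, and the example problem and parameter choices below are assumptions for illustration:

```python
import numpy as np

def primal_dual_sketch(grad_f, prox_g_conj, A, x0, y0, tau, sigma, num_iters=2000):
    """Fixed-step Condat-Vu-style iteration for min_x max_y f(x) + <Ax, y> - g*(y):
        x_{k+1} = x_k - tau * (grad_f(x_k) + A^T y_k)                 # primal gradient step
        y_{k+1} = prox_{sigma g*}(y_k + sigma * A (2 x_{k+1} - x_k))  # dual proximal step
    """
    x, y = np.asarray(x0, dtype=float), np.asarray(y0, dtype=float)
    for _ in range(num_iters):
        x_new = x - tau * (grad_f(x) + A.T @ y)
        y = prox_g_conj(y + sigma * (A @ (2.0 * x_new - x)), sigma)
        x = x_new
    return x, y

# Example: least-absolute-deviations min_x ||Ax - b||_1, written as
# min_x max_{||y||_inf <= 1} <Ax, y> - <b, y>, so f = 0 and
# g*(y) = <b, y> + indicator{||y||_inf <= 1}, whose prox is a shifted clip.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
grad_f = lambda x: np.zeros_like(x)
prox_g_conj = lambda v, s: np.clip(v - s * b, -1.0, 1.0)
step = 0.9 / np.linalg.norm(A, 2)             # ensures tau * sigma * ||A||^2 < 1
x_hat, _ = primal_dual_sketch(grad_f, prox_g_conj, A, np.zeros(5), np.zeros(20),
                              tau=step, sigma=step)
print(np.abs(A @ x_hat - b).sum())            # l1 residual after the run
```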
arXiv Detail & Related papers (2021-10-28T14:19:30Z) - Second-Order Information in Non-Convex Stochastic Optimization: Power
and Limitations [54.42518331209581]
We give an algorithm that finds an $\epsilon$-approximate stationary point (with $\|\nabla F(x)\| \le \epsilon$) and also consider finding $(\epsilon,\gamma)$-approximate second-order stationary points.
Our lower bounds here are novel even in the noiseless case.
arXiv Detail & Related papers (2020-06-24T04:41:43Z) - Complexity of Finding Stationary Points of Nonsmooth Nonconvex Functions [84.49087114959872]
We provide the first non-asymptotic analysis for finding stationary points of nonsmooth, nonconvex functions.
In particular, we study Hadamard semi-differentiable functions, perhaps the largest class of nonsmooth functions.
arXiv Detail & Related papers (2020-02-10T23:23:04Z) - Randomized Bregman Coordinate Descent Methods for Non-Lipschitz
Optimization [31.474280642125734]
A new randomized Bregman (block) coordinate descent (CD) method is proposed.
We show that the proposed method has iteration complexity $O(n\epsilon^{-2})$ to achieve an $\epsilon$-stationary point, where $n$ is the number of blocks of coordinates.
arXiv Detail & Related papers (2020-01-15T09:57:38Z)