The Complexity of Constrained Min-Max Optimization
- URL: http://arxiv.org/abs/2009.09623v1
- Date: Mon, 21 Sep 2020 05:54:12 GMT
- Title: The Complexity of Constrained Min-Max Optimization
- Authors: Constantinos Daskalakis and Stratis Skoulakis and Manolis Zampetakis
- Abstract summary: We show that an approximate local min-max point of large enough approximation is guaranteed to exist, but finding one is PPAD-complete.
More importantly, the same holds for computing an approximate fixed point of Gradient Descent/Ascent.
Our result is the first to show an exponential separation between these two fundamental optimization problems.
- Score: 29.57458485068705
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite its important applications in Machine Learning, min-max optimization
of nonconvex-nonconcave objectives remains elusive. Not only are there no known
first-order methods converging even to approximate local min-max points, but
the computational complexity of identifying them is also poorly understood. In
this paper, we provide a characterization of the computational complexity of
the problem, as well as of the limitations of first-order methods in
constrained min-max optimization problems with nonconvex-nonconcave objectives
and linear constraints.
As a warm-up, we show that, even when the objective is a Lipschitz and smooth
differentiable function, deciding whether a min-max point exists, in fact even
deciding whether an approximate min-max point exists, is NP-hard. More
importantly, we show that an approximate local min-max point of large enough
approximation is guaranteed to exist, but finding one such point is
PPAD-complete. The same is true of computing an approximate fixed point of
Gradient Descent/Ascent.
An important byproduct of our proof is to establish an unconditional hardness
result in the Nemirovsky-Yudin model. We show that, given oracle access to some
function $f : P \to [-1, 1]$ and its gradient $\nabla f$, where $P \subseteq
[0, 1]^d$ is a known convex polytope, every algorithm that finds an
$\varepsilon$-approximate local min-max point needs to make a number of queries
that is exponential in at least one of $1/\varepsilon$, $L$, $G$, or $d$, where
$L$ and $G$ are respectively the smoothness and Lipschitzness of $f$ and $d$ is
the dimension. This comes in sharp contrast to minimization problems, where
finding approximate local minima in the same setting can be done with Projected
Gradient Descent using $O(L/\varepsilon)$ many queries. Our result is the first
to show an exponential separation between these two fundamental optimization
problems.
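The following minimal Python sketch (not from the paper) illustrates the objects discussed in the abstract: the projected Gradient Descent/Ascent update on a box-constrained objective, and a stopping test for an approximate fixed point of that update. The test function, step size, and tolerance are illustrative assumptions.

```python
import numpy as np

def project_box(v, lo=0.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^d, a simple convex polytope P."""
    return np.clip(v, lo, hi)

# Illustrative nonconvex-nonconcave objective on [0, 1]^2 (an assumption of this sketch):
# f(x, y) = sin(2*pi*x) * sin(2*pi*y), which is smooth and Lipschitz on the box.
def grad_x(x, y):
    return 2 * np.pi * np.cos(2 * np.pi * x) * np.sin(2 * np.pi * y)

def grad_y(x, y):
    return 2 * np.pi * np.sin(2 * np.pi * x) * np.cos(2 * np.pi * y)

def projected_gda(x, y, eta=0.05, eps=1e-3, max_iters=10_000):
    """Run projected Gradient Descent/Ascent until an approximate fixed point is found.

    (x, y) is treated as an eps-approximate fixed point of the GDA map when the
    projected update moves it by less than eps.
    """
    for _ in range(max_iters):
        x_new = project_box(x - eta * grad_x(x, y))  # descend in x
        y_new = project_box(y + eta * grad_y(x, y))  # ascend in y
        if np.hypot(x_new - x, y_new - y) < eps:
            return x_new, y_new, True
        x, y = x_new, y_new
    return x, y, False  # GDA may cycle or stall; convergence is not guaranteed here

x_star, y_star, found = projected_gda(x=0.3, y=0.7)
print(f"(x, y) = ({x_star:.3f}, {y_star:.3f}), approximate fixed point found: {found}")
```

On nonconvex-nonconcave objectives this iteration may fail to settle, which is consistent with the paper's message: with only oracle access to $f$ and $\nabla f$, any method needs a number of queries exponential in at least one of $1/\varepsilon$, $L$, $G$, or $d$ to find an $\varepsilon$-approximate local min-max point, whereas Projected Gradient Descent finds approximate local minima of a minimization problem with only $O(L/\varepsilon)$ queries.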
Related papers
- Two-Timescale Gradient Descent Ascent Algorithms for Nonconvex Minimax Optimization [77.3396841985172]
We provide a unified analysis of two-timescale gradient descent ascent (TTGDA) algorithms for solving structured nonconvex minimax optimization problems.
Our contribution is to design TTGDA algorithms that are effective beyond the convex-concave setting.
arXiv Detail & Related papers (2024-08-21T20:14:54Z) - Single-Loop Stochastic Algorithms for Difference of Max-Structured Weakly Convex Functions [41.43895948769255]
We study a class of non-smooth non-convex problems of the form $\min_x [\max_{y \in Y} \phi(x, y) - \max_{z \in Z} \psi(x, z)]$.
We propose a stochastic Moreau envelope approximate gradient method (SMAG), the first method for solving these problems, and provide a state-of-the-art non-asymptotic convergence rate.
arXiv Detail & Related papers (2024-05-28T20:52:46Z) - The Computational Complexity of Finding Stationary Points in Non-Convex Optimization [53.86485757442486]
Finding approximate stationary points, i.e., points where the gradient is approximately zero, of non-convex but smooth objective functions is a fundamental computational problem.
We show that finding approximate KKT points in constrained optimization is reducible to finding approximate stationary points in unconstrained optimization, but the converse is impossible.
arXiv Detail & Related papers (2023-10-13T14:52:46Z) - A Newton-CG based barrier-augmented Lagrangian method for general nonconvex conic optimization [53.044526424637866]
In this paper we consider finding an approximate second-order stationary point (SOSP) of a general nonconvex conic optimization problem, i.e., minimizing a twice differentiable function subject to conic constraints.
In particular, we propose a Newton-CG based barrier-augmented Lagrangian method for finding an approximate SOSP.
arXiv Detail & Related papers (2023-01-10T20:43:29Z) - Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best policy identification in discounted linear Markov Decision Processes in the fixed confidence setting under a generative model.
The lower bound, expressed as the solution of an intricate non-convex optimization program, can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z) - Lifted Primal-Dual Method for Bilinearly Coupled Smooth Minimax
Optimization [47.27237492375659]
We study the bilinearly coupled minimax problem: $\min_x \max_y \; f(x) + y^\top A x - h(y)$, where $f$ and $h$ are both strongly convex smooth functions.
No known first-order algorithms have hitherto achieved the lower complexity bound of $\Omega\big(\big(\sqrt{\tfrac{L_x}{\mu_x}} + \tfrac{\|A\|}{\sqrt{\mu_x \mu_y}}\big) \log\tfrac{1}{\varepsilon}\big)$.
arXiv Detail & Related papers (2022-01-19T05:56:19Z) - Nonconvex-Nonconcave Min-Max Optimization with a Small Maximization
Domain [11.562923882714093]
We study the problem of finding approximate first-order stationary points in optimization problems of the form $\min_{x \in X} \max_{y \in Y} f(x, y)$.
Our approach relies upon replacing the function $f(x, \cdot)$ with its $k$-th order Taylor approximation (in $y$) and finding a near-stationary point in $Y$.
arXiv Detail & Related papers (2021-10-08T07:46:18Z) - Saddle Point Optimization with Approximate Minimization Oracle [8.680676599607125]
A major approach to saddle point optimization $\min_x \max_y f(x, y)$ is the gradient-based approach popularized by generative adversarial networks (GANs).
In contrast, we analyze an alternative approach relying only on an oracle that solves a minimization problem approximately.
Our approach locates approximate solutions $x'$ and $y'$ to $\min_{x'} f(x', y)$ and $\max_{y'} f(x, y')$ at a given point $(x, y)$ and updates $(x, y)$ toward these approximate solutions $(x', y')$; a minimal sketch of this update appears after the related-papers list.
arXiv Detail & Related papers (2021-03-29T23:03:24Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that, for both problems, for dimensions of $\frac{1}{\lambda\epsilon}$, one can obtain streaming algorithms with space polynomially smaller than $\frac{1}{\lambda\epsilon}$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z) - Greedy Adversarial Equilibrium: An Efficient Alternative to
Nonconvex-Nonconcave Min-Max Optimization [28.431572772564518]
We show that, when $f$ is Lipschitz and smooth, our algorithm converges from any starting point to an $\varepsilon$-greedy adversarial equilibrium.
The number of evaluations of $\nabla_y f(x, y)$ is polynomial in the dimension $d$, $1/\varepsilon$, and the Lipschitz bounds on $\nabla_y f(x, y)$ and $\nabla^2_y f(x, y)$.
arXiv Detail & Related papers (2020-06-22T16:03:41Z) - Complexity of Finding Stationary Points of Nonsmooth Nonconvex Functions [84.49087114959872]
We provide the first non-asymptotic analysis for finding stationary points of nonsmooth, nonconvex functions.
In particular, we study Hadamard semi-differentiable functions, perhaps the largest class of nonsmooth functions.
arXiv Detail & Related papers (2020-02-10T23:23:04Z)
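The minimization-oracle update summarized in the entry "Saddle Point Optimization with Approximate Minimization Oracle" can be illustrated with a short Python sketch. The objective below, the gradient-step imitation of the oracle, and all step sizes are assumptions of this sketch, not details taken from that paper.

```python
# Illustrative smooth saddle-point objective (an assumption of this sketch):
# f(x, y) = x*y + 0.1*x**2 - 0.1*y**2, strongly convex in x, strongly concave in y,
# with its unique saddle point at (0, 0).
df_dx = lambda x, y: y + 0.2 * x
df_dy = lambda x, y: x - 0.2 * y

def approx_min_oracle(x, y, inner_steps=50, lr=0.1):
    """Imitate the approximate minimization oracle with a few gradient steps:
    x' roughly solves min_{x'} f(x', y) and y' roughly solves max_{y'} f(x, y')."""
    x_p, y_p = x, y
    for _ in range(inner_steps):
        x_p -= lr * df_dx(x_p, y)  # minimize over x with y held fixed
        y_p += lr * df_dy(x, y_p)  # maximize over y with x held fixed
    return x_p, y_p

def oracle_based_saddle_search(x, y, eta=0.05, outer_steps=300):
    """Move (x, y) a fraction eta of the way toward the oracle's answers (x', y')."""
    for _ in range(outer_steps):
        x_p, y_p = approx_min_oracle(x, y)
        x, y = x + eta * (x_p - x), y + eta * (y_p - y)
    return x, y

print(oracle_based_saddle_search(x=1.0, y=-1.0))  # drifts toward the saddle point (0, 0)
```

For this convex-concave test function the small outer step size lets the iterates spiral in toward the saddle; on nonconvex-nonconcave objectives, no comparably simple worst-case guarantee should be expected, in light of the hardness results of the main paper above.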