Greedy Adversarial Equilibrium: An Efficient Alternative to
Nonconvex-Nonconcave Min-Max Optimization
- URL: http://arxiv.org/abs/2006.12363v4
- Date: Tue, 4 May 2021 16:41:54 GMT
- Title: Greedy Adversarial Equilibrium: An Efficient Alternative to
Nonconvex-Nonconcave Min-Max Optimization
- Authors: Oren Mangoubi and Nisheeth K. Vishnoi
- Abstract summary: We propose the $\varepsilon$-greedy adversarial equilibrium model, in which $y^\star$ is an $\varepsilon$-approximate local maximum of $f(x^\star,\cdot)$ and $x^\star$ is an $\varepsilon$-approximate local minimum of a greedy approximation to $\max_z f(x, z)$, and give an algorithm that converges to such a point from any starting point.
The number of evaluations of $f$, $\nabla_y f(x,y)$, and $\nabla^2_y f(x,y)$ is polynomial in the dimension $d$, $1/\varepsilon$, and the bounds on $f$ and its Lipschitz constants.
- Score: 28.431572772564518
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Min-max optimization of an objective function $f: \mathbb{R}^d \times
\mathbb{R}^d \rightarrow \mathbb{R}$ is an important model for robustness in an
adversarial setting, with applications to many areas including optimization,
economics, and deep learning. In many applications $f$ may be
nonconvex-nonconcave, and finding a global min-max point may be computationally
intractable. There is a long line of work that seeks computationally tractable
algorithms for alternatives to the min-max optimization model. However, many of
the alternative models have solution points which are only guaranteed to exist
under strong assumptions on $f$, such as convexity, monotonicity, or special
properties of the starting point. We propose an optimization model, the
$\varepsilon$-greedy adversarial equilibrium, and show that it can serve as a
computationally tractable alternative to the min-max optimization model.
Roughly, we say that a point $(x^\star, y^\star)$ is an $\varepsilon$-greedy
adversarial equilibrium if $y^\star$ is an $\varepsilon$-approximate local
maximum for $f(x^\star,\cdot)$, and $x^\star$ is an $\varepsilon$-approximate
local minimum for a "greedy approximation" to the function $\max_z f(x, z)$
which can be efficiently estimated using second-order optimization algorithms.
We prove the existence of such a point for any smooth function which is bounded
and has Lipschitz Hessian. To prove existence, we introduce an algorithm that
converges from any starting point to an $\varepsilon$-greedy adversarial
equilibrium in a number of evaluations of the function $f$, the max-player's
gradient $\nabla_y f(x,y)$, and its Hessian $\nabla^2_y f(x,y)$, that is
polynomial in the dimension $d$, $1/\varepsilon$, and the bounds on $f$ and its
Lipschitz constant.
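The definition can be made concrete with a small numerical caricature. The sketch below is an illustration only, not the paper's algorithm: the objective f, the step sizes, and the first-order-only checks are assumptions made for brevity, and the plain gradient ascent stands in for the paper's second-order "greedy max" subroutine. The toy f is nonconvex in x but concave in y so that the naive loop settles; the paper's guarantees cover bounded smooth functions with Lipschitz Hessian in the harder nonconvex-nonconcave setting.

    import math

    def f(x, y):
        # Toy objective: nonconvex in x, concave in y (an illustrative assumption).
        return math.cos(x) + x * y - y ** 2

    def grad_x(x, y, h=1e-5):
        # Central finite difference for the min-player's partial gradient.
        return (f(x + h, y) - f(x - h, y)) / (2 * h)

    def grad_y(x, y, h=1e-5):
        # Central finite difference for the max-player's partial gradient.
        return (f(x, y + h) - f(x, y - h)) / (2 * h)

    def greedy_max(x, y0, step=0.1, iters=200):
        # Stand-in for the paper's "greedy approximation" to max_z f(x, z):
        # plain local gradient ascent in y from the current point y0.
        # The paper's version uses second-order steps to escape saddle points.
        y = y0
        for _ in range(iters):
            y += step * grad_y(x, y)
        return y

    def looks_like_greedy_equilibrium(x, y, eps=1e-2):
        # First-order caricature of the two defining conditions:
        #  (1) y is an approximate local maximum of f(x, .);
        #  (2) x is an approximate local minimum of g(x) = f(x, greedy_max(x, y)).
        # The actual definition also imposes second-order conditions, omitted here.
        def g(xx):
            return f(xx, greedy_max(xx, y))
        h = 1e-4
        cond_y = abs(grad_y(x, y)) <= eps
        cond_x = abs((g(x + h) - g(x - h)) / (2 * h)) <= eps
        return cond_y and cond_x

    # Alternate the greedy max in y with gradient descent in x.
    x, y = 2.5, 0.0
    for _ in range(300):
        y = greedy_max(x, y, iters=20)
        x -= 0.05 * grad_x(x, y)
    print((x, y), looks_like_greedy_equilibrium(x, y))

On this toy problem the loop settles near a point where both first-order conditions hold; in general, certifying and reaching an $\varepsilon$-greedy adversarial equilibrium requires the second-order machinery developed in the paper.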
Related papers
- Optimal Sketching for Residual Error Estimation for Matrix and Vector Norms [50.15964512954274]
We study the problem of residual error estimation for matrix and vector norms using a linear sketch.
We demonstrate that this gives a substantial advantage empirically, for roughly the same sketch size and accuracy as in previous work.
We also show an $\Omega(k^{2/p} n^{1-2/p})$ lower bound for the sparse recovery problem, which is tight up to a $\mathrm{poly}(\log n)$ factor.
arXiv Detail & Related papers (2024-08-16T02:33:07Z) - An Optimal Algorithm for Strongly Convex Min-min Optimization [79.11017157526815]
Existing optimal first-order methods require $\mathcal{O}(\sqrt{\max\{\kappa_x,\kappa_y\}} \log 1/\epsilon)$ computations of both $\nabla_x f(x,y)$ and $\nabla_y f(x,y)$.
We propose a new algorithm that only requires $\mathcal{O}(\sqrt{\kappa_x} \log 1/\epsilon)$ computations of $\nabla_x f(x,y)$.
arXiv Detail & Related papers (2022-12-29T19:26:12Z) - An Optimal Stochastic Algorithm for Decentralized Nonconvex Finite-sum
Optimization [25.21457349137344]
We prove that DEAREST requires at most $\mathcal{O}(mn + \sqrt{mn}\,L\varepsilon^{-2})$ incremental first-order oracle (IFO) calls and $\mathcal{O}(L\varepsilon^{-2}/\sqrt{1-\lambda_W})$ communication rounds.
arXiv Detail & Related papers (2022-10-25T11:37:11Z) - Nonconvex-Nonconcave Min-Max Optimization with a Small Maximization
Domain [11.562923882714093]
We study the problem of finding approximate first-order stationary points in optimization problems of the form $\min_{x \in X} \max_{y \in Y} f(x,y)$.
Our approach relies upon replacing the function $f(x,\cdot)$ with its $k$th-order Taylor approximation (in $y$) and finding a near-stationary point in $Y$.
arXiv Detail & Related papers (2021-10-08T07:46:18Z) - Saddle Point Optimization with Approximate Minimization Oracle [8.680676599607125]
A major approach to saddle point optimization $\min_x \max_y f(x, y)$ is the gradient-based approach popularized by generative adversarial networks (GANs).
In contrast, we analyze an alternative approach relying only on an oracle that solves a minimization problem approximately.
Our approach locates approximate solutions $x'$ and $y'$ to $\min_{x'} f(x', y)$ and $\max_{y'} f(x, y')$ at a given point $(x, y)$ and updates $(x, y)$ toward these approximate solutions $(x', y')$; a toy sketch of this update pattern appears after this list.
arXiv Detail & Related papers (2021-03-29T23:03:24Z) - Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$
Geometry [69.24618367447101]
Up to logarithmic factors, the optimal excess population loss of any $(\varepsilon,\delta)$-differentially private algorithm is $\sqrt{\log(d)/n} + \sqrt{d}/(\varepsilon n)$.
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $\sqrt{\log(d)/n} + (\log(d)/(\varepsilon n))^{2/3}$.
arXiv Detail & Related papers (2021-03-02T06:53:44Z) - Finding Global Minima via Kernel Approximations [90.42048080064849]
We consider the global minimization of smooth functions based solely on function evaluations.
In this paper, we consider an approach that jointly models the function to approximate and finds a global minimum.
arXiv Detail & Related papers (2020-12-22T12:59:30Z) - The Complexity of Constrained Min-Max Optimization [29.57458485068705]
We show that an approximate local min-max point of large enough approximation is guaranteed to exist.
More importantly, we show that finding an approximate fixed point of Gradient Descent/Ascent is PPAD-complete.
Our result is the first to show an exponential separation between these two fundamental optimization problems.
arXiv Detail & Related papers (2020-09-21T05:54:12Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimensions of $\frac{1}{\lambda\epsilon}$, one can obtain streaming algorithms with space polynomially smaller than $\frac{1}{\lambda\epsilon}$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z) - Maximizing Determinants under Matroid Constraints [69.25768526213689]
We study the problem of finding a basis $S$ of $M$ such that $\det(\sum_{i \in S} v_i v_i^\top)$ is maximized.
This problem appears in a diverse set of areas such as experimental design, fair allocation of goods, network design, and machine learning.
arXiv Detail & Related papers (2020-04-16T19:16:38Z) - Stochastic Recursive Gradient Descent Ascent for Stochastic
Nonconvex-Strongly-Concave Minimax Problems [36.645753881826955]
In this paper, we propose a novel method called Stochastic Recursive Gradient Descent Ascent (SREDA), which estimates gradients more efficiently using variance reduction.
This method achieves the best known complexity for this problem.
arXiv Detail & Related papers (2020-01-11T09:05:03Z)
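The oracle-based update described in the "Saddle Point Optimization with Approximate Minimization Oracle" entry above can be illustrated in a few lines. The following is a toy sketch under stated assumptions: a hypothetical bilinear-quadratic objective, inner gradient loops standing in for the approximate minimization oracle, and an arbitrary learning rate eta. It is not the cited paper's algorithm, oracle, or step-size rule.

    def f2(x, y):
        # Hypothetical smooth objective with a saddle point at (0, 0).
        return 0.5 * x ** 2 + x * y - 0.5 * y ** 2

    def approx_argmin_x(x0, y, step=0.1, iters=100):
        # Crude stand-in oracle for min_{x'} f2(x', y) via gradient descent in x.
        x = x0
        for _ in range(iters):
            x -= step * (x + y)          # d f2 / d x = x + y
        return x

    def approx_argmax_y(x, y0, step=0.1, iters=100):
        # Crude stand-in oracle for max_{y'} f2(x, y') via gradient ascent in y.
        y = y0
        for _ in range(iters):
            y += step * (x - y)          # d f2 / d y = x - y
        return y

    # Move the current point a fraction eta toward the oracle's answers.
    x, y, eta = 2.0, -1.5, 0.3
    for _ in range(60):
        x_new, y_new = approx_argmin_x(x, y), approx_argmax_y(x, y)
        x += eta * (x_new - x)
        y += eta * (y_new - y)
    print((x, y))   # approaches the saddle point (0, 0)

For this toy objective the iterate spirals into the saddle point; the cited paper analyzes when and how fast such oracle-driven updates converge.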
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.