Projection Efficient Subgradient Method and Optimal Nonsmooth
Frank-Wolfe Method
- URL: http://arxiv.org/abs/2010.01848v1
- Date: Mon, 5 Oct 2020 08:16:56 GMT
- Title: Projection Efficient Subgradient Method and Optimal Nonsmooth
Frank-Wolfe Method
- Authors: Kiran Koshy Thekumparampil, Prateek Jain, Praneeth Netrapalli, Sewoong
Oh
- Abstract summary: We find a feasible $\epsilon$-suboptimal solution using only $O(\epsilon^{-1})$ PO calls and optimal $O(\epsilon^{-2})$ FO calls.
Our experiments confirm that these methods achieve significant speedups over the state-of-the-art, for a problem with costly PO and LMO calls.
- Score: 54.93433440034386
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the classical setting of optimizing a nonsmooth Lipschitz
continuous convex function over a convex constraint set, when having access to
a (stochastic) first-order oracle (FO) for the function and a projection oracle
(PO) for the constraint set. It is well known that to achieve
$\epsilon$-suboptimality in high dimensions, $\Theta(\epsilon^{-2})$ FO calls
are necessary. This is achieved by the projected subgradient method (PGD).
However, PGD also entails $O(\epsilon^{-2})$ PO calls, which may be
computationally costlier than FO calls (e.g., for nuclear norm constraints).
Improving the PO call complexity of PGD is largely unexplored, despite the
fundamental nature of this problem and the extensive literature on it. We
present the first such improvement. It requires only the mild assumption that the objective
function, when extended to a slightly larger neighborhood of the constraint
set, still remains Lipschitz and accessible via FO. In particular, we introduce
MOPES method, which carefully combines Moreau-Yosida smoothing and accelerated
first-order schemes. This is guaranteed to find a feasible
$\epsilon$-suboptimal solution using only $O(\epsilon^{-1})$ PO calls and
optimal $O(\epsilon^{-2})$ FO calls. Further, if instead of a PO we only have
access to a linear minimization oracle (LMO, à la Frank-Wolfe) for the
constraint set, an extension of our method, MOLES, finds a feasible $\epsilon$-suboptimal
solution using $O(\epsilon^{-2})$ LMO calls and FO calls---both match known
lower bounds, resolving a question left open since White (1993). Our
experiments confirm that these methods achieve significant speedups over the
state-of-the-art, for a problem with costly PO and LMO calls.
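To make the oracle-complexity tradeoff concrete, here is a minimal sketch of the classical projected subgradient baseline (PGD) that the abstract compares against: it spends one FO call and one PO call per iteration, so both complexities scale as $O(\epsilon^{-2})$. This is not the paper's MOPES method (which combines Moreau-Yosida smoothing with acceleration to cut PO calls to $O(\epsilon^{-1})$); the oracle names and the toy problem below are illustrative assumptions.
```python
import numpy as np

def projected_subgradient(fo, po, x0, T, R=1.0, G=1.0):
    """Classical PGD baseline: one FO call and one PO call per iteration,
    so PO complexity is O(eps^-2), which MOPES improves to O(eps^-1)."""
    x = po(x0)
    x_avg = np.array(x, dtype=float)
    for t in range(1, T + 1):
        eta = R / (G * np.sqrt(t))       # standard step size for G-Lipschitz f
        x = po(x - eta * fo(x))          # FO call, then PO call
        x_avg += (x - x_avg) / (t + 1)   # running average of iterates
    return x_avg

# Toy instance: minimize the nonsmooth f(x) = ||Ax - b||_1 over the unit ball.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((20, 5)), rng.standard_normal(20)
fo = lambda x: A.T @ np.sign(A @ x - b)          # a subgradient of f
po = lambda x: x / max(1.0, np.linalg.norm(x))   # Euclidean-ball projection
x_hat = projected_subgradient(fo, po, np.zeros(5), T=2000)
```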
Related papers
- First-Order Methods for Linearly Constrained Bilevel Optimization [38.19659447295665]
We present first-order methods for linearly constrained bilevel optimization that avoid costly Hessian computations.
For linear inequality constraints, we attain $(\delta,\epsilon)$-Goldstein stationarity in $\widetilde{O}(d\delta^{-1}\epsilon^{-3})$ gradient oracle calls.
arXiv Detail & Related papers (2024-06-18T16:41:21Z) - An Oblivious Stochastic Composite Optimization Algorithm for Eigenvalue
Optimization Problems [76.2042837251496]
We introduce two oblivious mirror descent algorithms based on a complementary composite setting.
Remarkably, both algorithms work without prior knowledge of the Lipschitz constant or smoothness of the objective function.
We show how to extend our framework to scale, and demonstrate the efficiency and robustness of our methods on large-scale semidefinite programs.
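For readers unfamiliar with mirror descent, the following is a textbook entropic mirror descent update on the probability simplex. It is a generic sketch for orientation only, not the paper's oblivious algorithms (whose point is precisely to avoid needing the Lipschitz or smoothness constants that a fixed step-size schedule presumes); the cost vector and step schedule are assumed for illustration.
```python
import numpy as np

def entropic_mirror_descent(grad, x0, step, T):
    """Textbook mirror descent on the probability simplex with the entropy
    mirror map (multiplicative-weights update); a generic sketch only."""
    x = x0 / x0.sum()
    for t in range(1, T + 1):
        g = grad(x)
        x = x * np.exp(-step(t) * g)  # mirror step in the dual (log) space
        x /= x.sum()                  # map back onto the simplex
    return x

# Toy example: minimize the linear cost <c, x> over the simplex.
c = np.array([3.0, 1.0, 2.0])
x = entropic_mirror_descent(lambda x: c, np.ones(3),
                            step=lambda t: 0.5 / np.sqrt(t), T=500)
# x concentrates on the cheapest coordinate (index 1).
```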
arXiv Detail & Related papers (2023-06-30T08:34:29Z) - Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best policy identification in discounted linear Markov Decision Processes (MDPs) in the fixed-confidence setting under a generative model.
The lower bound, obtained as the solution of an intricate non-convex optimization program, can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z) - Perseus: A Simple and Optimal High-Order Method for Variational
Inequalities [81.32967242727152]
A VI involves finding $x^\star \in \mathcal{X}$ such that $\langle F(x), x - x^\star \rangle \geq 0$ for all $x \in \mathcal{X}$.
We propose a $p$th-order method that does not require any line search procedure and provably converges to a weak solution at a rate of $O(\epsilon^{-2/(p+1)})$.
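As a concrete reference point for the VI formulation above, the classical first-order ($p=1$) scheme for monotone VIs is the extragradient method; the sketch below illustrates the setup only and is not Perseus itself (which is $p$th-order and line-search-free). The toy operator and box constraint are assumptions for the example.
```python
import numpy as np

def extragradient(F, project, x0, eta, T):
    """Classical extragradient for a monotone Lipschitz operator F;
    the p = 1 reference point for the VI setting, not Perseus itself."""
    x = project(x0)
    for _ in range(T):
        x_half = project(x - eta * F(x))   # extrapolation (lookahead) step
        x = project(x - eta * F(x_half))   # update using the lookahead field
    return x

# Toy monotone VI: F(x) = Mx with skew-symmetric M, over a box constraint.
M = np.array([[0.0, 1.0], [-1.0, 0.0]])
F = lambda x: M @ x
project = lambda x: np.clip(x, -1.0, 1.0)
x_star = extragradient(F, project, np.array([0.9, -0.7]), eta=0.2, T=500)
# x_star approaches the solution x* = 0, where <F(x), x - x*> >= 0 holds.
```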
arXiv Detail & Related papers (2022-05-06T13:29:14Z) - Generalized Optimistic Methods for Convex-Concave Saddle Point Problems [24.5327016306566]
The optimistic method has seen increasing popularity for solving convex-concave saddle point problems.
We develop a backtracking line search scheme to select the step sizes without knowledge of the problem's smoothness coefficients.
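The general idea of backtracking line search can be sketched in a few lines: shrink a trial step until a sufficient-decrease test passes, so no smoothness constant has to be known up front. The Armijo-style test below is a generic illustration, not the authors' step-size rule for the optimistic method; all parameter values are assumed defaults.
```python
import numpy as np

def armijo_step(f, grad_f, x, eta0=1.0, beta=0.5, c=1e-4):
    """Generic Armijo backtracking: halve the trial step size until the
    sufficient-decrease condition holds; no smoothness constant needed."""
    g = grad_f(x)
    eta = eta0
    while f(x - eta * g) > f(x) - c * eta * (g @ g):
        eta *= beta                        # shrink and retry
    return x - eta * g, eta

# Toy usage on f(x) = ||x||^2 / 2, whose gradient is x.
x_new, eta = armijo_step(lambda x: 0.5 * x @ x, lambda x: x,
                         np.array([2.0, -1.0]))
```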
arXiv Detail & Related papers (2022-02-19T20:31:05Z) - Lifted Primal-Dual Method for Bilinearly Coupled Smooth Minimax
Optimization [47.27237492375659]
We study the bilinearly coupled minimax problem: $\min_x \max_y f(x) + y^\top A x - h(y)$, where $f$ and $h$ are both strongly convex smooth functions.
No known first-order algorithms have hitherto achieved the lower complexity bound of $\Omega\left(\sqrt{\frac{L_x}{\mu_x}} + \frac{\|A\|}{\sqrt{\mu_x\,\mu_y}}\right) \log\left(\frac{1}{\varepsilon}\right)$.
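For concreteness, this problem class can be attacked with naive simultaneous gradient descent-ascent; the sketch below only sets up the objective's gradients and is a baseline illustration under assumed toy choices of $f$, $h$, and $A$, not the paper's lifted primal-dual method (which attains the accelerated rate).
```python
import numpy as np

def gda_bilinear(grad_f, grad_h, A, x0, y0, eta, T):
    """Naive simultaneous gradient descent-ascent on
    min_x max_y f(x) + y^T A x - h(y); a baseline sketch, not the
    paper's lifted primal-dual method."""
    x, y = x0.astype(float), y0.astype(float)
    for _ in range(T):
        gx = grad_f(x) + A.T @ y   # gradient in x of f(x) + y^T A x
        gy = A @ x - grad_h(y)     # gradient in y of y^T A x - h(y)
        x, y = x - eta * gx, y + eta * gy
    return x, y

# Toy strongly-convex-strongly-concave instance: f(x)=||x||^2/2, h(y)=||y||^2/2.
rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
x, y = gda_bilinear(lambda x: x, lambda y: y, A,
                    np.ones(3), np.ones(3), eta=0.05, T=5000)
# (x, y) converges to the unique saddle point at the origin.
```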
arXiv Detail & Related papers (2022-01-19T05:56:19Z) - Near-Optimal Lower Bounds For Convex Optimization For All Orders of
Smoothness [26.71898403195793]
We study the complexity of optimizing highly smooth convex functions.
For a positive integer $p$, we want to find an $\epsilon$-approximate minimum of a convex function $f$ whose $p$th derivative is Lipschitz.
We prove a new lower bound that matches this bound (up to log factors), and holds not only for randomized algorithms, but also for quantum algorithms.
arXiv Detail & Related papers (2021-12-02T10:51:43Z) - Thinking Inside the Ball: Near-Optimal Minimization of the Maximal Loss [41.17536985461902]
We prove an oracle complexity lower bound scaling as $\Omega(N\epsilon^{-2/3})$, showing that our dependence on $N$ is optimal up to polylogarithmic factors.
We develop methods with improved complexity bounds of $\tilde{O}(N\epsilon^{-2/3} + \sqrt{N}\epsilon^{-8/3})$ in the non-smooth case and $\tilde{O}(N\epsilon^{-2/3} + \sqrt{N}\epsilon^{-1})$ in the smooth case.
arXiv Detail & Related papers (2021-05-04T21:49:15Z) - Second-Order Information in Non-Convex Stochastic Optimization: Power
and Limitations [54.42518331209581]
We design an algorithm which finds an $\epsilon$-approximate stationary point (with $\|\nabla F(x)\| \le \epsilon$) using stochastic gradient and Hessian-vector products, and give guarantees for finding $(\epsilon,\gamma)$-approximate second-order stationary points.
Our lower bounds here are novel even in the noiseless case.
arXiv Detail & Related papers (2020-06-24T04:41:43Z) - Safe Learning under Uncertain Objectives and Constraints [66.05180398174286]
We consider non-convex optimization problems under unknown yet safety-critical constraints.
Such problems naturally arise in a variety of domains including robotics, manufacturing, and medical procedures.
A crucial component of our analysis is to introduce and apply a technique called shrinkage in the context of safe optimization.
arXiv Detail & Related papers (2020-06-23T20:51:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the automatically generated content (including all information) and is not responsible for any consequences of its use.