Zeroth-Order Algorithms for Smooth Saddle-Point Problems
- URL: http://arxiv.org/abs/2009.09908v2
- Date: Sat, 27 Feb 2021 19:13:00 GMT
- Title: Zeroth-Order Algorithms for Smooth Saddle-Point Problems
- Authors: Abdurakhmon Sadiev, Aleksandr Beznosikov, Pavel Dvurechensky,
Alexander Gasnikov
- Abstract summary: We propose several algorithms to solve saddle-point problems using zeroth-order oracles.
Our analysis shows that our convergence rate for the stochastic term is only a $\log n$ factor worse than for first-order methods.
We also consider a mixed setup and develop 1/2th-order methods that use a zeroth-order oracle for the minimization part and a first-order oracle for the maximization part.
- Score: 117.44028458220427
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Saddle-point problems have recently gained increased attention from the
machine learning community, mainly due to applications in training Generative
Adversarial Networks using stochastic gradients. At the same time, in some
applications only a zeroth-order oracle is available. In this paper, we propose
several algorithms to solve stochastic smooth (strongly) convex-concave
saddle-point problems using zeroth-order oracles and estimate their convergence
rate and its dependence on the dimension $n$ of the variable. In particular,
our analysis shows that in the case when the feasible set is a direct product
of two simplices, our convergence rate for the stochastic term is only by a
$\log n$ factor worse than for the first-order methods. We also consider a
mixed setup and develop 1/2th-order methods that use zeroth-order oracle for
the minimization part and first-order oracle for the maximization part.
Finally, we demonstrate the practical performance of our zeroth-order and
1/2th-order methods on practical problems.
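As an illustration of the setup described in the abstract, the following sketch runs gradient descent-ascent on a small strongly-convex-strongly-concave quadratic, replacing each gradient with the standard two-point random-direction zeroth-order estimator. This is not the paper's actual algorithm or smoothing scheme; the objective, step size, and smoothing radius are illustrative assumptions.

```python
import numpy as np

def zo_grad(f, z, rng, mu=1e-4):
    """Two-point zeroth-order gradient estimate of f at z.

    Samples a random unit direction e and returns
    n * (f(z + mu*e) - f(z - mu*e)) / (2*mu) * e, an unbiased
    estimate of the gradient of a smoothed version of f.
    """
    e = rng.standard_normal(z.size)
    e /= np.linalg.norm(e)
    return z.size * (f(z + mu * e) - f(z - mu * e)) / (2.0 * mu) * e

# Toy saddle-point problem: min_x max_y  x'Ay + 0.5|x|^2 - 0.5|y|^2,
# strongly convex in x and strongly concave in y; saddle point at 0.
A = np.array([[0.5, 0.2], [0.1, 0.3]])
f = lambda x, y: x @ A @ y + 0.5 * x @ x - 0.5 * y @ y

rng = np.random.default_rng(0)
x, y = np.ones(2), np.ones(2)
step = 0.05
for _ in range(3000):
    gx = zo_grad(lambda u: f(u, y), x, rng)  # estimate of grad_x f
    gy = zo_grad(lambda v: f(x, v), y, rng)  # estimate of grad_y f
    x = x - step * gx                        # descend in x
    y = y + step * gy                        # ascend in y
# (x, y) ends up close to the saddle point at the origin
```

The $\log n$ overhead discussed above comes from how such random-direction estimates concentrate over the feasible set; on a product of simplices the paper's methods pay only that factor over first-order mirror-descent-type schemes.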
Related papers
- Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity [59.75300530380427]
We consider the problem of optimizing a second-order smooth and strongly convex function where the algorithm only has access to noisy evaluations of the objective it queries.
We provide the first tight characterization for the rate of the minimax simple regret by developing matching upper and lower bounds.
arXiv Detail & Related papers (2024-06-28T02:56:22Z)
- Strictly Low Rank Constraint Optimization -- An Asymptotically $\mathcal{O}(\frac{1}{t^2})$ Method [5.770309971945476]
We consider a class of nonconvex and nonsmooth problems with rank regularization to promote sparsity in the optimal solution.
We show that our algorithms achieve an asymptotic convergence rate of $O(\frac{1}{t^2})$, which is exactly the same as Nesterov's optimal convergence rate for first-order methods on smooth convex problems.
arXiv Detail & Related papers (2023-07-04T16:55:41Z)
- Near-Optimal Nonconvex-Strongly-Convex Bilevel Optimization with Fully First-Order Oracles [14.697733592222658]
We show that a first-order method can converge at the near-optimal $\tilde{\mathcal{O}}(\epsilon^{-2})$ rate of second-order methods.
Our analysis further leads to simple first-order algorithms that achieve similar convergence rates for finding second-order stationary points.
arXiv Detail & Related papers (2023-06-26T17:07:54Z)
- Extra-Newton: A First Approach to Noise-Adaptive Accelerated Second-Order Methods [57.050204432302195]
This work proposes a universal and adaptive second-order method for minimizing second-order smooth, convex functions.
Our algorithm achieves $O(\sigma/\sqrt{T})$ convergence when the oracle feedback is stochastic with variance $\sigma^2$, and improves its convergence to $O(1/T^3)$ with deterministic oracles.
arXiv Detail & Related papers (2022-11-03T14:12:51Z)
- Explicit Second-Order Min-Max Optimization Methods with Optimal Convergence Guarantee [86.05440220344755]
We propose and analyze inexact regularized Newton-type methods for finding a global saddle point of convex-concave unconstrained min-max optimization problems.
We show that the proposed methods generate iterates that remain within a bounded set and that the iterates converge to an $\epsilon$-saddle point within $O(\epsilon^{-2/3})$ iterations in terms of a restricted gap function.
arXiv Detail & Related papers (2022-10-23T21:24:37Z)
- Zeroth-Order Negative Curvature Finding: Escaping Saddle Points without Gradients [22.153544816232042]
We consider escaping saddle points of nonconvex problems where only function evaluations can be accessed.
We propose two zeroth-order negative curvature finding frameworks.
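A minimal sketch of the idea behind such frameworks (not the paper's specific estimators; the probe count and step sizes here are assumptions): the curvature of $f$ along a direction $e$ can be probed with a second-order finite difference, and the most negative probe identifies an escape direction at a saddle point where the gradient vanishes.

```python
import numpy as np

def zo_curvature(f, z, e, mu=1e-3):
    # Second-order finite difference: approximates e' H(z) e
    # using three function evaluations, no gradients.
    return (f(z + mu * e) - 2.0 * f(z) + f(z - mu * e)) / mu**2

# f(x) = x0^2 - x1^2 has a saddle at the origin: the gradient is
# zero there, but the curvature along the x1 axis is negative (-2).
f = lambda x: x[0]**2 - x[1]**2
z = np.zeros(2)

rng = np.random.default_rng(0)
best_dir, best_curv = None, np.inf
for _ in range(50):                      # random probing directions
    e = rng.standard_normal(2)
    e /= np.linalg.norm(e)
    c = zo_curvature(f, z, e)
    if c < best_curv:
        best_curv, best_dir = c, e
# best_curv approaches -2, the most negative Hessian eigenvalue,
# and best_dir aligns with the corresponding escape direction.
```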
arXiv Detail & Related papers (2022-10-04T10:01:16Z)
- A Projection-free Algorithm for Constrained Stochastic Multi-level Composition Optimization [12.096252285460814]
We propose a projection-free conditional gradient-type algorithm for constrained stochastic multi-level composition optimization.
We show that the number of calls to the stochastic oracle and the linear-minimization oracle required by the proposed algorithm are of order $\mathcal{O}_T(\epsilon^{-2})$ and $\mathcal{O}_T(\epsilon^{-3})$, respectively.
arXiv Detail & Related papers (2022-02-09T06:05:38Z) - A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm
for Bilevel Optimization [112.59170319105971]
We propose a new algorithm -- the Momentum-assisted Single-Timescale Stochastic Approximation (MSTSA) algorithm -- for tackling bilevel optimization problems.
MSTSA allows us to control the error in the iterates caused by inexact solutions of the lower-level subproblem.
arXiv Detail & Related papers (2021-02-15T07:10:33Z) - Gradient-Free Methods for Saddle-Point Problem [125.99533416395765]
We generalize the approach of Gasnikov et al., 2017, which allows one to solve (stochastic) convex optimization problems with an inexact gradient-free oracle.
Our approach reduces the required number of oracle calls by a factor of $\frac{n}{\log n}$.
In the second part of the paper, we analyze the case when such an assumption cannot be made and propose a general approach on how to modify the method to solve this problem.
arXiv Detail & Related papers (2020-05-12T16:44:27Z) - Zeroth-Order Algorithms for Nonconvex Minimax Problems with Improved
Complexities [21.13934071954103]
We present a deterministic algorithm for minimax problems that are nonconvex in one variable and strongly concave in the other.
We show that under the SGC assumption, the complexities of the algorithms match those of existing algorithms.
Results are presented in terms of the oracle complexity of the proposed ZO-GDMSA method, and numerical experiments are presented to support the theoretical results.
arXiv Detail & Related papers (2020-01-22T00:05:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.