Follow the Perturbed Leader: Optimism and Fast Parallel Algorithms for
Smooth Minimax Games
- URL: http://arxiv.org/abs/2006.07541v1
- Date: Sat, 13 Jun 2020 02:55:41 GMT
- Title: Follow the Perturbed Leader: Optimism and Fast Parallel Algorithms for
Smooth Minimax Games
- Authors: Arun Sai Suggala, Praneeth Netrapalli
- Abstract summary: We consider the problem of online learning and its application to solving minimax games.
For the online learning problem, Follow the Perturbed Leader (FTPL) is a widely studied algorithm; a simple optimistic modification achieves better regret on predictable loss sequences and yields fast parallel algorithms for smooth minimax games using an oracle which computes the (perturbed) best response.
- Score: 33.9383996530254
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of online learning and its application to solving
minimax games. For the online learning problem, Follow the Perturbed Leader
(FTPL) is a widely studied algorithm which enjoys the optimal $O(T^{1/2})$
worst-case regret guarantee for both convex and nonconvex losses. In this work,
we show that when the sequence of loss functions is predictable, a simple
modification of FTPL which incorporates optimism can achieve better regret
guarantees, while retaining the optimal worst-case regret guarantee for
unpredictable sequences. A key challenge in obtaining these tighter regret
bounds is the stochasticity and optimism in the algorithm, which requires
different analysis techniques than those commonly used in the analysis of FTPL.
The key ingredient we utilize in our analysis is the dual view of perturbation
as regularization. While our algorithm has several applications, we consider
the specific application of minimax games. For solving smooth convex-concave
games, our algorithm only requires access to a linear optimization oracle. For
Lipschitz and smooth nonconvex-nonconcave games, our algorithm requires access
to an optimization oracle which computes the perturbed best response. In both
these settings, our algorithm solves the game up to an accuracy of
$O(T^{-1/2})$ using $T$ calls to the optimization oracle. An important feature
of our algorithm is that it is highly parallelizable and requires only
$O(T^{1/2})$ iterations, with each iteration making $O(T^{1/2})$ parallel calls
to the optimization oracle.
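The parallel oracle-based structure described above can be sketched for the simplest setting: a bilinear zero-sum game over probability simplices, where the linear optimization oracle just returns a vertex. The following is a minimal illustrative sketch, not the paper's exact algorithm or analysis; the function name `optimistic_ftpl_bilinear`, the exponential perturbations and their scale `eta`, the iteration counts, and the choice of the previous payoff vector as the optimistic guess are all assumptions made for illustration.

```python
import numpy as np

def optimistic_ftpl_bilinear(A, iters=30, m=30, eta=1.0, seed=0):
    """Illustrative sketch of optimistic FTPL for the bilinear zero-sum game
    min_x max_y x^T A y over probability simplices.  Each iteration makes m
    parallel calls to a linear optimization oracle (over a simplex this just
    returns a vertex), applied to the perturbed cumulative payoff plus an
    optimistic guess that the next payoff vector repeats the previous one."""
    rng = np.random.default_rng(seed)
    n, k = A.shape
    cum_x, cum_y = np.zeros(n), np.zeros(k)      # cumulative payoff vectors seen so far
    guess_x, guess_y = np.zeros(n), np.zeros(k)  # optimistic predictions of the next payoff
    avg_x, avg_y = np.zeros(n), np.zeros(k)

    for _ in range(iters):
        # Min player: average m perturbed linear-optimization-oracle responses.
        pert = rng.exponential(scale=eta, size=(m, n))
        x_t = np.eye(n)[np.argmin(cum_x + guess_x - pert, axis=1)].mean(axis=0)
        # Max player: symmetric, with argmax and the perturbation added.
        pert = rng.exponential(scale=eta, size=(m, k))
        y_t = np.eye(k)[np.argmax(cum_y + guess_y + pert, axis=1)].mean(axis=0)

        g_x, g_y = A @ y_t, A.T @ x_t            # payoff vectors revealed this round
        cum_x += g_x
        cum_y += g_y
        guess_x, guess_y = g_x, g_y              # optimism: guess the last payoff repeats
        avg_x += x_t / iters
        avg_y += y_t / iters

    return avg_x, avg_y   # averaged strategies approximate an equilibrium
```

Setting `iters` and `m` both around $\sqrt{T}$ mirrors the parallel structure described above: roughly $T^{1/2}$ iterations, each issuing $T^{1/2}$ independent oracle calls that can run in parallel, for $T$ oracle calls in total.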
Related papers
- An Algorithm with Optimal Dimension-Dependence for Zero-Order Nonsmooth Nonconvex Stochastic Optimization [37.300102993926046]
We study the complexity of producing approximately stationary points of Lipschitz objectives which are possibly neither smooth nor convex, using only zero-order evaluations.
Our analysis is based on the simple yet powerful Goldstein-subdifferential set, which allows us to leverage recent advancements in nonsmooth nonconvex optimization.
arXiv Detail & Related papers (2023-07-10T11:56:04Z) - Doubly Optimal No-Regret Learning in Monotone Games [10.760195409078591]
We propose the first doubly optimal no-regret learning algorithm for smooth monotone games.
Our algorithm achieves both (i) the optimal $O(\sqrt{T})$ regret in the adversarial setting under smooth and convex loss functions and (ii) the optimal $O(\frac{1}{T})$ last-iterate convergence rate to a Nash equilibrium.
arXiv Detail & Related papers (2023-01-30T17:55:53Z) - Minimax Optimization with Smooth Algorithmic Adversaries [59.47122537182611]
We propose a new algorithm for the min-player against smooth algorithms deployed by an adversary.
Our algorithm is guaranteed to make monotonic progress (so it has no limit cycles) and to find an appropriate stationary point in a polynomial number of iterations.
arXiv Detail & Related papers (2021-06-02T22:03:36Z) - Almost Optimal Algorithms for Two-player Markov Games with Linear
Function Approximation [92.99933928528797]
We study reinforcement learning for two-player zero-sum Markov games with simultaneous moves.
We propose an algorithm, Nash-UCRL-VTR, based on the principle of "Optimism-in-Face-of-Uncertainty".
We show that Nash-UCRL-VTR can provably achieve an $\tilde{O}(dH\sqrt{T})$ regret, where $d$ is the linear function dimension.
arXiv Detail & Related papers (2021-02-15T09:09:16Z) - Quantum Algorithm for Online Convex Optimization [4.702729080310267]
We explore whether quantum advantages can be found for the zeroth-order online convex optimization problem.
In this paper, we present quantum algorithms for the problem and show for the first time that potential quantum advantages are possible.
arXiv Detail & Related papers (2020-07-29T18:34:05Z) - A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis
and Application to Actor-Critic [142.1492359556374]
Bilevel optimization is a class of problems which exhibit a two-level structure.
We propose a two-timescale approximation (TTSA) algorithm for tackling such a bilevel problem.
We show that a two-timescale natural actor-critic policy optimization algorithm can be viewed as a special case of our TTSA framework.
arXiv Detail & Related papers (2020-07-10T05:20:02Z) - Exploiting Higher Order Smoothness in Derivative-free Optimization and
Continuous Bandits [99.70167985955352]
We study the problem of zero-order optimization of a strongly convex function.
We consider a randomized approximation of the projected gradient descent algorithm (a generic sketch of this idea appears after this list).
Our results imply that the zero-order algorithm is nearly optimal in terms of sample complexity and the problem parameters.
arXiv Detail & Related papers (2020-06-14T10:42:23Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove the $\tilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope.
Our analysis works with a mini-batch size of $1$, constant first and second order moment parameters, and possibly unbounded optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z) - Private Stochastic Convex Optimization: Optimal Rates in Linear Time [74.47681868973598]
We study the problem of minimizing the population loss given i.i.d. samples from a distribution over convex loss functions.
A recent work of Bassily et al. has established the optimal bound on the excess population loss achievable given $n$ samples.
We describe two new techniques for deriving convex optimization algorithms, both achieving the optimal bound on excess loss and using $O(\min\{n, n^2/d\})$ gradient computations.
arXiv Detail & Related papers (2020-05-10T19:52:03Z)
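The randomized zero-order projected-gradient idea referenced in the derivative-free optimization entry above can be sketched generically. The following is an illustrative two-point zero-order gradient estimate plugged into projected gradient descent on a Euclidean ball; the function name `zero_order_pgd`, the step-size and smoothing-radius schedules, the ball constraint, and the example objective are assumptions, and the kernel-based estimator that exploits higher-order smoothness in the cited paper is not reproduced here.

```python
import numpy as np

def zero_order_pgd(f, x0, radius=1.0, T=2000, seed=0):
    """Illustrative sketch: projected gradient descent over a Euclidean ball,
    with the gradient replaced by a two-point finite-difference estimate
    along a random unit direction (no derivative information used)."""
    rng = np.random.default_rng(seed)
    x, d = x0.copy(), x0.size
    for t in range(1, T + 1):
        eta, h = radius / np.sqrt(t), 1.0 / t ** 0.25    # step size, smoothing radius
        z = rng.standard_normal(d)
        z /= np.linalg.norm(z)                           # random unit direction
        g = d * (f(x + h * z) - f(x - h * z)) / (2 * h) * z   # two-point zero-order estimate
        x = x - eta * g
        norm = np.linalg.norm(x)
        if norm > radius:                                # project back onto the ball
            x *= radius / norm
    return x

# Usage: minimize a strongly convex quadratic over the unit ball.
x_hat = zero_order_pgd(lambda v: np.sum((v - 0.3) ** 2), np.zeros(5))
```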