Min-Max Optimization under Delays
- URL: http://arxiv.org/abs/2307.06886v3
- Date: Fri, 25 Aug 2023 03:14:32 GMT
- Title: Min-Max Optimization under Delays
- Authors: Arman Adibi, Aritra Mitra, and Hamed Hassani
- Abstract summary: Delays and asynchrony are inevitable in large-scale machine-learning problems.
No analogous theory is available for min-max optimization.
We show that even small delays can cause prominent algorithms like Extra-gradient to diverge.
- Score: 26.830212508878162
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Delays and asynchrony are inevitable in large-scale machine-learning problems
where communication plays a key role. As such, several works have extensively
analyzed stochastic optimization with delayed gradients. However, as far as we
are aware, no analogous theory is available for min-max optimization, a topic
that has gained recent popularity due to applications in adversarial
robustness, game theory, and reinforcement learning. Motivated by this gap, we
examine the performance of standard min-max optimization algorithms with
delayed gradient updates. First, we show (empirically) that even small delays
can cause prominent algorithms like Extra-gradient (\texttt{EG}) to diverge on
simple instances for which \texttt{EG} guarantees convergence in the absence of
delays. Our empirical study thus suggests the need for a careful analysis of
delayed versions of min-max optimization algorithms. Accordingly, under
suitable technical assumptions, we prove that Gradient Descent-Ascent
(\texttt{GDA}) and \texttt{EG} with delayed updates continue to guarantee
convergence to saddle points for convex-concave and strongly convex-strongly
concave settings. Our complexity bounds reveal, in a transparent manner, the
slow-down in convergence caused by delays.
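To make the delayed-update setting concrete, the following is a minimal Python sketch (not the authors' code or experimental setup) of Gradient Descent-Ascent and Extra-gradient under a simple fixed-delay model, run on an illustrative strongly convex-strongly concave quadratic; the objective, step size, and delay values are assumptions chosen purely for illustration.

    import numpy as np

    # Illustrative strongly convex-strongly concave objective (an assumption,
    # not taken from the paper): f(x, y) = 0.5*x**2 + x*y - 0.5*y**2,
    # whose unique saddle point is (0, 0).
    def grad_x(x, y): return x + y   # df/dx
    def grad_y(x, y): return x - y   # df/dy

    def delayed_gda(x0, y0, eta=0.1, delay=0, steps=500):
        """GDA where the gradients used at step t are evaluated at the
        iterate from step t - delay (one simple fixed-delay model)."""
        xs, ys = [x0], [y0]
        x, y = x0, y0
        for t in range(steps):
            xd, yd = xs[max(t - delay, 0)], ys[max(t - delay, 0)]  # stale iterate
            x, y = x - eta * grad_x(xd, yd), y + eta * grad_y(xd, yd)
            xs.append(x); ys.append(y)
        return np.hypot(x, y)  # distance to the saddle point (0, 0)

    def delayed_eg(x0, y0, eta=0.1, delay=0, steps=500):
        """Extra-gradient whose extrapolation step uses a gradient that is
        `delay` iterations old; the look-ahead gradient is taken without
        extra staleness (a modeling simplification)."""
        xs, ys = [x0], [y0]
        x, y = x0, y0
        for t in range(steps):
            xd, yd = xs[max(t - delay, 0)], ys[max(t - delay, 0)]  # stale iterate
            xh, yh = x - eta * grad_x(xd, yd), y + eta * grad_y(xd, yd)  # look-ahead
            x, y = x - eta * grad_x(xh, yh), y + eta * grad_y(xh, yh)
            xs.append(x); ys.append(y)
        return np.hypot(x, y)

    if __name__ == "__main__":
        for d in (0, 2, 5):
            print(f"delay={d}: GDA dist={delayed_gda(1.0, 1.0, delay=d):.2e}, "
                  f"EG dist={delayed_eg(1.0, 1.0, delay=d):.2e}")

Varying the delay and step size in this sketch is a quick way to probe the sensitivity the abstract describes; the paper's formal results quantify the corresponding slow-down under its own delay model and assumptions.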
Related papers
- Stochastic Approximation with Delayed Updates: Finite-Time Rates under Markovian Sampling [73.5602474095954]
We study the non-asymptotic performance of stochastic approximation schemes with delayed updates under Markovian sampling.
Our theoretical findings shed light on the finite-time effects of delays for a broad class of algorithms.
arXiv Detail & Related papers (2024-02-19T03:08:02Z) - Ordering for Non-Replacement SGD [7.11967773739707]
We seek to find an ordering that can improve the convergence rates for the non-replacement form of the algorithm.
We develop optimal orderings for constant and decreasing step sizes for strongly convex and convex functions.
In addition, we are able to combine the ordering with mini-batch and further apply it to more complex neural networks.
arXiv Detail & Related papers (2023-06-28T00:46:58Z) - Accelerated First-Order Optimization under Nonlinear Constraints [73.2273449996098]
We exploit analogies between first-order algorithms for constrained optimization and non-smooth dynamical systems to design a new class of accelerated first-order algorithms.
An important property of these algorithms is that constraints are expressed in terms of velocities instead of positions.
arXiv Detail & Related papers (2023-02-01T08:50:48Z) - Accelerated Algorithms for Constrained Nonconvex-Nonconcave Min-Max Optimization and Comonotone Inclusion [9.551565016483833]
We study comonotone min-max optimization, a structured class of nonconvex-nonconcave min-max optimization problems.
In our first contribution, we extend the Extra Anchored Gradient (EAG) algorithm to constrained comonotone min-max optimization.
In our second contribution, we extend the Fast Extra Gradient (FEG) algorithm to constrained comonotone min-max optimization.
arXiv Detail & Related papers (2022-06-10T17:44:06Z) - Convergence Rates of Two-Time-Scale Gradient Descent-Ascent Dynamics for
Solving Nonconvex Min-Max Problems [2.0305676256390934]
We characterize the finite-time performance of the continuous-time variant of the simultaneous gradient descent-ascent algorithm.
Our results on the behavior of the continuous-time algorithm may be used to enhance the convergence properties of its discrete-time counterpart.
arXiv Detail & Related papers (2021-12-17T15:51:04Z) - Distributed stochastic optimization with large delays [59.95552973784946]
One of the most widely used methods for solving large-scale optimization problems is distributed asynchronous stochastic gradient descent (DASGD).
We show that DASGD converges to a global optimum under suitable assumptions on the delays.
arXiv Detail & Related papers (2021-07-06T21:59:49Z) - Fast Objective & Duality Gap Convergence for Non-Convex Strongly-Concave
Min-Max Problems with PL Condition [52.08417569774822]
This paper focuses on methods for solving smooth non-convex strongly-concave min-max problems, which have received increasing attention due to applications in deep learning (e.g., deep AUC maximization).
arXiv Detail & Related papers (2020-06-12T00:32:21Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove the $\tilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope.
Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly unbounded optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z) - Optimal Epoch Stochastic Gradient Descent Ascent Methods for Min-Max
Optimization [61.66927778694731]
Epoch gradient descent method (a.k.a. Epoch-GD) proposed by Hazan and Kale (2011) was deemed a breakthrough for stochastic strongly convex minimization.
Our result is the first to show that Epoch-GDA can achieve the optimal rate of $O(1/T)$ for the duality gap of strongly-convex strongly-concave (SCSC) min-max problems.
arXiv Detail & Related papers (2020-02-13T02:16:57Z)