Related papers: Linear Convergence in Games with Delayed Feedback via Extra Prediction

Linear Convergence in Games with Delayed Feedback via Extra Prediction

URL: http://arxiv.org/abs/2602.17486v1
Date: Thu, 19 Feb 2026 15:56:27 GMT
Title: Linear Convergence in Games with Delayed Feedback via Extra Prediction
Authors: Yuma Fujimoto, Kenshi Abe, Kaito Ariu,
Abstract summary: This paper derives the rate of linear convergence of Weighted Optimistic Gradient Descent-Ascent (WOGDA)<n>We analyze it as an approximation of the Extra Proximal Point (EPP), which is updated based on farther future rewards than the classical Proximal Point (PP)
Score: 26.93300099029726
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Feedback delays are inevitable in real-world multi-agent learning. They are known to severely degrade performance, and the convergence rate under delayed feedback is still unclear, even for bilinear games. This paper derives the rate of linear convergence of Weighted Optimistic Gradient Descent-Ascent (WOGDA), which predicts future rewards with extra optimism, in unconstrained bilinear games. To analyze the algorithm, we interpret it as an approximation of the Extra Proximal Point (EPP), which is updated based on farther future rewards than the classical Proximal Point (PP). Our theorems show that standard optimism (predicting the next-step reward) achieves linear convergence to the equilibrium at a rate $\exp(-Θ(t/m^{5}))$ after $t$ iterations for delay $m$. Moreover, employing extra optimism (predicting farther future reward) tolerates a larger step size and significantly accelerates the rate to $\exp(-Θ(t/(m^{2}\log m)))$. Our experiments also show accelerated convergence driven by the extra optimism and are qualitatively consistent with our theorems. In summary, this paper validates that extra optimism is a promising countermeasure against performance degradation caused by feedback delays.

Related papers

Finite-time Convergence Analysis of Actor-Critic with Evolving Reward [33.907497292192225]
This paper provides the first finite-time convergence analysis of a single-timescale actor-critic algorithm in the presence of an evolving reward function.<n>As a secondary contribution, we introduce a novel analysis of distribution mismatch under Markovian sampling, improving the best-known rate by a factor of $log2T$ in the static-reward case.
arXiv Detail & Related papers (2025-10-14T09:45:19Z)
Stochastic Approximation with Delayed Updates: Finite-Time Rates under Markovian Sampling [73.5602474095954]
We study the non-asymptotic performance of approximation schemes with delayed updates under Markovian sampling. Our theoretical findings shed light on the finite-time effects of delays for a broad class of algorithms.
arXiv Detail & Related papers (2024-02-19T03:08:02Z)
Min-Max Optimization under Delays [26.830212508878162]
Delays and asynchrony are inevitable in large-scale machine-learning problems. No analogous theory is available for min-max optimization. We show that even small delays can cause prominent algorithms like Extra-gradient to diverge.
arXiv Detail & Related papers (2023-07-13T16:39:01Z)
Non-stationary Delayed Online Convex Optimization: From Full-information to Bandit Setting [71.82716109461967]
We propose an algorithm called Mild-OGD for the full-information case, where delayed gradients are available.<n>We show that the dynamic regret of Mild-OGD can be automatically bounded by $O(sqrtbardT(P_T+1))$ under the in-order assumption.<n>We also develop a bandit variant of Mild-OGD for a more challenging case with only delayed loss values.
arXiv Detail & Related papers (2023-05-20T07:54:07Z)
Asynchronous Gradient Play in Zero-Sum Multi-agent Games [25.690033495071923]
We study asynchronous gradient plays in zero-sum polymatrix games under delayed feedbacks. To the best of our knowledge, this work is the first that aims to understand asynchronous gradient play in zero-sum polymatrix games.
arXiv Detail & Related papers (2022-11-16T15:37:23Z)
Delayed Feedback in Generalised Linear Bandits Revisited [5.349852254138085]
We study the phenomenon of delayed rewards in generalised linear bandits in a theoretical manner. We show that a natural adaptation of an optimistic algorithm to the delayed feedback achieves a regret bound where the penalty for the delays is independent of the horizon.
arXiv Detail & Related papers (2022-07-21T23:35:01Z)
Lassoed Tree Boosting [53.56229983630983]
We prove that a gradient boosted tree algorithm with early stopping faster than $n-1/4$ L2 convergence in the large nonparametric space of cadlag functions of bounded sectional variation. Our convergence proofs are based on a novel, general theorem on early stopping with empirical loss minimizers of nested Donsker classes.
arXiv Detail & Related papers (2022-05-22T00:34:41Z)
Distributed stochastic optimization with large delays [59.95552973784946]
One of the most widely used methods for solving large-scale optimization problems is distributed asynchronous gradient descent (DASGD) We show that DASGD converges to a global optimal implementation model under same delay assumptions.
arXiv Detail & Related papers (2021-07-06T21:59:49Z)
High-probability Bounds for Non-Convex Stochastic Optimization with Heavy Tails [55.561406656549686]
We consider non- Hilbert optimization using first-order algorithms for which the gradient estimates may have tails. We show that a combination of gradient, momentum, and normalized gradient descent convergence to critical points in high-probability with best-known iteration for smooth losses.
arXiv Detail & Related papers (2021-06-28T00:17:01Z)
Online Strongly Convex Optimization with Unknown Delays [30.931538196386672]
We investigate the problem of online convex optimization with unknown delays. We first extend the delayed variant of OGD for strongly convex functions. We establish a better regret bound of $O(dlog T)$, where $d$ is the maximum delay.
arXiv Detail & Related papers (2021-03-21T10:16:15Z)
Linear Last-iterate Convergence in Constrained Saddle-point Optimization [48.44657553192801]
We significantly expand the understanding of last-rate uniqueness for Optimistic Gradient Descent Ascent (OGDA) and Optimistic Multiplicative Weights Update (OMWU) We show that when the equilibrium is unique, linear lastiterate convergence is achieved with a learning rate whose value is set to a universal constant. We show that bilinear games over any polytope satisfy this condition and OGDA converges exponentially fast even without the unique equilibrium assumption.
arXiv Detail & Related papers (2020-06-16T20:53:04Z)

This list is automatically generated from the titles and abstracts of the papers in this site.