Continuous-Time Analysis of Heavy Ball Momentum in Min-Max Games
- URL: http://arxiv.org/abs/2505.19537v1
- Date: Mon, 26 May 2025 05:55:16 GMT
- Title: Continuous-Time Analysis of Heavy Ball Momentum in Min-Max Games
- Authors: Yi Feng, Kaito Fujii, Stratis Skoulakis, Xiao Wang, Volkan Cevher,
- Abstract summary: We present a continuous-time analysis for HB with simultaneous and alternating update schemes in min-max games. We prove that smaller momentum enhances algorithmic stability by enabling local convergence across a wider range of step sizes. Globally, we study the implicit regularization of HB and find that smaller momentum guides algorithm trajectories towards shallower-slope regions of the loss landscapes.
- Score: 60.92010446485464
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Since Polyak's pioneering work, heavy ball (HB) momentum has been widely studied in minimization. However, its role in min-max games remains largely unexplored. Since HB is a key component of practical min-max algorithms like Adam, this gap limits their effectiveness. In this paper, we present a continuous-time analysis for HB with simultaneous and alternating update schemes in min-max games. Locally, we prove that smaller momentum enhances algorithmic stability by enabling local convergence across a wider range of step sizes, with alternating updates generally converging faster. Globally, we study the implicit regularization of HB and find that smaller momentum guides algorithm trajectories towards shallower-slope regions of the loss landscapes, with alternating updates amplifying this effect. Surprisingly, all these phenomena differ from those observed in minimization, where larger momentum yields similar effects. Our results reveal fundamental differences between HB in min-max games and in minimization, and numerical experiments further validate our theoretical results.
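As a concrete illustration (not taken from the paper), the two update schemes can be sketched on the simplest bilinear min-max game f(x, y) = xy, where x minimizes and y maximizes. The step size and momentum values below are assumed for illustration only:

```python
# Heavy ball (HB) momentum for the bilinear min-max game f(x, y) = x * y.
# ETA (step size) and BETA (momentum) are illustrative values, not from the paper.
ETA, BETA = 0.05, 0.1

def simultaneous_hb(steps=200):
    """Both players update from the same (old) iterates."""
    x = y = 1.0
    x_prev, y_prev = x, y
    for _ in range(steps):
        gx, gy = y, x                                # grad_x f = y, grad_y f = x
        x_new = x - ETA * gx + BETA * (x - x_prev)   # descent step for x
        y_new = y + ETA * gy + BETA * (y - y_prev)   # ascent step for y (uses old x)
        x_prev, y_prev = x, y
        x, y = x_new, y_new
    return x, y

def alternating_hb(steps=200):
    """The y-player updates after seeing x's new iterate."""
    x = y = 1.0
    x_prev, y_prev = x, y
    for _ in range(steps):
        x_new = x - ETA * y + BETA * (x - x_prev)    # x moves first
        x_prev, x = x, x_new
        y_new = y + ETA * x + BETA * (y - y_prev)    # y reacts to the updated x
        y_prev, y = y, y_new
    return x, y
```

Comparing the distance of the two trajectories from the equilibrium (0, 0) for different BETA values is one way to reproduce the stability effects the abstract describes.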
Related papers
- Both Asymptotic and Non-Asymptotic Convergence of Quasi-Hyperbolic Momentum using Increasing Batch Size [0.6906005491572401]
Momentum methods were originally introduced for their superiority to stochastic gradient descent (SGD) in deterministic settings with convex functions. We show that achieving convergence requires either a decaying learning rate or an increasing batch size.
arXiv Detail & Related papers (2025-06-30T06:31:30Z) - Rapid Learning in Constrained Minimax Games with Negative Momentum [5.086470864936883]
We introduce a novel framework for momentum buffer updating, which extends the findings of negative momentum from the unconstrained setting to the constrained setting. Experimental results on both Normal Form Games (NFGs) and Extensive Form Games (EFGs) demonstrate that our momentum techniques can significantly improve algorithm performance.
arXiv Detail & Related papers (2024-12-31T16:32:51Z) - Can Decentralized Stochastic Minimax Optimization Algorithms Converge Linearly for Finite-Sum Nonconvex-Nonconcave Problems? [56.62372517641597]
Decentralized minimax optimization has been actively studied in the past few years due to its application in a wide range of machine learning problems.
This paper develops two novel decentralized minimax optimization algorithms for the nonconvex-nonconcave problem.
arXiv Detail & Related papers (2023-04-24T02:19:39Z) - Losing momentum in continuous-time stochastic optimisation [42.617042045455506]
Momentum-based optimisation algorithms have become particularly widespread.
In this work, we analyse a continuous-time model for gradient descent with momentum.
We also train a convolutional neural network in an image classification problem.
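For context, the classical continuous-time model of gradient descent with heavy ball momentum (standard in the literature, not quoted from this abstract) is the second-order ODE

```latex
\ddot{x}(t) + \gamma\,\dot{x}(t) + \nabla f(x(t)) = 0,
```

where \(\gamma > 0\) is a friction coefficient; a suitable time discretization recovers the discrete iterate \(x_{k+1} = x_k - \eta \nabla f(x_k) + \beta (x_k - x_{k-1})\).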
arXiv Detail & Related papers (2022-09-08T10:46:05Z) - On the fast convergence of minibatch heavy ball momentum [6.154018226934517]
We show that minibatch heavy ball momentum retains the fast linear rate of (deterministic) heavy ball momentum on certain optimization problems.
The algorithm we study can be interpreted as an accelerated randomized Kaczmarz algorithm with minibatching and heavy ball momentum.
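A minimal sketch of that interpretation, assuming a consistent linear system Ax = b and illustrative hyperparameters (not the paper's exact algorithm or rates):

```python
import numpy as np

# Minibatch randomized Kaczmarz with heavy ball momentum for a consistent
# linear system A x = b. Step size, momentum, and batch size are illustrative.
rng = np.random.default_rng(0)

def minibatch_kaczmarz_hb(A, b, eta=0.5, beta=0.2, batch=4, iters=2000):
    m, n = A.shape
    norms2 = (A * A).sum(axis=1)                     # squared row norms
    x = np.zeros(n)
    x_prev = x.copy()
    for _ in range(iters):
        idx = rng.choice(m, size=batch, replace=False)  # sample a row minibatch
        r = (A[idx] @ x - b[idx]) / norms2[idx]         # normalized residuals
        step = A[idx].T @ r / batch                     # averaged Kaczmarz direction
        x_new = x - eta * step + beta * (x - x_prev)    # heavy ball update
        x_prev, x = x, x_new
    return x
```

With beta = 0 this reduces to plain minibatch randomized Kaczmarz; the momentum term reuses the previous displacement, which is what yields the accelerated behavior the summary refers to.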
arXiv Detail & Related papers (2022-06-15T14:12:45Z) - Convex-Concave Min-Max Stackelberg Games [0.0]
We introduce two first-order methods that solve a class of convex-concave min-max Stackelberg games.
We show that our methods converge in polynomial time.
We observe that the computation of competitive equilibria in Fisher markets also comprises a min-max Stackelberg game.
arXiv Detail & Related papers (2021-10-05T06:09:45Z) - Scaling up Mean Field Games with Online Mirror Descent [55.36153467919289]
We address scaling up equilibrium computation in Mean Field Games (MFGs) using Online Mirror Descent (OMD).
We show that continuous-time OMD provably converges to a Nash equilibrium under a natural and well-motivated set of monotonicity assumptions.
A thorough experimental investigation on various single and multi-population MFGs shows that OMD outperforms traditional algorithms such as Fictitious Play (FP).
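As a reference point (a generic textbook formulation, not the paper's MFG-specific algorithm), OMD with the negative-entropy mirror map reduces to multiplicative-weights updates on the probability simplex; the loss function and step size below are illustrative:

```python
import numpy as np

# Online mirror descent (OMD) with the negative-entropy mirror map,
# i.e. multiplicative-weights updates on the probability simplex.
def omd_entropy(loss_fn, n_actions, eta=0.1, steps=1000):
    p = np.full(n_actions, 1.0 / n_actions)   # start from the uniform strategy
    avg = np.zeros(n_actions)
    for _ in range(steps):
        g = loss_fn(p)                        # loss (gradient) vector at p
        p = p * np.exp(-eta * g)              # mirror step in the dual space
        p /= p.sum()                          # normalize back onto the simplex
        avg += p
    return avg / steps                        # time-averaged strategy
```

For a fixed loss vector the averaged strategy concentrates on the lowest-loss action, which is the basic no-regret behavior underlying the convergence result above.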
arXiv Detail & Related papers (2021-02-28T21:28:36Z) - On the Suboptimality of Negative Momentum for Minimax Optimization [9.400440302623839]
We show that negative momentum accelerates convergence of game dynamics locally, though with a suboptimal rate.
This is the first work that provides an explicit convergence rate for momentum in this setting.
arXiv Detail & Related papers (2020-08-17T16:34:53Z) - Taming GANs with Lookahead-Minmax [63.90038365274479]
Experimental results on MNIST, SVHN, CIFAR-10, and ImageNet demonstrate a clear advantage of combining Lookahead-minmax with Adam or extragradient.
Using 30-fold fewer parameters and 16-fold smaller minibatches we outperform the reported performance of the class-dependent BigGAN on CIFAR-10 by obtaining FID of 12.19 without using the class labels.
arXiv Detail & Related papers (2020-06-25T17:13:23Z) - Linear Last-iterate Convergence in Constrained Saddle-point Optimization [48.44657553192801]
We significantly expand the understanding of last-iterate convergence for Optimistic Gradient Descent Ascent (OGDA) and Optimistic Multiplicative Weights Update (OMWU).
We show that when the equilibrium is unique, linear last-iterate convergence is achieved with a learning rate whose value is set to a universal constant.
We show that bilinear games over any polytope satisfy this condition and OGDA converges exponentially fast even without the unique equilibrium assumption.
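For the unconstrained bilinear case, the OGDA update can be sketched as follows (a standard formulation with an illustrative step size, not the paper's constrained algorithm):

```python
import numpy as np

# Optimistic Gradient Descent Ascent (OGDA) on the unconstrained bilinear
# saddle-point problem min_x max_y x^T A y. Step size is illustrative.
def ogda_bilinear(A, steps=20000, eta=0.1):
    n, m = A.shape
    x, y = np.ones(n), np.ones(m)
    gx_prev, gy_prev = A @ y, A.T @ x        # gradients from the previous round
    for _ in range(steps):
        gx, gy = A @ y, A.T @ x
        x = x - eta * (2 * gx - gx_prev)     # optimistic (extrapolated) descent
        y = y + eta * (2 * gy - gy_prev)     # optimistic ascent
        gx_prev, gy_prev = gx, gy
    return x, y
```

For an invertible A the unique equilibrium is (0, 0), and the extrapolated gradient term is what turns the cycling of plain gradient descent ascent into last-iterate convergence.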
arXiv Detail & Related papers (2020-06-16T20:53:04Z) - Fast Objective & Duality Gap Convergence for Non-Convex Strongly-Concave Min-Max Problems with PL Condition [52.08417569774822]
This paper focuses on methods for solving smooth non-convex strongly-concave min-max problems, which have received increasing attention due to applications in deep learning (e.g., deep AUC).
arXiv Detail & Related papers (2020-06-12T00:32:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information on this site and is not responsible for any consequences of its use.