Rapid Learning in Constrained Minimax Games with Negative Momentum
- URL: http://arxiv.org/abs/2501.00533v1
- Date: Tue, 31 Dec 2024 16:32:51 GMT
- Title: Rapid Learning in Constrained Minimax Games with Negative Momentum
- Authors: Zijian Fang, Zongkai Liu, Chao Yu, Chaohao Hu
- Abstract summary: We introduce a novel framework for momentum buffer updating, which extends the findings of negative momentum from the unconstrained setting to the constrained setting.
Experimental results on both Normal Form Games (NFGs) and Extensive Form Games (EFGs) demonstrate that our momentum techniques can significantly improve algorithm performance.
- Score: 5.086470864936883
- Abstract: In this paper, we delve into the utilization of the negative momentum technique in constrained minimax games. From an intuitive mechanical standpoint, we introduce a novel framework for momentum buffer updating, which extends the findings of negative momentum from the unconstrained setting to the constrained setting and provides a universal enhancement to the classic game-solver algorithms. Additionally, we provide theoretical guarantee of convergence for our momentum-augmented algorithms with entropy regularizer. We then extend these algorithms to their extensive-form counterparts. Experimental results on both Normal Form Games (NFGs) and Extensive Form Games (EFGs) demonstrate that our momentum techniques can significantly improve algorithm performance, surpassing both their original versions and the SOTA baselines by a large margin.
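The abstract does not reproduce the paper's update rule, but the core idea it describes, a momentum buffer with a negative coefficient combined with projection back onto the feasible set, can be sketched on a small matrix game. This is an illustrative sketch only: the function names, step sizes, and the choice of Euclidean simplex projection are assumptions, not the authors' algorithm.

```python
import numpy as np

def simplex_projection(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    rho = np.nonzero(u - css / (np.arange(len(v)) + 1) > 0)[0][-1]
    theta = css[rho] / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def negative_momentum_gda(A, steps=2000, eta=0.05, beta=-0.3):
    """Projected gradient descent-ascent on the matrix game
    min_x max_y x^T A y, with a momentum buffer whose coefficient
    beta < 0 implements negative momentum."""
    n, m = A.shape
    x, y = np.ones(n) / n, np.ones(m) / m
    mx, my = np.zeros(n), np.zeros(m)
    for _ in range(steps):
        gx, gy = A @ y, A.T @ x          # gradients of the bilinear payoff
        mx = beta * mx + gx              # beta < 0 => negative momentum
        my = beta * my + gy
        x = simplex_projection(x - eta * mx)   # min player descends
        y = simplex_projection(y + eta * my)   # max player ascends
    return x, y
```

The projection step is what distinguishes the constrained setting (strategies must stay on the simplex, as in NFGs) from the unconstrained one studied in earlier negative-momentum work.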
Related papers
- Accelerated regularized learning in finite N-person games [34.37854832849245]
We introduce a family of accelerated learning methods, which we call "follow the accelerated leader" (FTXL).
FTXL converges locally to strict Nash equilibria at a superlinear rate, achieving in this way an exponential speed-up over vanilla regularized learning methods.
arXiv Detail & Related papers (2024-12-29T06:09:26Z)
- On the Convergence of No-Regret Learning Dynamics in Time-Varying Games [89.96815099996132]
We characterize the convergence of optimistic gradient descent (OGD) in time-varying games.
Our framework yields sharp convergence bounds for the equilibrium gap of OGD in zero-sum games.
We also provide new insights on dynamic regret guarantees in static games.
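The optimistic gradient step this entry refers to uses the previous gradient as a prediction of the next one. A minimal unconstrained sketch on a bilinear game follows; the step size, iteration count, and game matrix are illustrative choices, not taken from the paper.

```python
import numpy as np

def optimistic_gd(A, steps=5000, eta=0.1):
    """Optimistic gradient descent-ascent on min_x max_y x^T A y
    (unconstrained for simplicity): each player steps along the
    'predicted' gradient 2*g_t - g_{t-1}."""
    n, m = A.shape
    x, y = np.ones(n), np.ones(m)
    gx_prev, gy_prev = np.zeros(n), np.zeros(m)
    for _ in range(steps):
        gx, gy = A @ y, A.T @ x          # simultaneous gradient evaluation
        x = x - eta * (2 * gx - gx_prev) # optimistic descent step
        y = y + eta * (2 * gy - gy_prev) # optimistic ascent step
        gx_prev, gy_prev = gx, gy
    return x, y
```

On a bilinear game, plain gradient descent-ascent spirals outward, while the optimistic correction turns the spiral inward toward the equilibrium at the origin.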
arXiv Detail & Related papers (2023-01-26T17:25:45Z)
- Kernelized Multiplicative Weights for 0/1-Polyhedral Games: Bridging the Gap Between Learning in Extensive-Form and Normal-Form Games [76.21916750766277]
We show that the Optimistic Multiplicative Weights Update (OMWU) algorithm can be simulated on the normal-form equivalent of an EFG in linear time per iteration in the game tree size using a kernel trick.
Specifically, the resulting KOMWU algorithm is the first to guarantee last-iterate convergence in this setting.
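For reference, the non-kernelized OMWU update on a normal-form zero-sum game multiplies each strategy by the exponential of an "optimistic" gradient and renormalizes; the paper's kernel trick is what makes this tractable on the exponentially large normal-form equivalent of an EFG. A plain NFG sketch, with illustrative hyperparameters:

```python
import numpy as np

def omwu(A, steps=1000, eta=0.1):
    """Optimistic Multiplicative Weights Update on the zero-sum
    matrix game min_x max_y x^T A y over the probability simplices."""
    n, m = A.shape
    x, y = np.ones(n) / n, np.ones(m) / m
    gx_prev, gy_prev = np.zeros(n), np.zeros(m)
    for _ in range(steps):
        gx, gy = A @ y, A.T @ x
        x = x * np.exp(-eta * (2 * gx - gx_prev))  # min player
        x /= x.sum()                               # renormalize
        y = y * np.exp(eta * (2 * gy - gy_prev))   # max player
        y /= y.sum()
        gx_prev, gy_prev = gx, gy
    return x, y
```

Because the update is multiplicative, iterates stay in the interior of the simplex without any explicit projection, in contrast to Euclidean methods.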
arXiv Detail & Related papers (2022-02-01T06:28:51Z)
- Temporal Induced Self-Play for Stochastic Bayesian Games [32.88124137877018]
We propose Temporal-Induced Self-Play (TISP) to find strategies with decent performance from any decision point onward.
TISP uses belief-space representation, backward induction, policy learning, and non-parametric approximation.
We prove that TISP-based algorithms can find approximate Perfect Bayesian Equilibrium in zero-sum one-sided games with finite horizon.
arXiv Detail & Related papers (2021-08-21T05:36:42Z)
- Last-iterate Convergence in Extensive-Form Games [49.31256241275577]
We study last-iterate convergence of optimistic algorithms in sequential games.
We show that all of these algorithms enjoy last-iterate convergence, with some of them even converging exponentially fast.
arXiv Detail & Related papers (2021-06-27T22:02:26Z)
- Complex Momentum for Learning in Games [42.081050296353574]
We generalize gradient descent with momentum for learning in differentiable games to have complex-valued momentum.
We empirically demonstrate that complex-valued momentum can improve convergence in games, such as generative adversarial networks.
We also show a practical generalization to a complex-valued Adam variant, which we use to train BigGAN to achieve better scores on CIFAR-10.
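The update described here can be written down compactly: the momentum buffer is complex-valued, and only its real part moves the (real) parameters. A toy sketch on a bilinear game follows; the specific coefficients are illustrative assumptions, and convergence depends on tuning the magnitude and phase of beta as analyzed in the paper.

```python
import numpy as np

def complex_momentum_gda(A, steps=2000, lr=0.1, beta=0.8j):
    """Gradient descent-ascent on min_x max_y x^T A y with a
    complex-valued momentum buffer; only the real part of the
    buffer is applied to the real-valued parameters."""
    n, m = A.shape
    x, y = np.ones(n), np.ones(m)
    mx = np.zeros(n, dtype=complex)
    my = np.zeros(m, dtype=complex)
    for _ in range(steps):
        gx, gy = A @ y, A.T @ x
        mx = beta * mx + gx      # complex momentum accumulation
        my = beta * my + gy
        x = x - lr * mx.real     # only the real part moves x
        y = y + lr * my.real
    return x, y
```

Intuitively, a complex beta rotates the momentum buffer each step, which interpolates between positive momentum (beta real positive) and negative momentum (beta real negative).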
arXiv Detail & Related papers (2021-02-16T19:55:27Z)
- On the Suboptimality of Negative Momentum for Minimax Optimization [9.400440302623839]
We show that negative momentum accelerates convergence of game dynamics locally, though with a suboptimal rate.
This is the first work to provide an explicit convergence rate for negative momentum in this setting.
arXiv Detail & Related papers (2020-08-17T16:34:53Z)
- Responsive Safety in Reinforcement Learning by PID Lagrangian Methods [74.49173841304474]
Lagrangian methods exhibit oscillations and overshoot which, when applied to safe reinforcement learning, lead to constraint-violating behavior.
We propose a novel Lagrange multiplier update method that utilizes derivatives of the constraint function.
We apply our PID Lagrangian methods in deep RL, setting a new state of the art in Safety Gym, a safe RL benchmark.
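The idea generalizes the usual dual ascent, which only integrates the constraint violation, by adding proportional and derivative terms that damp the oscillation and overshoot described above. A toy sketch on a one-dimensional constrained problem follows; the objective, gains, and step size are illustrative assumptions, not the paper's RL setup.

```python
def pid_constrained_min(steps=5000, lr=0.01, kp=1.0, ki=0.1, kd=1.0):
    """Minimize f(x) = x^2 subject to x >= 1 via a PID-controlled
    Lagrange multiplier. The violation is e = 1 - x; the I term alone
    is the classic multiplier update, while P and D damp oscillation."""
    x, integral, e_prev = 0.0, 0.0, 0.0
    for _ in range(steps):
        e = 1.0 - x                                  # constraint violation
        integral = max(0.0, integral + ki * e)       # I term (classic dual ascent)
        lam = max(0.0, kp * e + integral + kd * (e - e_prev))  # PID multiplier
        e_prev = e
        x -= lr * (2.0 * x - lam)                    # descend the Lagrangian in x
    return x
```

At the constrained optimum x = 1 the multiplier settles at lam = 2, balancing the objective gradient; with the I term alone the multiplier lags the violation, which is the source of the overshoot the P and D terms suppress.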
arXiv Detail & Related papers (2020-07-08T08:43:14Z)
- Chaos, Extremism and Optimism: Volume Analysis of Learning in Games [55.24050445142637]
We present volume analyses of Multiplicative Weights Updates (MWU) and Optimistic Multiplicative Weights Updates (OMWU) in zero-sum as well as coordination games.
We show that OMWU contracts volume, providing an alternative understanding for its known convergent behavior.
We also prove a no-free-lunch-type theorem: in coordination games the roles are reversed, with OMWU expanding volume exponentially fast while MWU contracts it.
arXiv Detail & Related papers (2020-05-28T13:47:09Z)
- Stochastic Regret Minimization in Extensive-Form Games [109.43344748069933]
Monte-Carlo counterfactual regret minimization (MCCFR) is the state-of-the-art algorithm for solving sequential games that are too large to traverse in full.
We develop a new framework for developing regret minimization methods.
We show extensive experiments on three games, where some variants of our methods outperform MCCFR.
arXiv Detail & Related papers (2020-02-19T23:05:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.