Exploiting hidden structures in non-convex games for convergence to Nash
equilibrium
- URL: http://arxiv.org/abs/2312.16609v1
- Date: Wed, 27 Dec 2023 15:21:25 GMT
- Title: Exploiting hidden structures in non-convex games for convergence to Nash
equilibrium
- Authors: Iosif Sakos and Emmanouil-Vasileios Vlatakis-Gkaragkounis and
Panayotis Mertikopoulos and Georgios Piliouras
- Abstract summary: A wide array of modern machine learning applications can be formulated as non-cooperative games whose Nash equilibria represent the system's desired operational states.
We provide explicit convergence guarantees for both deterministic and stochastic environments.
- Score: 62.88214569402201
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A wide array of modern machine learning applications - from adversarial
models to multi-agent reinforcement learning - can be formulated as
non-cooperative games whose Nash equilibria represent the system's desired
operational states. Despite having a highly non-convex loss landscape, many
cases of interest possess a latent convex structure that could potentially be
leveraged to yield convergence to equilibrium. Driven by this observation, our
paper proposes a flexible first-order method that successfully exploits such
"hidden structures" and achieves convergence under minimal assumptions for the
transformation connecting the players' control variables to the game's latent,
convex-structured layer. The proposed method - which we call preconditioned
hidden gradient descent (PHGD) - hinges on a judiciously chosen gradient
preconditioning scheme related to natural gradient methods. Importantly, we
make no separability assumptions for the game's hidden structure, and we
provide explicit convergence rate guarantees for both deterministic and
stochastic environments.
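As a rough illustration of the idea in the abstract, the sketch below implements a gradient step preconditioned by the damped Gram matrix of the hidden transformation's Jacobian, a natural-gradient-style choice. This is an assumption made for illustration, not the paper's exact PHGD scheme; the names `phgd_step`, `grad_f`, and `jac_g` are hypothetical.

```python
import numpy as np

def phgd_step(x, grad_f, jac_g, lr=0.1, damping=1e-6):
    """One hypothetical preconditioned step: precondition the gradient
    by J^T J, where J is the Jacobian of the map g from the players'
    control variables to the latent, convex-structured layer.
    (Illustrative assumption, not the paper's exact preconditioner.)"""
    J = jac_g(x)
    # Damped Gram matrix of the hidden transformation's Jacobian.
    P = J.T @ J + damping * np.eye(x.size)
    # Descend along the preconditioned gradient direction.
    return x - lr * np.linalg.solve(P, grad_f(x))
```

For example, with a loss f(x) = ||g(x) - c||^2 that is convex in the latent variable u = g(x) = exp(x) but not in x itself, repeatedly applying `phgd_step` drives x toward log(c).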
Related papers
- Independent Learning in Constrained Markov Potential Games [19.083595175045073]
Constrained Markov games offer a formal framework for modeling multi-agent reinforcement learning problems.
We propose an independent policy gradient algorithm for learning approximate constrained Nash equilibria.
arXiv Detail & Related papers (2024-02-27T20:57:35Z)
- Provable Guarantees for Generative Behavior Cloning: Bridging Low-Level Stability and High-Level Behavior [51.60683890503293]
We propose a theoretical framework for studying behavior cloning of complex expert demonstrations using generative modeling.
We show that pure supervised cloning can generate trajectories matching the per-time step distribution of arbitrary expert trajectories.
arXiv Detail & Related papers (2023-07-27T04:27:26Z)
- Finding mixed-strategy equilibria of continuous-action games without gradients using randomized policy networks [83.28949556413717]
We study the problem of computing an approximate Nash equilibrium of continuous-action games without access to gradients.
We model players' strategies using artificial neural networks.
This paper is the first to solve general continuous-action games with unrestricted mixed strategies and without any gradient information.
arXiv Detail & Related papers (2022-11-29T05:16:41Z)
- Beyond the Edge of Stability via Two-step Gradient Updates [49.03389279816152]
Gradient Descent (GD) is a powerful workhorse of modern machine learning.
GD's ability to find local minimisers is only guaranteed for losses with Lipschitz gradients.
This work focuses on simple, yet representative, learning problems via analysis of two-step gradient updates.
arXiv Detail & Related papers (2022-06-08T21:32:50Z)
- A unified stochastic approximation framework for learning in games [82.74514886461257]
We develop a flexible stochastic approximation framework for analyzing the long-run behavior of learning in games (both continuous and finite).
The proposed analysis template incorporates a wide array of popular learning algorithms, including gradient-based methods, exponential/multiplicative weights for learning in finite games, optimistic and bandit variants of the above, etc.
arXiv Detail & Related papers (2022-06-08T14:30:38Z)
- Rockafellian Relaxation and Stochastic Optimization under Perturbations [0.056247917037481096]
We develop an "optimistic" framework based on Rockafellian relaxations in which optimization is conducted not only over the original decision space but also jointly with a choice of model.
The framework centers on the novel concepts of exact and limit-exact Rockafellians, with interpretations of "negative" regularization emerging in certain settings.
arXiv Detail & Related papers (2022-04-10T20:02:41Z)
- Learning Game-Theoretic Models of Multiagent Trajectories Using Implicit Layers [9.594432031144716]
We propose an end-to-end trainable architecture that hybridizes neural nets with game-theoretic reasoning.
For tractability, we introduce a new class of continuous potential games and an equilibrium-separating partition of the action space.
We evaluate our approach on two real-world data sets, where we predict highway merging driver trajectories, and on a simple decision-making transfer task.
arXiv Detail & Related papers (2020-08-17T13:34:12Z) - Smoothed Geometry for Robust Attribution [36.616902063693104]
Feature attributions are a popular tool for explaining the behavior of Deep Neural Networks (DNNs).
They have been shown to be vulnerable to attacks that produce divergent explanations for nearby inputs.
This lack of robustness is especially problematic in high-stakes applications where adversarially-manipulated explanations could impair safety and trustworthiness.
arXiv Detail & Related papers (2020-06-11T17:35:13Z)
- On dissipative symplectic integration with applications to gradient-based optimization [77.34726150561087]
We propose a geometric framework in which discretizations can be realized systematically.
We show that a generalization of symplectic integrators to nonconservative, and in particular dissipative, Hamiltonian systems is able to preserve rates of convergence up to a controlled error.
arXiv Detail & Related papers (2020-04-15T00:36:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.