Discovering How Agents Learn Using Few Data
- URL: http://arxiv.org/abs/2307.06640v1
- Date: Thu, 13 Jul 2023 09:14:48 GMT
- Title: Discovering How Agents Learn Using Few Data
- Authors: Iosif Sakos, Antonios Varvitsiotis, Georgios Piliouras
- Abstract summary: We propose a theoretical and algorithmic framework for real-time identification of agent behavior using a short burst of a single system trajectory.
Our approach accurately recovers the true dynamics across various benchmarks, including equilibrium selection and prediction of chaotic systems up to 10 Lyapunov times.
These findings suggest that our approach has significant potential to support effective policy and decision-making in strategic multi-agent systems.
- Score: 32.38609641970052
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Decentralized learning algorithms are an essential tool for designing
multi-agent systems, as they enable agents to autonomously learn from their
experience and past interactions. In this work, we propose a theoretical and
algorithmic framework for real-time identification of the learning dynamics
that govern agent behavior using a short burst of a single system trajectory.
Our method identifies agent dynamics through polynomial regression, where we
compensate for limited data by incorporating side-information constraints that
capture fundamental assumptions or expectations about agent behavior. These
constraints are enforced computationally using sum-of-squares optimization,
leading to a hierarchy of increasingly better approximations of the true agent
dynamics. Extensive experiments demonstrated that our approach, using only 5
samples from a short run of a single trajectory, accurately recovers the true
dynamics across various benchmarks, including equilibrium selection and
prediction of chaotic systems up to 10 Lyapunov times. These findings suggest
that our approach has significant potential to support effective policy and
decision-making in strategic multi-agent systems.
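The idea of compensating for few samples with side information can be illustrated with a minimal sketch. This is a simplified stand-in, not the authors' method: it imposes equality-type side information by reparametrizing the polynomial, whereas the abstract describes sum-of-squares optimization. The ground-truth field `true_field`, the helper `fit_constrained_cubic`, and the sample points are all hypothetical choices made for illustration.

```python
# Simplified illustration: polynomial regression with side information.
# Assumption: the side information is that both endpoints of [0, 1] are
# rest points (p(0) = p(1) = 0), imposed by reparametrization rather than
# by the sum-of-squares machinery described in the abstract.

def true_field(x):
    """Hypothetical ground truth: replicator-style dynamics x' = x(1-x)(1-2x)."""
    return x * (1.0 - x) * (1.0 - 2.0 * x)

def fit_constrained_cubic(xs, vs):
    """Least-squares fit of p(x) = x(1-x)(a + b*x) to samples (x, x').

    The factor x(1-x) hard-codes p(0) = p(1) = 0, so only the two
    coefficients a and b remain to be estimated.
    """
    # Features of the reduced linear model v ~ a*phi0(x) + b*phi1(x).
    phi0 = [x * (1.0 - x) for x in xs]
    phi1 = [x * x * (1.0 - x) for x in xs]
    # Entries of the 2x2 normal equations.
    s00 = sum(p * p for p in phi0)
    s01 = sum(p * q for p, q in zip(phi0, phi1))
    s11 = sum(q * q for q in phi1)
    t0 = sum(p * v for p, v in zip(phi0, vs))
    t1 = sum(q * v for q, v in zip(phi1, vs))
    det = s00 * s11 - s01 * s01
    if abs(det) < 1e-15:
        raise ValueError("samples do not determine the coefficients")
    # Solve the 2x2 system by Cramer's rule.
    a = (t0 * s11 - t1 * s01) / det
    b = (s00 * t1 - s01 * t0) / det
    return a, b

# Five noise-free samples from a short stretch of the state space.
xs = [0.10, 0.15, 0.20, 0.25, 0.30]
vs = [true_field(x) for x in xs]
a, b = fit_constrained_cubic(xs, vs)
interior_rest_point = -a / b  # root of a + b*x, outside the sampled range
```

In this toy setting the fitted model has a rest point at `-a / b`, which lies outside the sampled interval; the built-in constraints are what let five nearby samples say something about the dynamics elsewhere.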
Related papers
- CyGATE: Game-Theoretic Cyber Attack-Defense Engine for Patch Strategy Optimization [73.13843039509386]
This paper presents CyGATE, a game-theoretic framework modeling attacker-defender interactions.
CyGATE frames cyber conflicts as a partially observable game (POSG) across Cyber Kill Chain stages.
The framework's flexible architecture enables extension to multi-agent scenarios.
arXiv Detail & Related papers (2025-08-01T09:53:06Z) - FAIRGAME: a Framework for AI Agents Bias Recognition using Game Theory [51.96049148869987]
We present FAIRGAME, a Framework for AI Agents Bias Recognition using Game Theory.
We describe its implementation and usage, and we employ it to uncover biased outcomes in popular games among AI agents.
Overall, FAIRGAME allows users to reliably and easily simulate their desired games and scenarios.
arXiv Detail & Related papers (2025-04-19T15:29:04Z) - Ranking Joint Policies in Dynamic Games using Evolutionary Dynamics [0.0]
It has been shown that the dynamics of agents' interactions, even in simple two-player games, are incapable of reaching Nash equilibria.
Our goal is to identify agents' joint strategies that result in stable behavior, being resistant to changes, while also accounting for agents' payoffs.
arXiv Detail & Related papers (2025-02-20T16:50:38Z) - Scalable Offline Reinforcement Learning for Mean Field Games [6.8267158622784745]
Off-MMD is a novel mean-field RL algorithm that approximates equilibrium policies in mean-field games using purely offline data.
Our algorithm scales to complex environments and demonstrates strong performance on benchmark tasks like crowd exploration or navigation.
arXiv Detail & Related papers (2024-10-23T14:16:34Z) - Adversarial Knapsack and Secondary Effects of Common Information for Cyber Operations [0.9378911615939924]
We formalize a dynamical network control game for Capture the Flag (CTF) competitions and detail the static game for each time step.
We define the Adversarial Knapsack optimization problems as a system of interacting Weighted Knapsack problems.
Common awareness of the scenario, rewards, and costs will set the stage for a non-cooperative game.
arXiv Detail & Related papers (2024-03-16T03:41:12Z) - Blending Data-Driven Priors in Dynamic Games [9.085463548798366]
We formulate an algorithm for solving non-cooperative dynamic games with Kullback-Leibler (KL) regularization.
We propose an efficient algorithm for computing multi-modal approximate feedback Nash equilibrium strategies of KLGame in real time.
arXiv Detail & Related papers (2024-02-21T23:22:32Z) - Auto-Encoding Bayesian Inverse Games [36.06617326128679]
We consider the inverse game problem, in which some properties of the game are unknown a priori.
Existing maximum likelihood estimation approaches to solve inverse games provide only point estimates of unknown parameters.
We take a Bayesian perspective and construct posterior distributions of game parameters.
This structured VAE can be trained from an unlabeled dataset of observed interactions.
arXiv Detail & Related papers (2024-02-14T02:17:37Z) - On the Convergence of No-Regret Learning Dynamics in Time-Varying Games [89.96815099996132]
We characterize the convergence of optimistic gradient descent (OGD) in time-varying games.
Our framework yields sharp convergence bounds for the equilibrium gap of OGD in zero-sum games.
We also provide new insights on dynamic regret guarantees in static games.
arXiv Detail & Related papers (2023-01-26T17:25:45Z) - Finding mixed-strategy equilibria of continuous-action games without gradients using randomized policy networks [83.28949556413717]
We study the problem of computing an approximate Nash equilibrium of a continuous-action game without access to gradients.
We model players' strategies using artificial neural networks.
This paper is the first to solve general continuous-action games with unrestricted mixed strategies and without any gradient information.
arXiv Detail & Related papers (2022-11-29T05:16:41Z) - DySMHO: Data-Driven Discovery of Governing Equations for Dynamical Systems via Moving Horizon Optimization [77.34726150561087]
We introduce Discovery of Dynamical Systems via Moving Horizon Optimization (DySMHO), a scalable machine learning framework.
DySMHO sequentially learns the underlying governing equations from a large dictionary of basis functions.
Canonical nonlinear dynamical system examples are used to demonstrate that DySMHO can accurately recover the governing laws.
arXiv Detail & Related papers (2021-07-30T20:35:03Z) - Deep Policy Networks for NPC Behaviors that Adapt to Changing Design Parameters in Roguelike Games [137.86426963572214]
Turn-based strategy games such as Roguelikes present unique challenges to Deep Reinforcement Learning (DRL).
We propose two network architectures to better handle complex categorical state spaces and to mitigate the need for retraining forced by design decisions.
arXiv Detail & Related papers (2020-12-07T08:47:25Z) - No-Regret Learning Dynamics for Extensive-Form Correlated Equilibrium [76.78447814623665]
We give the first uncoupled no-regret dynamics that converge to correlated equilibria in normal-form games.
We introduce a notion of trigger regret in extensive-form games, which extends that of internal regret in normal-form games.
Our algorithm decomposes trigger regret into local subproblems at each decision point for the player, and constructs a global strategy of the player from the local solutions.
arXiv Detail & Related papers (2020-04-01T17:39:00Z) - Efficient exploration of zero-sum stochastic games [83.28949556413717]
We investigate the increasingly important and common game-solving setting where we do not have an explicit description of the game but only oracle access to it through gameplay.
During a limited-duration learning phase, the algorithm can control the actions of both players in order to try to learn the game and how to play it well.
Our motivation is to quickly learn strategies that have low exploitability in situations where evaluating the payoffs of a queried strategy profile is costly.
arXiv Detail & Related papers (2020-02-24T20:30:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.