Co-Learning Empirical Games and World Models
- URL: http://arxiv.org/abs/2305.14223v1
- Date: Tue, 23 May 2023 16:37:21 GMT
- Title: Co-Learning Empirical Games and World Models
- Authors: Max Olan Smith, Michael P. Wellman
- Abstract summary: Empirical games drive world models toward a broader consideration of possible game dynamics.
World models guide empirical games to efficiently discover new strategies through planning.
A new algorithm, Dyna-PSRO, co-learns an empirical game and a world model.
- Score: 23.800790782022222
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Game-based decision-making involves reasoning over both world dynamics and
strategic interactions among the agents. Typically, empirical models capturing
these respective aspects are learned and used separately. We investigate the
potential gain from co-learning these elements: a world model for dynamics and
an empirical game for strategic interactions. Empirical games drive world
models toward a broader consideration of possible game dynamics induced by a
diversity of strategy profiles. Conversely, world models guide empirical games
to efficiently discover new strategies through planning. We demonstrate these
benefits first independently, then in combination as realized by a new
algorithm, Dyna-PSRO, that co-learns an empirical game and a world model. When
compared to PSRO, a baseline empirical-game building algorithm, Dyna-PSRO is
found to compute lower-regret solutions on partially observable general-sum
games. In our experiments, Dyna-PSRO also requires substantially fewer
experiences than PSRO, a key algorithmic advantage for settings where
collecting player-game interaction data is a cost-limiting factor.
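The abstract describes the co-learning loop only at a high level. The sketch below is a minimal, hypothetical Python rendering of that loop, not the authors' implementation: a one-shot two-player game with noisy payoffs stands in for the environment, a tabular payoff predictor plays the role of the world model, and a uniform mixture over each population substitutes for PSRO's actual meta-solver. All names (play, WorldModel, best_response) are illustrative.
```python
# Hypothetical sketch of a Dyna-PSRO-style co-learning loop (not the paper's
# code). Assumptions: a one-shot two-player game with noisy payoffs; a tabular
# payoff predictor as the "world model"; a uniform meta-strategy in place of
# PSRO's actual meta-solver.
import itertools
import random

ACTIONS = [0, 1, 2]  # pure strategies available to both players

def play(a0, a1):
    """Ground-truth game, observable only through noisy samples."""
    base = [[3, 0, 5], [1, 2, 0], [0, 4, 1]]
    noise = random.gauss(0, 0.1)
    return base[a0][a1] + noise, base[a1][a0] + noise  # general-sum payoffs

class WorldModel:
    """Tabular payoff model trained on the same data as the empirical game."""
    def __init__(self):
        self.sums, self.counts = {}, {}

    def update(self, a0, a1, r0, r1):
        s0, s1 = self.sums.get((a0, a1), (0.0, 0.0))
        self.sums[(a0, a1)] = (s0 + r0, s1 + r1)
        self.counts[(a0, a1)] = self.counts.get((a0, a1), 0) + 1

    def predict(self, a0, a1):
        if (a0, a1) not in self.counts:
            return 0.0, 0.0  # neutral prior for unseen profiles
        s0, s1 = self.sums[(a0, a1)]
        return s0 / self.counts[(a0, a1)], s1 / self.counts[(a0, a1)]

def best_response(model, opponent_pop, player):
    """Plan against the opponent's uniform meta-strategy inside the model."""
    def value(a):
        preds = [model.predict(a, b) if player == 0 else model.predict(b, a)
                 for b in opponent_pop]
        return sum(pred[player] for pred in preds) / len(preds)
    return max(ACTIONS, key=value)

pops = [[random.choice(ACTIONS)], [random.choice(ACTIONS)]]
model = WorldModel()
for epoch in range(5):
    # 1) Expand the empirical game: simulate every current strategy profile.
    #    The same experience also trains the world model (the co-learning step).
    for a0, a1 in itertools.product(pops[0], pops[1]):
        r0, r1 = play(a0, a1)
        model.update(a0, a1, r0, r1)
    # 2) Add best responses found by planning in the model, not by real play.
    for player in (0, 1):
        br = best_response(model, pops[1 - player], player)
        if br not in pops[player]:
            pops[player].append(br)

print("final populations:", pops)
```
The intended takeaway is the data flow: the same profile simulations that populate the empirical game also train the world model, and new strategies are proposed by planning against that model instead of by further real play.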
Related papers
- Towards a Unified View of Preference Learning for Large Language Models: A Survey [88.66719962576005]
Large Language Models (LLMs) exhibit remarkably powerful capabilities.
One of the crucial factors to achieve success is aligning the LLM's output with human preferences.
We decompose all the strategies in preference learning into four components: model, data, feedback, and algorithm.
arXiv Detail & Related papers (2024-09-04T15:11:55Z)
- Neural Population Learning beyond Symmetric Zero-sum Games [52.20454809055356]
We introduce NeuPL-JPSRO, a neural population learning algorithm that benefits from transfer learning of skills and converges to a Coarse Correlated Equilibrium (CCE) of the game.
Our work shows that equilibrium convergent population learning can be implemented at scale and in generality.
arXiv Detail & Related papers (2024-01-10T12:56:24Z)
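For context, a coarse correlated equilibrium (CCE) is the solution concept that time-averaged no-regret play converges to. The toy sketch below runs regret matching in self-play on a 2x2 general-sum game; it only illustrates the concept and has nothing to do with NeuPL-JPSRO's neural population machinery.
```python
# Self-play with (external) regret matching; the time-averaged joint play
# approaches the set of coarse correlated equilibria. Toy illustration only.
import random
from collections import Counter

# payoffs[player][a0][a1] for a 2x2 general-sum coordination game.
payoffs = [[[4, 0], [0, 2]],   # player 0
           [[2, 0], [0, 4]]]   # player 1

regrets = [[0.0, 0.0], [0.0, 0.0]]
joint_counts = Counter()

def sample_action(reg):
    """Play proportionally to positive cumulative regret (uniform if none)."""
    pos = [max(r, 0.0) for r in reg]
    if sum(pos) == 0:
        return random.randrange(len(reg))
    return random.choices(range(len(reg)), weights=pos)[0]

for _ in range(20000):
    a0, a1 = sample_action(regrets[0]), sample_action(regrets[1])
    joint_counts[(a0, a1)] += 1
    # External-regret update: how much better each fixed action would have done.
    for alt in (0, 1):
        regrets[0][alt] += payoffs[0][alt][a1] - payoffs[0][a0][a1]
        regrets[1][alt] += payoffs[1][a0][alt] - payoffs[1][a0][a1]

total = sum(joint_counts.values())
print({k: round(v / total, 3) for k, v in joint_counts.items()})  # ~ a CCE
```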
- ALYMPICS: LLM Agents Meet Game Theory -- Exploring Strategic Decision-Making with AI Agents [77.34720446306419]
Alympics is a systematic simulation framework utilizing Large Language Model (LLM) agents for game theory research.
Alympics creates a versatile platform for studying complex game theory problems.
arXiv Detail & Related papers (2023-11-06T16:03:46Z)
- On a Connection between Differential Games, Optimal Control, and Energy-based Models for Multi-Agent Interactions [0.13499500088995461]
We show a connection between differential games, optimal control, and energy-based models.
Building upon this formulation, this work introduces a new end-to-end learning application.
Experiments using simulated mobile robot pedestrian interactions and real-world automated driving data provide empirical evidence.
arXiv Detail & Related papers (2023-08-31T08:30:11Z)
- Game Theoretic Rating in N-player general-sum games with Equilibria [26.166859475522106]
We propose novel algorithms suitable for N-player, general-sum rating of strategies in normal-form games according to the payoff rating system.
This enables well-established solution concepts, such as equilibria, to be leveraged to efficiently rate strategies in games with complex strategic interactions.
arXiv Detail & Related papers (2022-10-05T12:33:03Z)
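The summary leaves the rating computation abstract. One simple instance of equilibrium-based rating, in the spirit of Nash averaging rather than the paper's specific algorithms, scores each strategy by its expected payoff against an equilibrium mixture:
```python
# Illustrative equilibrium-based rating for a symmetric zero-sum game.
# The paper's algorithms extend such ratings to N-player general-sum games,
# which this toy does not attempt.
import numpy as np

# M[i, j] = payoff of strategy i against strategy j:
# rock-paper-scissors plus a dominated fourth strategy that loses to all three.
M = np.array([[ 0, -1,  1,  1],
              [ 1,  0, -1,  1],
              [-1,  1,  0,  1],
              [-1, -1, -1,  0]], dtype=float)

# Equilibrium mixture: uniform over the RPS strategies (known in closed form
# for this toy game; in general it would come from an equilibrium solver).
p = np.array([1/3, 1/3, 1/3, 0.0])

ratings = M @ p  # rating of strategy i = expected payoff vs the equilibrium
for i, r in enumerate(ratings):
    print(f"strategy {i}: {r:+.3f}")  # equilibrium strategies rate 0, the dud -1
```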
- Independent Learning in Stochastic Games [16.505046191280634]
We present the model of stochastic games for multi-agent learning in dynamic environments.
We focus on the development of simple and independent learning dynamics for stochastic games.
We present our recently proposed simple and independent learning dynamics that guarantee convergence in zero-sum stochastic games.
arXiv Detail & Related papers (2021-11-23T09:27:20Z)
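As a reference point for "simple and independent learning dynamics", the classic example is fictitious play, in which each player independently best-responds to the opponent's empirical action frequencies; its time-averaged play converges in two-player zero-sum matrix games. The paper addresses the harder stochastic (multi-state) case, which this one-state toy does not capture.
```python
# Fictitious play on rock-paper-scissors: independent learners whose
# time-averaged play converges to the uniform Nash equilibrium.
import numpy as np

A = np.array([[ 0, -1,  1],
              [ 1,  0, -1],
              [-1,  1,  0]], dtype=float)  # row player's payoffs (zero-sum)

counts = [np.ones(3), np.ones(3)]  # smoothed empirical action counts
for _ in range(5000):
    # Each player best-responds to the opponent's empirical mixture.
    a0 = int(np.argmax(A @ (counts[1] / counts[1].sum())))
    a1 = int(np.argmax(-A.T @ (counts[0] / counts[0].sum())))
    counts[0][a0] += 1
    counts[1][a1] += 1

print("player 0 average play:", np.round(counts[0] / counts[0].sum(), 3))
print("player 1 average play:", np.round(counts[1] / counts[1].sum(), 3))
```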
- Human-Level Reinforcement Learning through Theory-Based Modeling, Exploration, and Planning [27.593497502386143]
Theory-Based Reinforcement Learning uses human-like intuitive theories to explore and model an environment.
We instantiate the approach in a video game playing agent called EMPA.
EMPA matches human learning efficiency on a suite of 90 Atari-style video games.
arXiv Detail & Related papers (2021-07-27T01:38:13Z)
- Forgetful Experience Replay in Hierarchical Reinforcement Learning from Demonstrations [55.41644538483948]
In this paper, we propose a combination of approaches that allow the agent to use low-quality demonstrations in complex vision-based environments.
Our proposed goal-oriented structuring of replay buffer allows the agent to automatically highlight sub-goals for solving complex hierarchical tasks in demonstrations.
The solution based on our algorithm beats all other solutions in the MineRL competition and allows the agent to mine a diamond in the Minecraft environment.
arXiv Detail & Related papers (2020-06-17T15:38:40Z)
- Efficient exploration of zero-sum stochastic games [83.28949556413717]
We investigate the increasingly important and common game-solving setting where we do not have an explicit description of the game but only oracle access to it through gameplay.
During a limited-duration learning phase, the algorithm can control the actions of both players in order to try to learn the game and how to play it well.
Our motivation is to quickly learn strategies that have low exploitability in situations where evaluating the payoffs of a queried strategy profile is costly.
arXiv Detail & Related papers (2020-02-24T20:30:38Z)
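"Exploitability" here is the standard distance-from-equilibrium measure: how much best responses gain against a candidate strategy profile. A quick zero-sum matrix-game illustration follows; in the paper's setting the game is accessible only through costly gameplay queries, so this quantity has to be estimated rather than computed exactly.
```python
# Exploitability of a strategy profile (p, q) in a zero-sum matrix game:
# the sum of what each player's best response gains over the profile's value.
import numpy as np

A = np.array([[ 0, -1,  1],
              [ 1,  0, -1],
              [-1,  1,  0]], dtype=float)  # row player's payoffs

def exploitability(p, q):
    # max over row responses to q, minus min over column responses to p.
    return (A @ q).max() - (p @ A).min()

uniform = np.ones(3) / 3
skewed = np.array([0.8, 0.1, 0.1])
print(exploitability(uniform, uniform))  # 0.0: uniform is the Nash equilibrium
print(exploitability(skewed, uniform))   # 0.7: the column player can exploit p
```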
- Model-Based Reinforcement Learning for Atari [89.3039240303797]
We show how video prediction models can enable agents to solve Atari games with fewer interactions than model-free methods.
Our experiments evaluate SimPLe on a range of Atari games in the low-data regime of 100k interactions between the agent and the environment.
arXiv Detail & Related papers (2019-03-01T15:40:19Z)
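The blurb states the recipe rather than the mechanics. A minimal Dyna-style stand-in (a tabular toy, nothing like SimPLe's video-prediction model) shows the shape of the approach: a small budget of real interactions fits a dynamics model, and the policy is then trained mostly on imagined rollouts.
```python
# Schematic of the model-based recipe SimPLe instantiates (a tabular toy, not
# the paper's video-prediction architecture): spend a small budget of real
# interactions fitting a dynamics model, then do most policy training on
# imagined rollouts from that model.
import random
from collections import defaultdict

def env_step(s, a):
    """Toy chain MDP: states 0..4, actions -1/+1, reward for reaching state 4."""
    s2 = min(max(s + a, 0), 4)
    return s2, 1.0 if s2 == 4 else 0.0

model = {}              # learned (tabular) dynamics: (s, a) -> (s', r)
Q = defaultdict(float)  # state-action values

# 1) Limited real experience, gathered with a random exploratory policy.
s = 0
for _ in range(200):    # stand-in for SimPLe's 100k-interaction budget
    a = random.choice((-1, 1))
    s2, r = env_step(s, a)
    model[(s, a)] = (s2, r)   # deterministic toy world: one sample suffices
    s = 0 if s2 == 4 else s2

# 2) Cheap imagined experience from the model trains the policy (Q-learning).
for _ in range(5000):
    (si, ai) = random.choice(list(model))
    s2, r = model[(si, ai)]
    Q[(si, ai)] += 0.1 * (r + 0.9 * max(Q[(s2, -1)], Q[(s2, 1)]) - Q[(si, ai)])

print({st: max((-1, 1), key=lambda a: Q[(st, a)]) for st in range(5)})
```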
This list is automatically generated from the titles and abstracts of the papers on this site.