Final Adaptation Reinforcement Learning for N-Player Games
- URL: http://arxiv.org/abs/2111.14375v1
- Date: Mon, 29 Nov 2021 08:36:39 GMT
- Title: Final Adaptation Reinforcement Learning for N-Player Games
- Authors: Wolfgang Konen and Samineh Bagheri
- Abstract summary: This paper covers n-tuple-based reinforcement learning (RL) algorithms for games.
We present new algorithms for TD-, SARSA- and Q-learning which work seamlessly on various games with an arbitrary number of players.
We add a new element called Final Adaptation RL (FARL) to all these algorithms.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: This paper covers n-tuple-based reinforcement learning (RL) algorithms for
games. We present new algorithms for TD-, SARSA- and Q-learning which work
seamlessly on various games with an arbitrary number of players. This is achieved
by taking a player-centered view where each player propagates his/her rewards
back to previous rounds. We add a new element called Final Adaptation RL (FARL)
to all these algorithms. Our main contribution is that FARL is a vitally
important ingredient to achieve success with the player-centered view in
various games. We report results on seven board games with 1, 2 and 3 players,
including Othello, ConnectFour and Hex. In most cases it is found that FARL is
important to learn a near-perfect playing strategy. All algorithms are
available in the GBG framework on GitHub.
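To make the player-centered update and the final-adaptation step concrete, here is a minimal tabular TD(0) sketch in Python (the GBG agents themselves are written in Java and use n-tuple value functions rather than a table). The `game` interface used below (`legal_moves`, `afterstate`, `current_player`, `reward`, `final_reward`, `make_move`, `is_over`) and all hyperparameters are illustrative assumptions, not GBG API names.

```python
import random
from collections import defaultdict

class PlayerCentredTD:
    """Tabular TD(0) sketch of a player-centered update scheme with FARL.

    Each player propagates their own reward only to the afterstate they
    created on their previous turn.  The optional Final Adaptation step
    performs one extra update per player at episode end, moving the value
    of their last afterstate toward their final game reward.
    """

    def __init__(self, n_players, alpha=0.1, gamma=1.0, epsilon=0.1):
        self.V = defaultdict(float)   # afterstate -> value (creator's perspective)
        self.n_players = n_players
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def choose(self, game):
        """Epsilon-greedy over the afterstates reachable from the current state."""
        moves = game.legal_moves()
        if random.random() < self.epsilon:
            return random.choice(moves)
        return max(moves, key=lambda m: self.V[game.afterstate(m)])

    def play_episode(self, game, use_farl=True):
        last = [None] * self.n_players          # last afterstate created by each player
        while not game.is_over():
            p = game.current_player()
            move = self.choose(game)
            s_after = game.afterstate(move)
            if last[p] is not None:             # TD update for p's previous afterstate
                target = game.reward(p) + self.gamma * self.V[s_after]
                self.V[last[p]] += self.alpha * (target - self.V[last[p]])
            last[p] = s_after
            game.make_move(move)
        if use_farl:                            # Final Adaptation RL (FARL)
            for p in range(self.n_players):
                if last[p] is not None:
                    delta = game.final_reward(p) - self.V[last[p]]
                    self.V[last[p]] += self.alpha * delta
        return [game.final_reward(p) for p in range(self.n_players)]
```

The intent of the final loop mirrors the paper's FARL idea: each player receives one explicit update from the terminal reward of the finished game, rather than relying on later bootstrapping to carry it back.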
Related papers
- Deep Reinforcement Learning for 5*5 Multiplayer Go [6.222520876209623]
We propose to use and analyze the latest algorithms that use search and Deep Reinforcement Learning (DRL)
We show that using search and DRL we were able to improve the level of play, even though there are more than two players.
arXiv Detail & Related papers (2024-05-23T07:44:24Z) - Neural Population Learning beyond Symmetric Zero-sum Games [52.20454809055356]
We introduce NeuPL-JPSRO, a neural population learning algorithm that benefits from transfer learning of skills and converges to a Coarse Correlated Equilibrium (CCE) of the game.
Our work shows that equilibrium convergent population learning can be implemented at scale and in generality.
arXiv Detail & Related papers (2024-01-10T12:56:24Z) - SPRING: Studying the Paper and Reasoning to Play Games [102.5587155284795]
We propose a novel approach, SPRING, to read the game's original academic paper and use the knowledge learned to reason and play the game through a large language model (LLM)
In experiments, we study the quality of in-context "reasoning" induced by different forms of prompts under the setting of the Crafter open-world environment.
Our experiments suggest that LLMs, when prompted with consistent chain-of-thought, have great potential in completing sophisticated high-level trajectories.
arXiv Detail & Related papers (2023-05-24T18:14:35Z) - Learning in Mean Field Games: A Survey [44.93300994923148]
Mean Field Games (MFGs) rely on a mean-field approximation to allow the number of players to grow to infinity.
We survey the recent literature on Reinforcement Learning methods for learning equilibria and social optima in MFGs.
We present a general framework for classical iterative methods to solve MFGs in an exact way.
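As a toy illustration of one such classical iterative method, here is a minimal fictitious-play sketch for a one-shot congestion-style mean-field game; the cost model, parameters and function name are illustrative assumptions and far simpler than the dynamic settings the survey covers.

```python
import numpy as np

def mfg_fictitious_play(base_cost, congestion, iters=500):
    """Fictitious play for a one-shot mean-field congestion game.

    A continuum of identical players each picks one of K locations.  The cost
    of location k is base_cost[k] + congestion[k] * mu[k], where mu[k] is the
    fraction of the population choosing k.  Fictitious play repeatedly
    best-responds to the running average population distribution and folds
    the best response into that average.
    """
    K = len(base_cost)
    mu = np.ones(K) / K                      # current population distribution
    for t in range(1, iters + 1):
        costs = base_cost + congestion * mu  # cost of each location under mu
        br = np.zeros(K)
        br[np.argmin(costs)] = 1.0           # best response: cheapest location
        mu = mu + (br - mu) / t              # averaging step of fictitious play
    return mu

# Example with two locations: at the mean-field equilibrium the occupied
# locations have equal cost, so mu approaches roughly (0.75, 0.25) here.
print(mfg_fictitious_play(np.array([0.0, 1.0]), np.array([2.0, 2.0])))
```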
arXiv Detail & Related papers (2022-05-25T17:49:37Z) - AlphaZero-Inspired General Board Game Learning and Playing [0.0]
Recently, the seminal algorithms AlphaGo and AlphaZero have started a new era in game learning and deep reinforcement learning.
In this paper, we pick an important element of AlphaZero - the Monte Carlo Tree Search (MCTS) planning stage - and combine it with reinforcement learning (RL) agents.
We apply this new architecture to several complex games (Othello, ConnectFour, Rubik's Cube) and show the advantages achieved with this AlphaZero-inspired MCTS wrapper.
arXiv Detail & Related papers (2022-04-28T07:04:14Z) - Kernelized Multiplicative Weights for 0/1-Polyhedral Games: Bridging the
Gap Between Learning in Extensive-Form and Normal-Form Games [76.21916750766277]
We show that the Optimistic Multiplicative Weights Update (OMWU) algorithm can be simulated on the normal-form equivalent of an EFG in linear time per iteration in the game tree size using a kernel trick.
Specifically, KOMWU is the first algorithm that combines this efficiency with guarantees such as last-iterate convergence.
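For reference, this is what the plain (non-kernelized) OMWU update looks like on an explicitly represented zero-sum matrix game; the point of KOMWU is to obtain this update for the exponentially large normal-form equivalent of an EFG without ever materialising it. The function name and parameters below are illustrative.

```python
import numpy as np

def omwu_zero_sum(A, eta=0.1, iters=2000):
    """Optimistic Multiplicative Weights Update on a zero-sum matrix game.

    Row player maximises x^T A y, column player minimises it.  Both players
    run the usual MWU step, but with the most recent payoff vector counted
    twice and the previous one subtracted (the 'optimistic' prediction).
    """
    m, n = A.shape
    x = np.ones(m) / m
    y = np.ones(n) / n
    gx_prev = np.zeros(m)            # previous payoff vector, row player
    gy_prev = np.zeros(n)            # previous loss vector, column player
    for _ in range(iters):
        gx = A @ y                   # row player's payoffs against current y
        gy = A.T @ x                 # column player's losses against current x
        x = x * np.exp(eta * (2 * gx - gx_prev))
        x /= x.sum()
        y = y * np.exp(-eta * (2 * gy - gy_prev))
        y /= y.sum()
        gx_prev, gy_prev = gx, gy
    return x, y

# Example: matching pennies.  The last iterates approach the uniform Nash
# equilibrium (0.5, 0.5), whereas vanilla MWU would cycle around it.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
print(omwu_zero_sum(A))
```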
arXiv Detail & Related papers (2022-02-01T06:28:51Z) - No-Regret Learning in Time-Varying Zero-Sum Games [99.86860277006318]
Learning from repeated play in a fixed zero-sum game is a classic problem in game theory and online learning.
We develop a single parameter-free algorithm that simultaneously enjoys favorable guarantees under three performance measures.
Our algorithm is based on a two-layer structure with a meta-algorithm learning over a group of black-box base-learners satisfying a certain property.
arXiv Detail & Related papers (2022-01-30T06:10:04Z) - Can Reinforcement Learning Find Stackelberg-Nash Equilibria in
General-Sum Markov Games with Myopic Followers? [156.5760265539888]
We study multi-player general-sum Markov games with one of the players designated as the leader and the other players regarded as followers.
For such a game, our goal is to find a Stackelberg-Nash equilibrium (SNE), which is a policy pair $(\pi^*, \nu^*)$.
We develop sample-efficient reinforcement learning (RL) algorithms for solving for an SNE in both online and offline settings.
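In the simplest possible case, a one-shot bimatrix game with a single myopic follower, the leader/follower structure reduces to a short enumeration. The sketch below only illustrates the solution concept, not the paper's sample-efficient online/offline RL algorithms, and all names are illustrative.

```python
import numpy as np

def stackelberg_myopic(A, B):
    """Stackelberg solution of a bimatrix game with a myopic follower.

    A[a, b] is the leader's payoff and B[a, b] the follower's payoff when the
    leader commits to action a and the follower responds with action b.  The
    myopic follower maximises its immediate payoff; ties are broken in the
    leader's favour (the 'optimistic' convention assumed here).
    """
    best_value, best_pair = -np.inf, None
    for a in range(A.shape[0]):
        # Follower's best responses to the leader's commitment a
        responses = np.flatnonzero(B[a] == B[a].max())
        # Optimistic tie-breaking: the response most favourable to the leader
        b = responses[np.argmax(A[a, responses])]
        if A[a, b] > best_value:
            best_value, best_pair = A[a, b], (a, b)
    return best_pair, best_value

# Leader payoffs A, follower payoffs B (2 leader actions x 2 follower actions).
A = np.array([[2.0, 0.0], [3.0, 1.0]])
B = np.array([[1.0, 0.0], [0.0, 2.0]])
print(stackelberg_myopic(A, B))   # -> ((0, 0), 2.0): the follower best-responds with 0
```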
arXiv Detail & Related papers (2021-12-27T05:41:14Z) - Discovering Multi-Agent Auto-Curricula in Two-Player Zero-Sum Games [31.97631243571394]
We introduce a framework, LMAC, that automates the discovery of the update rule without explicit human design.
Surprisingly, even without human design, the discovered MARL algorithms achieve performance competitive with, or better than, manually designed ones.
We show that LMAC is able to generalise from small games to large games, for example training on Kuhn Poker and outperforming PSRO.
arXiv Detail & Related papers (2021-06-04T22:30:25Z) - Generating Diverse and Competitive Play-Styles for Strategy Games [58.896302717975445]
We propose Portfolio Monte Carlo Tree Search with Progressive Unpruning for playing a turn-based strategy game (Tribes)
We show how it can be parameterized so a quality-diversity algorithm (MAP-Elites) is used to achieve different play-styles while keeping a competitive level of play.
Our results show that this algorithm is capable of achieving these goals even for an extensive collection of game levels beyond those used for training.
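MAP-Elites itself is compact enough to sketch; the toy fitness function and behaviour descriptor below are illustrative assumptions standing in for the play-style descriptors and game outcomes the paper actually uses.

```python
import random

def map_elites(iters=20000, grid=10):
    """Minimal MAP-Elites sketch on a toy problem.

    Solutions are points (x, y) in [0,1]^2; the behaviour descriptor is the
    grid cell containing (x, y), and the fitness rewards points close to the
    centre.  The archive keeps the best solution per cell, yielding a map of
    diverse elites instead of a single optimum -- the same idea used to
    obtain a repertoire of distinct play-styles.
    """
    def fitness(s):
        return -((s[0] - 0.5) ** 2 + (s[1] - 0.5) ** 2)
    def cell(s):
        return (min(int(s[0] * grid), grid - 1), min(int(s[1] * grid), grid - 1))

    archive = {}                                  # cell -> (fitness, solution)
    for _ in range(iters):
        if archive and random.random() < 0.9:     # mutate a random elite ...
            x, y = random.choice(list(archive.values()))[1]
            s = (min(max(x + random.gauss(0, 0.1), 0.0), 1.0),
                 min(max(y + random.gauss(0, 0.1), 0.0), 1.0))
        else:                                     # ... or sample a fresh solution
            s = (random.random(), random.random())
        f, c = fitness(s), cell(s)
        if c not in archive or f > archive[c][0]:
            archive[c] = (f, s)                   # keep the best elite per cell
    return archive

archive = map_elites()
print(len(archive), "cells filled; best fitness:", max(archive.values())[0])
```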
arXiv Detail & Related papers (2021-04-17T20:33:24Z) - Combining Deep Reinforcement Learning and Search for
Imperfect-Information Games [30.520629802135574]
We present ReBeL, a framework for self-play reinforcement learning and search that provably converges to a Nash equilibrium in zero-sum games.
We also show ReBeL achieves superhuman performance in heads-up no-limit Texas hold'em poker, while using far less domain knowledge than any prior poker AI.
arXiv Detail & Related papers (2020-07-27T15:21:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.