Polygames: Improved Zero Learning
- URL: http://arxiv.org/abs/2001.09832v1
- Date: Mon, 27 Jan 2020 14:49:49 GMT
- Title: Polygames: Improved Zero Learning
- Authors: Tristan Cazenave, Yen-Chi Chen, Guan-Wei Chen, Shi-Yu Chen, Xian-Dong
Chiu, Julien Dehos, Maria Elsa, Qucheng Gong, Hengyuan Hu, Vasil Khalidov,
Cheng-Ling Li, Hsin-I Lin, Yu-Jin Lin, Xavier Martinet, Vegard Mella, Jeremy
Rapin, Baptiste Roziere, Gabriel Synnaeve, Fabien Teytaud, Olivier Teytaud,
Shi-Cheng Ye, Yi-Jun Ye, Shi-Jim Yen, Sergey Zagoruyko
- Abstract summary: Since DeepMind's AlphaZero, Zero learning quickly became the state-of-the-art method for many board games.
We release Polygames, our framework for Zero learning, with its library of games and its checkpoints.
We won against strong humans at the game of Hex on a 19x19 board, which was often said to be intractable for zero learning.
- Score: 21.114734326593002
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Since DeepMind's AlphaZero, Zero learning quickly became the state-of-the-art
method for many board games. It can be improved using a fully convolutional
structure (no fully connected layer). Using such an architecture plus global
pooling, we can create bots independent of the board size. The training can be
made more robust by keeping track of the best checkpoints during the training
and by training against them. Using these features, we release Polygames, our
framework for Zero learning, with its library of games and its checkpoints. We
won against strong humans at the game of Hex on a 19x19 board, which was often
said to be intractable for zero learning; and in Havannah. We also won several first
places at the TAAI competitions.
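To make the two architectural ideas in the abstract concrete (a fully convolutional network with no fully connected layer, and global pooling so a single set of weights applies to any board size), here is a minimal sketch. It is not the Polygames implementation: it assumes PyTorch, and the class name FullyConvZeroNet, channel counts, layer depth, and input encoding are illustrative assumptions only.

```python
import torch
import torch.nn as nn

class FullyConvZeroNet(nn.Module):
    """Minimal policy/value net sketch with no fully connected layer."""

    def __init__(self, in_channels: int, hidden: int = 64):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(in_channels, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(hidden, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # Policy head: a 1x1 convolution yields one logit per board cell,
        # so the number of move logits follows the input board size.
        self.policy_head = nn.Conv2d(hidden, 1, kernel_size=1)
        # Value head: per-cell value contributions are combined by global
        # pooling, so the scalar value is board-size independent as well.
        self.value_head = nn.Sequential(
            nn.Conv2d(hidden, 1, kernel_size=1),
            nn.AdaptiveAvgPool2d(1),   # global pooling over the board
            nn.Flatten(),
            nn.Tanh(),
        )

    def forward(self, board: torch.Tensor):
        h = self.trunk(board)
        policy_logits = self.policy_head(h).flatten(1)  # one logit per cell
        value = self.value_head(h).squeeze(-1)
        return policy_logits, value

# The same weights can be applied to boards of different sizes:
net = FullyConvZeroNet(in_channels=3)
for size in (13, 19):
    logits, value = net(torch.zeros(1, 3, size, size))
    print(size, logits.shape, value.shape)  # (1, size*size) and (1,)
```

Because the policy head is a 1x1 convolution and the value head collapses its spatial input by global pooling, a checkpoint trained on one board size can in principle be evaluated on another, which is the board-size independence the abstract claims. The abstract's other ingredient, keeping the best checkpoints during training and training against them, would sit on top of a network like this and is not shown here.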
Related papers
- Neural Population Learning beyond Symmetric Zero-sum Games [52.20454809055356]
We introduce NeuPL-JPSRO, a neural population learning algorithm that benefits from transfer learning of skills and converges to a Coarse Correlated Equilibrium (CCE) of the game.
Our work shows that equilibrium convergent population learning can be implemented at scale and in generality.
arXiv Detail & Related papers (2024-01-10T12:56:24Z)
- MiniZero: Comparative Analysis of AlphaZero and MuZero on Go, Othello, and Atari Games [9.339645051415115]
MiniZero is a zero-knowledge learning framework that supports four state-of-the-art algorithms.
We evaluate the performance of each algorithm in two board games, 9x9 Go and 8x8 Othello, as well as 57 Atari games.
arXiv Detail & Related papers (2023-10-17T14:29:25Z)
- Accelerate Multi-Agent Reinforcement Learning in Zero-Sum Games with Subgame Curriculum Learning [65.36326734799587]
We present a novel subgame curriculum learning framework for zero-sum games.
It adopts an adaptive initial state distribution by resetting agents to some previously visited states.
We derive a subgame selection metric that approximates the squared distance to NE values.
arXiv Detail & Related papers (2023-10-07T13:09:37Z)
- AlphaZero Gomoku [9.434566356382529]
We broaden the use of AlphaZero to Gomoku, an age-old tactical board game also referred to as "Five in a Row".
Our tests demonstrate AlphaZero's versatility in adapting to games other than Go.
arXiv Detail & Related papers (2023-09-04T00:20:06Z)
- Targeted Search Control in AlphaZero for Effective Policy Improvement [93.30151539224144]
We introduce Go-Exploit, a novel search control strategy for AlphaZero.
Go-Exploit samples the start state of its self-play trajectories from an archive of states of interest.
Go-Exploit learns with a greater sample efficiency than standard AlphaZero.
arXiv Detail & Related papers (2023-02-23T22:50:24Z)
- DanZero: Mastering GuanDan Game with Reinforcement Learning [121.93690719186412]
Card game AI has long been a hot topic in artificial intelligence research.
In this paper, we develop an AI program for a more complex card game, GuanDan.
We propose DanZero, the first AI program for GuanDan, built using reinforcement learning techniques.
arXiv Detail & Related papers (2022-10-31T06:29:08Z)
- AlphaZero-Inspired General Board Game Learning and Playing [0.0]
Recently, the seminal algorithms AlphaGo and AlphaZero have started a new era in game learning and deep reinforcement learning.
In this paper, we pick an important element of AlphaZero - the Monte Carlo Tree Search (MCTS) planning stage - and combine it with reinforcement learning (RL) agents.
We apply this new architecture to several complex games (Othello, ConnectFour, Rubik's Cube) and show the advantages achieved with this AlphaZero-inspired MCTS wrapper.
arXiv Detail & Related papers (2022-04-28T07:04:14Z)
- Train on Small, Play the Large: Scaling Up Board Games with AlphaZero and GNN [23.854093182195246]
Playing board games is considered a major challenge for both humans and AI researchers.
In this work, we look at the board as a graph and combine a graph neural network architecture inside the AlphaZero framework.
Our model can be trained quickly to play different challenging board games on multiple board sizes, without using any domain knowledge.
arXiv Detail & Related papers (2021-07-18T08:36:00Z)
- DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning [65.00325925262948]
We propose a conceptually simple yet effective DouDizhu AI system, namely DouZero.
DouZero enhances traditional Monte-Carlo methods with deep neural networks, action encoding, and parallel actors.
It ranked first on the Botzone leaderboard among 344 AI agents.
arXiv Detail & Related papers (2021-06-11T02:45:51Z)
- Combining Off and On-Policy Training in Model-Based Reinforcement Learning [77.34726150561087]
We propose a way to obtain off-policy targets using data from simulated games in MuZero.
Our results show that these targets speed up the training process and lead to faster convergence and higher rewards.
arXiv Detail & Related papers (2021-02-24T10:47:26Z)