Polygames: Improved Zero Learning
- URL: http://arxiv.org/abs/2001.09832v1
- Date: Mon, 27 Jan 2020 14:49:49 GMT
- Title: Polygames: Improved Zero Learning
- Authors: Tristan Cazenave, Yen-Chi Chen, Guan-Wei Chen, Shi-Yu Chen, Xian-Dong
Chiu, Julien Dehos, Maria Elsa, Qucheng Gong, Hengyuan Hu, Vasil Khalidov,
Cheng-Ling Li, Hsin-I Lin, Yu-Jin Lin, Xavier Martinet, Vegard Mella, Jeremy
Rapin, Baptiste Roziere, Gabriel Synnaeve, Fabien Teytaud, Olivier Teytaud,
Shi-Cheng Ye, Yi-Jun Ye, Shi-Jim Yen, Sergey Zagoruyko
- Abstract summary: Since DeepMind's AlphaZero, Zero learning quickly became the state-of-the-art method for many board games.
We release Polygames, our framework for Zero learning, with its library of games and its checkpoints.
We won against strong humans at the game of Hex on a 19x19 board, which was often said to be intractable for zero learning.
- Score: 21.114734326593002
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Since DeepMind's AlphaZero, Zero learning quickly became the state-of-the-art
method for many board games. It can be improved using a fully convolutional
structure (no fully connected layer). Using such an architecture plus global
pooling, we can create bots independent of the board size. The training can be
made more robust by keeping track of the best checkpoints during the training
and by training against them. Using these features, we release Polygames, our
framework for Zero learning, with its library of games and its checkpoints. We
won against strong humans at the game of Hex on a 19x19 board, which was often
said to be intractable for zero learning; and in Havannah. We also won several first
places at the TAAI competitions.
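To make the two architectural ideas in the abstract concrete (a fully convolutional network with no fully connected layer, and global pooling so a single set of weights applies to any board size), here is a minimal sketch. It is not the Polygames implementation: it assumes PyTorch, and the class name FullyConvZeroNet, channel counts, layer depth, and input encoding are illustrative assumptions only.

```python
import torch
import torch.nn as nn

class FullyConvZeroNet(nn.Module):
    """Minimal policy/value net sketch with no fully connected layer."""

    def __init__(self, in_channels: int, hidden: int = 64):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(in_channels, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(hidden, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # Policy head: a 1x1 convolution yields one logit per board cell,
        # so the number of move logits follows the input board size.
        self.policy_head = nn.Conv2d(hidden, 1, kernel_size=1)
        # Value head: per-cell value contributions are combined by global
        # pooling, so the scalar value is board-size independent as well.
        self.value_head = nn.Sequential(
            nn.Conv2d(hidden, 1, kernel_size=1),
            nn.AdaptiveAvgPool2d(1),   # global pooling over the board
            nn.Flatten(),
            nn.Tanh(),
        )

    def forward(self, board: torch.Tensor):
        h = self.trunk(board)
        policy_logits = self.policy_head(h).flatten(1)  # one logit per cell
        value = self.value_head(h).squeeze(-1)
        return policy_logits, value

# The same weights can be applied to boards of different sizes:
net = FullyConvZeroNet(in_channels=3)
for size in (13, 19):
    logits, value = net(torch.zeros(1, 3, size, size))
    print(size, logits.shape, value.shape)  # (1, size*size) and (1,)
```

Because the policy head is a 1x1 convolution and the value head collapses its spatial input by global pooling, a checkpoint trained on one board size can in principle be evaluated on another, which is the board-size independence the abstract claims. The abstract's other ingredient, keeping the best checkpoints during training and training against them, would sit on top of a network like this and is not shown here.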
Related papers
- Neural Population Learning beyond Symmetric Zero-sum Games [52.20454809055356]
We introduce NeuPL-JPSRO, a neural population learning algorithm that benefits from transfer learning of skills and converges to a Coarse Correlated Equilibrium (CCE) of the game.
Our work shows that equilibrium convergent population learning can be implemented at scale and in generality.
arXiv Detail & Related papers (2024-01-10T12:56:24Z)
- MiniZero: Comparative Analysis of AlphaZero and MuZero on Go, Othello, and Atari Games [9.339645051415115]
MiniZero is a zero-knowledge learning framework that supports four state-of-the-art algorithms.
We evaluate the performance of each algorithm in two board games, 9x9 Go and 8x8 Othello, as well as 57 Atari games.
arXiv Detail & Related papers (2023-10-17T14:29:25Z)
- Accelerate Multi-Agent Reinforcement Learning in Zero-Sum Games with Subgame Curriculum Learning [65.36326734799587]
We present a novel subgame curriculum learning framework for zero-sum games.
It adopts an adaptive initial state distribution by resetting agents to some previously visited states.
We derive a subgame selection metric that approximates the squared distance to NE values.
arXiv Detail & Related papers (2023-10-07T13:09:37Z)
- AlphaZero Gomoku [9.434566356382529]
We broaden the use of AlphaZero to Gomoku, an age-old tactical board game also referred to as "Five in a Row".
Our tests demonstrate AlphaZero's versatility in adapting to games other than Go.
arXiv Detail & Related papers (2023-09-04T00:20:06Z)
- Targeted Search Control in AlphaZero for Effective Policy Improvement [93.30151539224144]
We introduce Go-Exploit, a novel search control strategy for AlphaZero.
Go-Exploit samples the start state of its self-play trajectories from an archive of states of interest.
Go-Exploit learns with a greater sample efficiency than standard AlphaZero.
arXiv Detail & Related papers (2023-02-23T22:50:24Z)
- DanZero: Mastering GuanDan Game with Reinforcement Learning [121.93690719186412]
Card game AI has long been a hot topic in artificial intelligence research.
In this paper, we develop an AI program for a more complex card game, GuanDan.
We propose DanZero, the first AI program for GuanDan, built using reinforcement learning techniques.
arXiv Detail & Related papers (2022-10-31T06:29:08Z)
- AlphaZero-Inspired General Board Game Learning and Playing [0.0]
Recently, the seminal algorithms AlphaGo and AlphaZero have started a new era in game learning and deep reinforcement learning.
In this paper, we pick an important element of AlphaZero - the Monte Carlo Tree Search (MCTS) planning stage - and combine it with reinforcement learning (RL) agents.
We apply this new architecture to several complex games (Othello, ConnectFour, Rubik's Cube) and show the advantages achieved with this AlphaZero-inspired MCTS wrapper.
arXiv Detail & Related papers (2022-04-28T07:04:14Z)
- Train on Small, Play the Large: Scaling Up Board Games with AlphaZero and GNN [23.854093182195246]
Playing board games is considered a major challenge for both humans and AI researchers.
In this work, we look at the board as a graph and combine a graph neural network architecture inside the AlphaZero framework.
Our model can be trained quickly to play different challenging board games on multiple board sizes, without using any domain knowledge.
arXiv Detail & Related papers (2021-07-18T08:36:00Z)
- DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning [65.00325925262948]
We propose a conceptually simple yet effective DouDizhu AI system, namely DouZero.
DouZero enhances traditional Monte-Carlo methods with deep neural networks, action encoding, and parallel actors.
It ranked first on the Botzone leaderboard among 344 AI agents.
arXiv Detail & Related papers (2021-06-11T02:45:51Z)
- Combining Off and On-Policy Training in Model-Based Reinforcement Learning [77.34726150561087]
We propose a way to obtain off-policy targets using data from simulated games in MuZero.
Our results show that these targets speed up the training process and lead to faster convergence and higher rewards.
arXiv Detail & Related papers (2021-02-24T10:47:26Z)