AlphaZero Gomoku
- URL: http://arxiv.org/abs/2309.01294v1
- Date: Mon, 4 Sep 2023 00:20:06 GMT
- Title: AlphaZero Gomoku
- Authors: Wen Liang, Chao Yu, Brian Whiteaker, Inyoung Huh, Hua Shao, Youzhi Liang
- Abstract summary: We broaden the use of AlphaZero to Gomoku, an age-old tactical board game also referred to as "Five in a Row."
Our tests demonstrate AlphaZero's versatility in adapting to games other than Go.
- Score: 9.434566356382529
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the past few years, AlphaZero's exceptional capability in mastering
intricate board games has garnered considerable interest. Initially designed
for the game of Go, this revolutionary algorithm merges deep learning
techniques with the Monte Carlo tree search (MCTS) to surpass earlier top-tier
methods. In our study, we broaden the use of AlphaZero to Gomoku, an age-old
tactical board game also referred to as "Five in a Row." Intriguingly, Gomoku
poses an innate challenge: the game is biased toward the first player, who
holds a theoretical advantage. To address this imbalance, we strive for
balanced game-play. Our
tests demonstrate AlphaZero's versatility in adapting to games other than Go.
MCTS has become a predominant algorithm for decision processes in intricate
scenarios, especially board games. MCTS creates a search tree by examining
potential future actions and uses random sampling to predict possible results.
By leveraging the best of both worlds, the AlphaZero technique fuses deep
reinforcement learning with MCTS's balance of exploration and exploitation,
establishing a fresh standard in game-playing AI. Its triumph is notably
evident in board games such as Go, chess, and shogi.
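The abstract's description of AlphaZero, a policy/value network guiding MCTS's trade-off between exploring new moves and exploiting known-good ones, can be made concrete with a short sketch. The code below is a minimal, hypothetical illustration, not the authors' implementation: the names (`five_in_a_row`, `Node`, `puct_select`), the board size, and the constants are assumptions, and a random stub stands in for the trained policy/value network.

```python
# Minimal, hypothetical sketch of AlphaZero-style MCTS for Gomoku.
# All names and constants are illustrative assumptions; a random stub
# replaces the trained policy/value network.
import math
import random

BOARD_SIZE = 9  # assumed board size for illustration


def five_in_a_row(board, player):
    """True if `player` has five consecutive stones in any direction."""
    n = BOARD_SIZE
    for r in range(n):
        for c in range(n):
            if board[r][c] != player:
                continue
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                run, rr, cc = 0, r, c
                while 0 <= rr < n and 0 <= cc < n and board[rr][cc] == player:
                    run += 1
                    if run == 5:
                        return True
                    rr, cc = rr + dr, cc + dc
    return False


class Node:
    """One search-tree node: visit count n, total value w, network prior p."""
    def __init__(self, prior):
        self.n, self.w, self.p = 0, 0.0, prior
        self.children = {}  # move -> Node

    def q(self):  # mean value of the simulations through this node
        return self.w / self.n if self.n else 0.0


def puct_select(node, c_puct=1.5):
    """PUCT rule: argmax of Q(s,a) + c * P(a) * sqrt(N) / (1 + N_a),
    trading off exploitation (Q) against prior-guided exploration."""
    sqrt_total = math.sqrt(node.n + 1)
    return max(node.children.items(),
               key=lambda kv: kv[1].q() + c_puct * kv[1].p * sqrt_total / (1 + kv[1].n))


if __name__ == "__main__":
    board = [[0] * BOARD_SIZE for _ in range(BOARD_SIZE)]
    board[4] = [0, 0, 1, 1, 1, 1, 1, 0, 0]  # five stones in row 4
    print("five in a row:", five_in_a_row(board, 1))

    moves = [(r, c) for r in range(BOARD_SIZE) for c in range(BOARD_SIZE)]
    root = Node(prior=1.0)
    for m in moves:  # uniform stub priors; a real policy net would supply these
        root.children[m] = Node(prior=1.0 / len(moves))
    for _ in range(200):  # select a child, back up a stub value estimate
        move, child = puct_select(root)
        value = random.uniform(-1.0, 1.0)  # stand-in for the value network
        child.n, child.w = child.n + 1, child.w + value
        root.n += 1
    print("most-visited move:", max(root.children.items(), key=lambda kv: kv[1].n)[0])
```

In the full algorithm the backed-up value comes from the network's evaluation of the leaf rather than a random stub, and finished self-play games are used to retrain the network, closing the loop that the abstract describes.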
Related papers
- MiniZero: Comparative Analysis of AlphaZero and MuZero on Go, Othello, and Atari Games [9.339645051415115]
MiniZero is a zero-knowledge learning framework that supports four state-of-the-art algorithms.
We evaluate the performance of each algorithm in two board games, 9x9 Go and 8x8 Othello, as well as 57 Atari games.
arXiv Detail & Related papers (2023-10-17T14:29:25Z)
- Targeted Search Control in AlphaZero for Effective Policy Improvement [93.30151539224144]
We introduce Go-Exploit, a novel search control strategy for AlphaZero.
Go-Exploit samples the start state of its self-play trajectories from an archive of states of interest.
Go-Exploit learns with a greater sample efficiency than standard AlphaZero.
arXiv Detail & Related papers (2023-02-23T22:50:24Z)
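The Go-Exploit summary above centers on one mechanism: starting self-play from states drawn from an archive of interesting states rather than always from the initial position. Below is a minimal, hypothetical sketch of that idea; `StateArchive`, its capacity, and the sampling probability are illustrative assumptions, not the paper's actual design.

```python
# Hypothetical sketch of archive-based search control in the spirit of
# Go-Exploit: sample self-play start states from stored "states of interest".
import random


class StateArchive:
    def __init__(self, capacity=10_000):
        self.states, self.capacity = [], capacity

    def add(self, state):
        """Store a visited state, evicting a random entry when full."""
        if len(self.states) >= self.capacity:
            self.states.pop(random.randrange(len(self.states)))
        self.states.append(state)

    def sample_start(self, initial_state, p_archive=0.8):
        """With probability p_archive, resume self-play from an archived
        state; otherwise fall back to the usual initial position."""
        if self.states and random.random() < p_archive:
            return random.choice(self.states)
        return initial_state
```

Reusing informative mid-game states this way concentrates training on positions the agent still needs to learn, which is one intuition for the sample-efficiency gain the summary reports.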
- Are AlphaZero-like Agents Robust to Adversarial Perturbations? [73.13944217915089]
AlphaZero (AZ) has demonstrated that neural-network-based Go AIs can surpass human performance by a large margin.
We ask whether adversarial states exist for Go AIs that may lead them to play surprisingly wrong actions.
We develop the first adversarial attack on Go AIs that can efficiently search for adversarial states by strategically reducing the search space.
arXiv Detail & Related papers (2022-11-07T18:43:25Z)
- Exploring Adaptive MCTS with TD Learning in miniXCOM [0.0]
In this work, we explore on-line adaptivity in Monte Carlo tree search (MCTS) without requiring pre-training.
We present MCTS-TD, an adaptive MCTS algorithm improved with temporal difference learning.
We demonstrate our new approach on miniXCOM, a simplified version of XCOM, a popular commercial franchise consisting of several turn-based tactical games.
arXiv Detail & Related papers (2022-10-10T21:04:25Z)
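The MCTS-TD entry above combines tree search with temporal-difference learning so the leaf evaluator improves during play, with no pre-training. As a minimal, hypothetical sketch of the TD side only, the tabular TD(0) update below could back observed rewards into a value table that the search then uses to score leaves; the function name and hyperparameters are assumptions, not the paper's implementation.

```python
# Hypothetical tabular TD(0) update: the online learning signal an adaptive
# MCTS variant could use to re-score leaves as play progresses.
def td_update(values, state, next_state, reward, alpha=0.1, gamma=0.99):
    """V(s) <- V(s) + alpha * (reward + gamma * V(s') - V(s))."""
    v_s = values.get(state, 0.0)
    v_next = values.get(next_state, 0.0)
    values[state] = v_s + alpha * (reward + gamma * v_next - v_s)
    return values[state]


# Example: one observed transition nudges V("s0") toward its TD target.
values = {}
td_update(values, "s0", "s1", reward=1.0)  # V("s0") becomes 0.1
```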
- An AlphaZero-Inspired Approach to Solving Search Problems [63.24965775030674]
We adapt the methods and techniques used in AlphaZero for solving search problems.
We describe possible representations in terms of easy-instance solvers and self-reductions.
We also describe a version of Monte Carlo tree search adapted for search problems.
arXiv Detail & Related papers (2022-07-02T23:39:45Z)
- AlphaZero-Inspired General Board Game Learning and Playing [0.0]
Recently, the seminal algorithms AlphaGo and AlphaZero have started a new era in game learning and deep reinforcement learning.
In this paper, we pick an important element of AlphaZero - the Monte Carlo Tree Search (MCTS) planning stage - and combine it with reinforcement learning (RL) agents.
We apply this new architecture to several complex games (Othello, ConnectFour, Rubik's Cube) and show the advantages achieved with this AlphaZero-inspired MCTS wrapper.
arXiv Detail & Related papers (2022-04-28T07:04:14Z)
- Generating Diverse and Competitive Play-Styles for Strategy Games [58.896302717975445]
We propose Portfolio Monte Carlo Tree Search with Progressive Unpruning for playing a turn-based strategy game (Tribes).
We show how it can be parameterized so a quality-diversity algorithm (MAP-Elites) is used to achieve different play-styles while keeping a competitive level of play.
Our results show that this algorithm is capable of achieving these goals even for an extensive collection of game levels beyond those used for training.
arXiv Detail & Related papers (2021-04-17T20:33:24Z)
- Combining Off and On-Policy Training in Model-Based Reinforcement Learning [77.34726150561087]
We propose a way to obtain off-policy targets using data from simulated games in MuZero.
Our results show that these targets speed up the training process and lead to faster convergence and higher rewards.
arXiv Detail & Related papers (2021-02-24T10:47:26Z)
- Mastering Terra Mystica: Applying Self-Play to Multi-agent Cooperative Board Games [0.0]
In this paper, we explore and compare multiple algorithms for solving the complex strategy game of Terra Mystica.
We apply these breakthroughs to a novel state-representation of TM with the goal of creating an AI that will rival human players.
In the end, we discuss the success and shortcomings of this method by comparing against multiple baselines and typical human scores.
arXiv Detail & Related papers (2021-02-21T07:53:34Z)
- Suphx: Mastering Mahjong with Deep Reinforcement Learning [114.68233321904623]
We design an AI for Mahjong, named Suphx, based on deep reinforcement learning with some newly introduced techniques.
Suphx has demonstrated stronger performance than most top human players in terms of stable rank.
This is the first time that a computer program outperforms most top human players in Mahjong.
arXiv Detail & Related papers (2020-03-30T16:18:16Z)