AlphaZero Gomoku
- URL: http://arxiv.org/abs/2309.01294v1
- Date: Mon, 4 Sep 2023 00:20:06 GMT
- Title: AlphaZero Gomoku
- Authors: Wen Liang, Chao Yu, Brian Whiteaker, Inyoung Huh, Hua Shao, Youzhi Liang
- Abstract summary: We broaden the use of AlphaZero to Gomoku, an age-old tactical board game also referred to as "Five in a Row."
Our tests demonstrate AlphaZero's versatility in adapting to games other than Go.
- Score: 9.434566356382529
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the past few years, AlphaZero's exceptional capability in mastering
intricate board games has garnered considerable interest. Initially designed
for the game of Go, this revolutionary algorithm merges deep learning
techniques with the Monte Carlo tree search (MCTS) to surpass earlier top-tier
methods. In our study, we broaden the use of AlphaZero to Gomoku, an age-old
tactical board game also referred to as "Five in a Row." Intriguingly, Gomoku
poses an innate challenge: the game is biased toward the first player, who
holds a theoretical advantage. To address this imbalance, we strive for
balanced game-play. Our
tests demonstrate AlphaZero's versatility in adapting to games other than Go.
MCTS has become a predominant algorithm for decision processes in intricate
scenarios, especially board games. MCTS creates a search tree by examining
potential future actions and uses random sampling to predict possible results.
By leveraging the best of both worlds, the AlphaZero technique fuses deep
reinforcement learning with MCTS's balance of exploration and exploitation,
establishing a fresh standard in game-playing AI. Its triumph is notably
evident in board games such as Go, chess, and shogi.
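The abstract's description of AlphaZero, a policy/value network guiding MCTS's trade-off between exploring new moves and exploiting known-good ones, can be made concrete with a short sketch. The code below is a minimal, hypothetical illustration, not the authors' implementation: the names (`five_in_a_row`, `Node`, `puct_select`), the board size, and the constants are assumptions, and a random stub stands in for the trained policy/value network.

```python
# Minimal, hypothetical sketch of AlphaZero-style MCTS for Gomoku.
# All names and constants are illustrative assumptions; a random stub
# replaces the trained policy/value network.
import math
import random

BOARD_SIZE = 9  # assumed board size for illustration


def five_in_a_row(board, player):
    """True if `player` has five consecutive stones in any direction."""
    n = BOARD_SIZE
    for r in range(n):
        for c in range(n):
            if board[r][c] != player:
                continue
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                run, rr, cc = 0, r, c
                while 0 <= rr < n and 0 <= cc < n and board[rr][cc] == player:
                    run += 1
                    if run == 5:
                        return True
                    rr, cc = rr + dr, cc + dc
    return False


class Node:
    """One search-tree node: visit count n, total value w, network prior p."""
    def __init__(self, prior):
        self.n, self.w, self.p = 0, 0.0, prior
        self.children = {}  # move -> Node

    def q(self):  # mean value of the simulations through this node
        return self.w / self.n if self.n else 0.0


def puct_select(node, c_puct=1.5):
    """PUCT rule: argmax of Q(s,a) + c * P(a) * sqrt(N) / (1 + N_a),
    trading off exploitation (Q) against prior-guided exploration."""
    sqrt_total = math.sqrt(node.n + 1)
    return max(node.children.items(),
               key=lambda kv: kv[1].q() + c_puct * kv[1].p * sqrt_total / (1 + kv[1].n))


if __name__ == "__main__":
    board = [[0] * BOARD_SIZE for _ in range(BOARD_SIZE)]
    board[4] = [0, 0, 1, 1, 1, 1, 1, 0, 0]  # five stones in row 4
    print("five in a row:", five_in_a_row(board, 1))

    moves = [(r, c) for r in range(BOARD_SIZE) for c in range(BOARD_SIZE)]
    root = Node(prior=1.0)
    for m in moves:  # uniform stub priors; a real policy net would supply these
        root.children[m] = Node(prior=1.0 / len(moves))
    for _ in range(200):  # select a child, back up a stub value estimate
        move, child = puct_select(root)
        value = random.uniform(-1.0, 1.0)  # stand-in for the value network
        child.n, child.w = child.n + 1, child.w + value
        root.n += 1
    print("most-visited move:", max(root.children.items(), key=lambda kv: kv[1].n)[0])
```

In the full algorithm the backed-up value comes from the network's evaluation of the leaf rather than a random stub, and finished self-play games are used to retrain the network, closing the loop that the abstract describes.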
Related papers
- MiniZero: Comparative Analysis of AlphaZero and MuZero on Go, Othello, and Atari Games [9.339645051415115]
MiniZero is a zero-knowledge learning framework that supports four state-of-the-art algorithms.
We evaluate the performance of each algorithm in two board games, 9x9 Go and 8x8 Othello, as well as 57 Atari games.
arXiv Detail & Related papers (2023-10-17T14:29:25Z)
- Targeted Search Control in AlphaZero for Effective Policy Improvement [93.30151539224144]
We introduce Go-Exploit, a novel search control strategy for AlphaZero.
Go-Exploit samples the start state of its self-play trajectories from an archive of states of interest.
Go-Exploit learns with a greater sample efficiency than standard AlphaZero.
arXiv Detail & Related papers (2023-02-23T22:50:24Z)
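The Go-Exploit summary above centers on one mechanism: starting self-play from states drawn from an archive of interesting states rather than always from the initial position. Below is a minimal, hypothetical sketch of that idea; `StateArchive`, its capacity, and the sampling probability are illustrative assumptions, not the paper's actual design.

```python
# Hypothetical sketch of archive-based search control in the spirit of
# Go-Exploit: sample self-play start states from stored "states of interest".
import random


class StateArchive:
    def __init__(self, capacity=10_000):
        self.states, self.capacity = [], capacity

    def add(self, state):
        """Store a visited state, evicting a random entry when full."""
        if len(self.states) >= self.capacity:
            self.states.pop(random.randrange(len(self.states)))
        self.states.append(state)

    def sample_start(self, initial_state, p_archive=0.8):
        """With probability p_archive, resume self-play from an archived
        state; otherwise fall back to the usual initial position."""
        if self.states and random.random() < p_archive:
            return random.choice(self.states)
        return initial_state
```

Reusing informative mid-game states this way concentrates training on positions the agent still needs to learn, which is one intuition for the sample-efficiency gain the summary reports.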
- Are AlphaZero-like Agents Robust to Adversarial Perturbations? [73.13944217915089]
AlphaZero (AZ) has demonstrated that neural-network-based Go AIs can surpass human performance by a large margin.
We ask whether adversarial states exist for Go AIs that may lead them to play surprisingly wrong actions.
We develop the first adversarial attack on Go AIs that can efficiently search for adversarial states by strategically reducing the search space.
arXiv Detail & Related papers (2022-11-07T18:43:25Z)
- Exploring Adaptive MCTS with TD Learning in miniXCOM [0.0]
In this work, we explore on-line adaptivity in Monte Carlo tree search (MCTS) without requiring pre-training.
We present MCTS-TD, an adaptive MCTS algorithm improved with temporal difference learning.
We demonstrate our new approach on miniXCOM, a simplified version of XCOM, a popular commercial franchise consisting of several turn-based tactical games.
arXiv Detail & Related papers (2022-10-10T21:04:25Z)
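The MCTS-TD entry above combines tree search with temporal-difference learning so the leaf evaluator improves during play, with no pre-training. As a minimal, hypothetical sketch of the TD side only, the tabular TD(0) update below could back observed rewards into a value table that the search then uses to score leaves; the function name and hyperparameters are assumptions, not the paper's implementation.

```python
# Hypothetical tabular TD(0) update: the online learning signal an adaptive
# MCTS variant could use to re-score leaves as play progresses.
def td_update(values, state, next_state, reward, alpha=0.1, gamma=0.99):
    """V(s) <- V(s) + alpha * (reward + gamma * V(s') - V(s))."""
    v_s = values.get(state, 0.0)
    v_next = values.get(next_state, 0.0)
    values[state] = v_s + alpha * (reward + gamma * v_next - v_s)
    return values[state]


# Example: one observed transition nudges V("s0") toward its TD target.
values = {}
td_update(values, "s0", "s1", reward=1.0)  # V("s0") becomes 0.1
```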
- An AlphaZero-Inspired Approach to Solving Search Problems [63.24965775030674]
We adapt the methods and techniques used in AlphaZero for solving search problems.
We describe possible representations in terms of easy-instance solvers and self-reductions.
We also describe a version of Monte Carlo tree search adapted for search problems.
arXiv Detail & Related papers (2022-07-02T23:39:45Z)
- AlphaZero-Inspired General Board Game Learning and Playing [0.0]
Recently, the seminal algorithms AlphaGo and AlphaZero have started a new era in game learning and deep reinforcement learning.
In this paper, we pick an important element of AlphaZero - the Monte Carlo Tree Search (MCTS) planning stage - and combine it with reinforcement learning (RL) agents.
We apply this new architecture to several complex games (Othello, ConnectFour, Rubik's Cube) and show the advantages achieved with this AlphaZero-inspired MCTS wrapper.
arXiv Detail & Related papers (2022-04-28T07:04:14Z)
- Generating Diverse and Competitive Play-Styles for Strategy Games [58.896302717975445]
We propose Portfolio Monte Carlo Tree Search with Progressive Unpruning for playing a turn-based strategy game (Tribes).
We show how it can be parameterized so a quality-diversity algorithm (MAP-Elites) is used to achieve different play-styles while keeping a competitive level of play.
Our results show that this algorithm is capable of achieving these goals even for an extensive collection of game levels beyond those used for training.
arXiv Detail & Related papers (2021-04-17T20:33:24Z)
- Combining Off and On-Policy Training in Model-Based Reinforcement Learning [77.34726150561087]
We propose a way to obtain off-policy targets using data from simulated games in MuZero.
Our results show that these targets speed up the training process and lead to faster convergence and higher rewards.
arXiv Detail & Related papers (2021-02-24T10:47:26Z)
- Mastering Terra Mystica: Applying Self-Play to Multi-agent Cooperative Board Games [0.0]
In this paper, we explore and compare multiple algorithms for solving the complex strategy game of Terra Mystica.
We apply these breakthroughs to a novel state-representation of TM with the goal of creating an AI that will rival human players.
In the end, we discuss the success and shortcomings of this method by comparing against multiple baselines and typical human scores.
arXiv Detail & Related papers (2021-02-21T07:53:34Z)
- Suphx: Mastering Mahjong with Deep Reinforcement Learning [114.68233321904623]
We design an AI for Mahjong, named Suphx, based on deep reinforcement learning with some newly introduced techniques.
Suphx has demonstrated stronger performance than most top human players in terms of stable rank.
This is the first time that a computer program outperforms most top human players in Mahjong.
arXiv Detail & Related papers (2020-03-30T16:18:16Z)