On Efficient Reinforcement Learning for Full-length Game of StarCraft II
- URL: http://arxiv.org/abs/2209.11553v1
- Date: Fri, 23 Sep 2022 12:24:21 GMT
- Title: On Efficient Reinforcement Learning for Full-length Game of StarCraft II
- Authors: Ruo-Ze Liu, Zhen-Jia Pang, Zhou-Yu Meng, Wenhai Wang, Yang Yu, Tong Lu
- Abstract summary: We investigate a hierarchical RL approach involving extracted macro-actions and a hierarchical architecture of neural networks.
On a 64x64 map and using restrictive units, we achieve a win rate of 99% against the level-1 built-in AI.
We improve our architecture to train the agent against the cheating-level AIs and achieve win rates of 96%, 97%, and 94% against the level-8, level-9, and level-10 AIs, respectively.
- Score: 21.768578136029987
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: StarCraft II (SC2) poses a grand challenge for reinforcement learning (RL),
of which the main difficulties include huge state space, varying action space,
and a long time horizon. In this work, we investigate a set of RL techniques
for the full-length game of StarCraft II: a hierarchical RL approach involving
extracted macro-actions and a hierarchical architecture of neural networks, and
a curriculum transfer training procedure that lets us train the agent on a
single machine with 4 GPUs and 48 CPU threads. On a 64x64
map and using restrictive units, we achieve a win rate of 99% against the
level-1 built-in AI. Through the curriculum transfer learning algorithm and a
mixture of combat models, we achieve a 93% win rate against the most difficult
non-cheating level built-in AI (level-7). In this extended version of the
paper, we improve our architecture to train the agent against the cheating-level
AIs and achieve win rates of 96%, 97%, and 94% against the level-8, level-9,
and level-10 AIs, respectively. Our code is at
https://github.com/liuruoze/HierNet-SC2. To provide an AlphaStar-style baseline
for our work as well as for the research and open-source community, we
reproduce a scaled-down version of it, mini-AlphaStar (mAS). The latest version
of mAS is 1.07; it can be trained on the raw action space, which has 564
actions, and is designed to run training on a single commodity machine by
making the hyper-parameters adjustable. We then compare our work with mAS using the
same resources and show that our method is more effective. The code of
mini-AlphaStar is at https://github.com/liuruoze/mini-AlphaStar. We hope our
study sheds some light on future research into efficient reinforcement
learning for SC2 and other large-scale games.
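As a rough illustration of the hierarchical scheme described in the abstract, the sketch below shows a top-level controller choosing among a small set of extracted macro-actions, with a per-macro sub-policy expanding that choice into low-level actions. The macro-action list, network sizes, and class names (ControllerPolicy, SubPolicy) are illustrative assumptions and not the authors' implementation; see the HierNet-SC2 repository for the actual architecture.

```python
# Minimal sketch of a hierarchical policy over extracted macro-actions.
# All names, sizes, and the macro-action set below are illustrative
# assumptions; they do not reproduce the paper's exact design.
import torch
import torch.nn as nn

# Hypothetical macro-action set (the paper extracts its set from the game;
# this list is only an example).
MACRO_ACTIONS = [
    "build_worker", "build_supply", "build_barracks",
    "train_marine", "attack", "defend", "scout",
]

class ControllerPolicy(nn.Module):
    """Top-level policy: picks a macro-action from abstract game features."""
    def __init__(self, state_dim: int, num_macros: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.policy_head = nn.Linear(hidden, num_macros)  # macro-action logits
        self.value_head = nn.Linear(hidden, 1)            # state value for RL

    def forward(self, state: torch.Tensor):
        h = self.net(state)
        return self.policy_head(h), self.value_head(h)

class SubPolicy(nn.Module):
    """Low-level policy: expands one macro-action into micro-action logits."""
    def __init__(self, state_dim: int, num_micro: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, num_micro),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

# Usage: one decision step of the hierarchy.
state_dim, num_micro = 32, 10
controller = ControllerPolicy(state_dim, len(MACRO_ACTIONS))
sub_policies = nn.ModuleList(SubPolicy(state_dim, num_micro) for _ in MACRO_ACTIONS)

state = torch.randn(1, state_dim)                        # abstract game features
macro_logits, value = controller(state)
macro_id = torch.distributions.Categorical(logits=macro_logits).sample().item()
micro_logits = sub_policies[macro_id](state)             # low-level action logits
print(MACRO_ACTIONS[macro_id], micro_logits.shape)
```

The usual motivation for such macro-actions, and the one the abstract appeals to, is that deciding over a handful of macros shrinks the effective action space and decision horizon relative to the raw SC2 interface.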
Related papers
- Reinforcement Learning for High-Level Strategic Control in Tower Defense Games [47.618236610219554]
In strategy games, one of the most important aspects of game design is maintaining a sense of challenge for players.
We propose an automated approach that combines traditional scripted methods with reinforcement learning.
Results show that combining a learned approach, such as reinforcement learning, with a scripted AI produces a higher-performing and more robust agent than either approach alone.
arXiv Detail & Related papers (2024-06-12T08:06:31Z) - DanZero+: Dominating the GuanDan Game through Reinforcement Learning [95.90682269990705]
We develop an AI program for an exceptionally complex and popular card game called GuanDan.
We first put forward an AI program named DanZero for this game.
In order to further enhance the AI's capabilities, we apply a policy-based reinforcement learning algorithm to GuanDan.
arXiv Detail & Related papers (2023-12-05T08:07:32Z) - Reinforcement Learning with Foundation Priors: Let the Embodied Agent Efficiently Learn on Its Own [59.11934130045106]
We propose Reinforcement Learning with Foundation Priors (RLFP) to utilize guidance and feedback from policy, value, and success-reward foundation models.
Within this framework, we introduce the Foundation-guided Actor-Critic (FAC) algorithm, which enables embodied agents to explore more efficiently with automatic reward functions.
Our method achieves remarkable performance in various manipulation tasks, both on real robots and in simulation.
arXiv Detail & Related papers (2023-10-04T07:56:42Z) - AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning [38.75717733273262]
StarCraft II is one of the most challenging simulated reinforcement learning environments.
Blizzard has released a massive dataset of millions of StarCraft II games played by human players.
We define a dataset (a subset of Blizzard's release), tools standardizing an API for machine learning methods, and an evaluation protocol.
arXiv Detail & Related papers (2023-08-07T12:21:37Z) - DanZero: Mastering GuanDan Game with Reinforcement Learning [121.93690719186412]
Card game AI has always been a hot topic in the research of artificial intelligence.
In this paper, we are devoted to developing an AI program for a more complex card game, GuanDan.
We propose DanZero, the first AI program for GuanDan, using reinforcement learning techniques.
arXiv Detail & Related papers (2022-10-31T06:29:08Z) - Mastering the Game of No-Press Diplomacy via Human-Regularized
Reinforcement Learning and Planning [95.78031053296513]
No-press Diplomacy is a complex strategy game involving both cooperation and competition.
We introduce a planning algorithm we call DiL-piKL that regularizes a reward-maximizing policy toward a human imitation-learned policy.
We show that DiL-piKL can be extended into a self-play reinforcement learning algorithm we call RL-DiL-piKL.
arXiv Detail & Related papers (2022-10-11T14:47:35Z) - Applying supervised and reinforcement learning methods to create
neural-network-based agents for playing StarCraft II [0.0]
We propose a neural network architecture for playing the full two-player match of StarCraft II trained with general-purpose supervised and reinforcement learning.
Our implementation achieves non-trivial performance when compared to the in-game scripted bots.
arXiv Detail & Related papers (2021-09-26T20:08:10Z) - An Introduction of mini-AlphaStar [22.820438931820764]
An SC2 agent called AlphaStar is proposed, which shows excellent performance, obtaining a high win rate of 99.8% against Grandmaster-level human players.
We implemented a mini-scaled version of it called mini-AlphaStar based on their paper and the pseudocode they provided.
The objective of mini-AlphaStar is to provide a reproduction of the original AlphaStar and facilitate the future research of RL on large-scale problems.
arXiv Detail & Related papers (2021-04-14T14:31:51Z) - SCC: an efficient deep reinforcement learning agent mastering the game
of StarCraft II [15.612456049715123]
AlphaStar, the AI that reaches GrandMaster level in StarCraft II, is a remarkable milestone demonstrating what deep reinforcement learning can achieve.
We propose a deep reinforcement learning agent, StarCraft Commander (SCC).
SCC demonstrates top human performance, defeating GrandMaster players in test matches and top professional players in a live event.
arXiv Detail & Related papers (2020-12-24T08:43:44Z) - TStarBot-X: An Open-Sourced and Comprehensive Study for Efficient League
Training in StarCraft II Full Game [25.248034258354533]
Recently, Google's DeepMind announced AlphaStar, a grandmaster level AI in StarCraft II that can play with humans using comparable action space and operations.
In this paper, we introduce a new AI agent, named TStarBot-X, that is trained with orders of magnitude less computation and can play competitively with expert human players.
arXiv Detail & Related papers (2020-11-27T13:31:49Z) - Provable Self-Play Algorithms for Competitive Reinforcement Learning [48.12602400021397]
We study self-play in competitive reinforcement learning under the setting of Markov games.
We show that a self-play algorithm achieves regret $\tilde{\mathcal{O}}(\sqrt{T})$ after playing $T$ steps of the game.
We also introduce an explore-then-exploit style algorithm, which achieves a slightly worse regret of $\tilde{\mathcal{O}}(T^{2/3})$, but is guaranteed to run in polynomial time even in the worst case.
arXiv Detail & Related papers (2020-02-10T18:44:50Z)