Deep RL Agent for a Real-Time Action Strategy Game
- URL: http://arxiv.org/abs/2002.06290v1
- Date: Sat, 15 Feb 2020 01:09:56 GMT
- Title: Deep RL Agent for a Real-Time Action Strategy Game
- Authors: Michal Warchalski, Dimitrije Radojevic, Milos Milosevic
- Abstract summary: We introduce a reinforcement learning environment based on Heroic - Magic Duel, a 1 v 1 action strategy game.
Our main contribution is a deep reinforcement learning agent that plays the game at a competitive level.
Our best self-play agent obtains around a $65\%$ win rate against the existing AI and over a $50\%$ win rate against a top human player.
- Score: 0.3867363075280543
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a reinforcement learning environment based on Heroic - Magic
Duel, a 1 v 1 action strategy game. This domain is non-trivial for several
reasons: it is a real-time game, the state space is large, the information
given to the player before and at each step of a match is imperfect, and the
distribution of actions is dynamic. Our main contribution is a deep
reinforcement learning agent that plays the game at a competitive level,
trained using PPO and self-play with multiple competing agents, employing only
a simple reward of $\pm 1$ depending on the outcome of a single match. Our best
self-play agent obtains around a $65\%$ win rate against the existing AI and
over a $50\%$ win rate against a top human player.
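For reference, PPO optimizes the standard clipped surrogate objective below (Schulman et al., 2017); the abstract does not spell out the authors' exact training setup, so this is the textbook form rather than the paper's implementation. With the $\pm 1$ outcome reward above, the advantage estimates $\hat{A}_t$ are driven entirely by whether the match was won or lost.
$$L^{\mathrm{CLIP}}(\theta) = \mathbb{E}_t\left[\min\left(r_t(\theta)\,\hat{A}_t,\ \mathrm{clip}(r_t(\theta),\,1-\epsilon,\,1+\epsilon)\,\hat{A}_t\right)\right], \qquad r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}$$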
Related papers
- Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning [95.78031053296513]
No-press Diplomacy is a complex strategy game involving both cooperation and competition.
We introduce a planning algorithm we call DiL-piKL that regularizes a reward-maximizing policy toward a human imitation-learned policy.
We show that DiL-piKL can be extended into a self-play reinforcement learning algorithm we call RL-DiL-piKL.
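Schematically, regularizing a reward-maximizing policy toward a human policy amounts to an objective of the following form, where $\tau$ is the imitation-learned anchor policy and $\lambda$ controls the strength of the penalty; the exact DiL-piKL formulation in the paper may differ in its details:
$$\pi^{*} = \arg\max_{\pi}\ \mathbb{E}_{a \sim \pi}\left[u(a)\right] - \lambda\, D_{\mathrm{KL}}\left(\pi \,\|\, \tau\right)$$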
arXiv Detail & Related papers (2022-10-11T14:47:35Z)
- Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning [86.37438204416435]
Stratego is one of the few iconic board games that Artificial Intelligence (AI) has not yet mastered.
Decisions in Stratego are made over a large number of discrete actions with no obvious link between action and outcome.
DeepNash beats existing state-of-the-art AI methods in Stratego and achieves a yearly (2022) and all-time top-3 rank on the Gravon games platform.
arXiv Detail & Related papers (2022-06-30T15:53:19Z)
- Reinforcement Learning Agents in Colonel Blotto [0.0]
We focus on a specific instance of agent-based models that uses reinforcement learning (RL) to train the agent to act in its environment.
We find that the RL agent handily beats a single opponent and still performs quite well when the number of opponents is increased.
We also analyze the RL agent and examine the strategies it has arrived at by inspecting the actions to which it assigns the highest and lowest Q-values.
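As a toy illustration of that kind of analysis (hypothetical code, not from the paper), one can sort the agent's learned Q-values and inspect the actions at the extremes:
```python
# Hypothetical sketch: rank a trained agent's actions by Q-value.
# In Colonel Blotto an action allocates soldiers across battlefields;
# the allocations and Q-values below are made up for illustration.
q_values = {
    (3, 3, 4): 0.82,
    (5, 5, 0): 0.17,
    (0, 5, 5): 0.21,
    (10, 0, 0): -0.41,
}
ranked = sorted(q_values.items(), key=lambda kv: kv[1], reverse=True)
print("highest-valued action:", ranked[0])   # strategy the agent favors
print("lowest-valued action:", ranked[-1])   # strategy the agent avoids
```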
arXiv Detail & Related papers (2022-04-04T16:18:01Z)
- No-Press Diplomacy from Scratch [26.36204634856853]
We describe an algorithm for action exploration and equilibrium approximation in games with combinatorial action spaces.
We train an agent, DORA, completely from scratch for a popular two-player variant of Diplomacy and show that it achieves superhuman performance.
We extend our methods to full-scale no-press Diplomacy and for the first time train an agent from scratch with no human data.
arXiv Detail & Related papers (2021-10-06T17:12:50Z)
- Learning Monopoly Gameplay: A Hybrid Model-Free Deep Reinforcement Learning and Imitation Learning Approach [31.066718635447746]
Reinforcement Learning (RL) relies on an agent interacting with an environment to maximize its cumulative reward.
In the multi-player game of Monopoly, players must make several decisions every turn, involving complex actions such as making trades.
This paper introduces a Hybrid Model-Free Deep RL (DRL) approach that is capable of playing Monopoly and learning winning strategies.
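The hybrid pattern described here is commonly implemented as imitation (behavior-cloning) pretraining followed by model-free RL fine-tuning. The sketch below shows the shape of such a pipeline; the network size, data, and action encoding are placeholders, not the paper's architecture:
```python
# Minimal sketch of imitation pretraining + RL fine-tuning (hypothetical;
# dimensions, data, and rewards are placeholders, not the paper's setup).
import torch
import torch.nn as nn

policy = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 4))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

# Phase 1: behavior cloning on (state, expert_action) pairs.
states = torch.randn(256, 16)                  # placeholder expert states
expert_actions = torch.randint(0, 4, (256,))   # placeholder expert actions
for _ in range(100):
    loss = nn.functional.cross_entropy(policy(states), expert_actions)
    opt.zero_grad(); loss.backward(); opt.step()

# Phase 2: model-free fine-tuning (REINFORCE-style on toy rewards).
for _ in range(100):
    s = torch.randn(32, 16)                    # placeholder game states
    dist = torch.distributions.Categorical(logits=policy(s))
    a = dist.sample()
    reward = torch.randn(32)                   # placeholder outcome signal
    loss = -(dist.log_prob(a) * reward).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```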
arXiv Detail & Related papers (2021-03-01T01:40:02Z)
- Multi-Agent Collaboration via Reward Attribution Decomposition [75.36911959491228]
We propose Collaborative Q-learning (CollaQ), which achieves state-of-the-art performance on the StarCraft multi-agent challenge.
CollaQ is evaluated on various StarCraft maps and outperforms existing state-of-the-art techniques.
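The reward attribution decomposition can be sketched as splitting each agent $i$'s Q-function into a term that depends only on the agent itself and a collaborative term induced by the other agents; the paper's exact conditioning and training losses differ from this schematic:
$$Q_i(o_i, a_i) \approx Q_i^{\mathrm{alone}}(o_i, a_i) + Q_i^{\mathrm{collab}}(o_i, a_i)$$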
arXiv Detail & Related papers (2020-10-16T17:42:11Z)
- Learning to Play Sequential Games versus Unknown Opponents [93.8672371143881]
We consider a repeated sequential game between a learner, who plays first, and an opponent who responds to the chosen action.
We propose a novel algorithm for the learner when playing against an adversarial sequence of opponents.
Our results include regret guarantees for the algorithm that depend on the regularity of the opponent's response.
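Because the opponent responds to the learner's action, the natural benchmark is the best fixed action accounting for the responses it would induce; a typical regret notion in such settings (the paper's precise definition may differ) is
$$R_T = \max_{a \in \mathcal{A}} \sum_{t=1}^{T} u\big(a, b_t(a)\big) - \sum_{t=1}^{T} u\big(a_t, b_t(a_t)\big),$$
where $b_t(\cdot)$ denotes the $t$-th opponent's response function.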
arXiv Detail & Related papers (2020-07-10T09:33:05Z)
- Deep Reinforcement Learning for FlipIt Security Game [2.0624765454705654]
We describe a deep learning model in which agents adapt to different classes of opponents and learn the optimal counter-strategy.
We apply our model to FlipIt, a two-player security game in which both players, the attacker and the defender, compete for ownership of a shared resource.
Our model is a deep neural network combined with Q-learning and is trained to maximize the defender's time of ownership of the resource.
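For context, the tabular Q-learning update that such a deep network approximates is the standard
$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha\left[r_t + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t)\right],$$
with the reward here reflecting the defender's ownership of the shared resource; the paper's FlipIt-specific state and action details are not given in this summary.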
arXiv Detail & Related papers (2020-02-28T18:26:24Z)
- Efficient exploration of zero-sum stochastic games [83.28949556413717]
We investigate the increasingly important and common game-solving setting where we do not have an explicit description of the game but only oracle access to it through gameplay.
During a limited-duration learning phase, the algorithm can control the actions of both players in order to try to learn the game and how to play it well.
Our motivation is to quickly learn strategies that have low exploitability in situations where evaluating the payoffs of a queried strategy profile is costly.
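Exploitability has its usual zero-sum meaning here: how much players could gain by best-responding to the profile. One common definition for a profile $(\pi_1, \pi_2)$ is
$$\mathrm{expl}(\pi_1, \pi_2) = \max_{\pi_1'} u_1(\pi_1', \pi_2) + \max_{\pi_2'} u_2(\pi_1, \pi_2'),$$
which equals zero exactly at a Nash equilibrium.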
arXiv Detail & Related papers (2020-02-24T20:30:38Z)
- Provable Self-Play Algorithms for Competitive Reinforcement Learning [48.12602400021397]
We study self-play in competitive reinforcement learning under the setting of Markov games.
We show that a self-play algorithm achieves regret $\tilde{\mathcal{O}}(\sqrt{T})$ after playing $T$ steps of the game.
We also introduce an explore-then-exploit style algorithm, which achieves a slightly worse regret $\tilde{\mathcal{O}}(T^{2/3})$, but is guaranteed to run in polynomial time even in the worst case.
arXiv Detail & Related papers (2020-02-10T18:44:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.