Reward Shaping for Improved Learning in Real-time Strategy Game Play
- URL: http://arxiv.org/abs/2311.16339v1
- Date: Mon, 27 Nov 2023 21:56:18 GMT
- Title: Reward Shaping for Improved Learning in Real-time Strategy Game Play
- Authors: John Kliem and Prithviraj Dasgupta
- Abstract summary: We show that appropriately designed reward shaping functions can significantly improve the player's performance.
We have validated our reward shaping functions within a simulated environment for playing a marine capture-the-flag game.
- Score: 0.3347089492811693
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We investigate the effect of reward shaping in improving the
performance of reinforcement learning in the context of a real-time strategy,
capture-the-flag game. The game is characterized by sparse rewards that are
associated with infrequently occurring events such as grabbing or capturing the
flag, or tagging the opposing player. We show that appropriately designed
reward shaping functions applied to different game events can significantly
improve the player's performance and the training time of the player's learning
algorithm. We have validated our reward shaping functions within a simulated
environment for playing a marine capture-the-flag game between two players. Our
experimental results demonstrate that reward shaping can be used as an
effective means to understand the importance of different sub-tasks during
game-play towards winning the game, to encode a secondary objective function
such as energy efficiency into a player's game-playing behavior, and to improve
the learning of generalizable policies that can perform well against opponents
of different skill levels.
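To make the idea concrete, the following is a minimal Python sketch of
event-based reward shaping of the kind described above; the event names and
weights are illustrative assumptions, not values from the paper.

    # Minimal sketch of event-based reward shaping for a capture-the-flag
    # player; event names and weights are hypothetical.
    SHAPING_WEIGHTS = {
        "flag_grab": 0.25,     # grabbed the opposing team's flag
        "flag_capture": 1.00,  # carried the flag back to home base
        "tag_opponent": 0.10,  # tagged the opposing player
    }

    def shaped_reward(sparse_reward, events):
        """Return the sparse game reward plus a bonus per event this step."""
        return sparse_reward + sum(SHAPING_WEIGHTS.get(e, 0.0) for e in events)

    # A step where the player grabs the flag but the game is not yet won:
    print(shaped_reward(0.0, ["flag_grab"]))  # -> 0.25

Additive bonuses of this form densify the learning signal between the sparse
win/loss outcomes; choosing the weights well is the crux of the shaping design
the abstract describes.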
Related papers
- Enhancing Two-Player Performance Through Single-Player Knowledge Transfer: An Empirical Study on Atari 2600 Games [1.03590082373586]
This study examines the proposed idea in ten different Atari 2600 environments using the Atari 2600 RAM as the input state.
We discuss the advantages of using transfer learning from a single-player training process over training in a two-player setting from scratch.
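As a rough illustration of the transfer step (a sketch assuming a
PyTorch-style workflow; the architecture, sizes, and file name are
placeholders, not the paper's code):

    # Warm-start a two-player agent from weights learned in single-player
    # training, assuming a 128-byte Atari RAM vector as the input state.
    import torch
    import torch.nn as nn

    def make_policy():
        return nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 6))

    single_player = make_policy()
    torch.save(single_player.state_dict(), "single_player.pt")  # after training

    two_player = make_policy()  # identical architecture
    two_player.load_state_dict(torch.load("single_player.pt"))
    # ...then fine-tune two_player against a live opponent.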
arXiv Detail & Related papers (2024-10-22T02:57:44Z)
- Pixel to policy: DQN Encoders for within & cross-game reinforcement learning [0.0]
Reinforcement learning can be applied to various tasks and environments.
Many environments have a similar structure, which can be exploited to improve RL performance on other tasks.
This work explores and compares the performance of RL models trained from scratch with that of models trained using different transfer learning approaches.
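One common way to exploit such shared structure is a single encoder reused
across games with per-game heads; the sketch below illustrates the general
pattern under assumptions, not this paper's exact architecture (assumes
PyTorch):

    import torch.nn as nn

    shared_encoder = nn.Sequential(nn.Linear(128, 256), nn.ReLU())  # reused
    q_head_game_a = nn.Linear(256, 6)    # Q-values for game A's actions
    q_head_game_b = nn.Linear(256, 18)   # separate head for game B's actions
    # Cross-game transfer: keep shared_encoder, train only the new head
    # (or fine-tune both) rather than learning everything from scratch.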
arXiv Detail & Related papers (2023-08-01T06:29:33Z)
- Lucy-SKG: Learning to Play Rocket League Efficiently Using Deep Reinforcement Learning [0.0]
We present Lucy-SKG, a Reinforcement Learning-based model that learned how to play Rocket League in a sample-efficient manner.
Our contributions include the development of a reward analysis and visualization library, a novel parameterizable reward shape function, and auxiliary neural architectures.
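As a hedged sketch of what a parameterizable reward shaping function can look
like (the functional form and parameter names here are assumptions, not
Lucy-SKG's actual design):

    import math

    def distance_reward(distance, weight=1.0, scale=10.0):
        """Shaping term that decays smoothly with distance to an objective;
        weight and scale are the tunable shape parameters."""
        return weight * math.exp(-distance / scale)

    print(distance_reward(0.0))   # -> 1.0 at the objective
    print(distance_reward(10.0))  # -> ~0.368 one scale length away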
arXiv Detail & Related papers (2023-05-25T07:33:17Z)
- Understanding why shooters shoot -- An AI-powered engine for basketball performance profiling [70.54015529131325]
Basketball is dictated by many variables, such as playstyle and game dynamics.
It is crucial that performance profiles reflect these diverse playstyles.
We present a tool that can visualize player performance profiles in a timely manner.
arXiv Detail & Related papers (2023-03-17T01:13:18Z)
- Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience [89.30876995059168]
This paper addresses the problem of inverse reinforcement learning (IRL): inferring the reward function of an agent from observing its behavior.
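For background (standard material, not a claim about this paper's specific
method), the widely used maximum-entropy formulation of IRL fits reward
parameters $\theta$ by maximizing the likelihood of the observed trajectories:

    p(\tau \mid \theta) \propto \exp\Big( \sum_{t} r_\theta(s_t, a_t) \Big),
    \qquad
    \theta^{*} = \arg\max_{\theta} \sum_{i} \log p(\tau_i \mid \theta)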
arXiv Detail & Related papers (2022-08-09T17:29:49Z)
- Adversarial Motion Priors Make Good Substitutes for Complex Reward Functions [124.11520774395748]
Reinforcement learning practitioners often utilize complex reward functions that encourage physically plausible behaviors.
We propose substituting complex reward functions with "style rewards" learned from a dataset of motion capture demonstrations.
A learned style reward can be combined with an arbitrary task reward to train policies that perform tasks using naturalistic strategies.
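The combination described here reduces to a weighted sum of the two signals;
a minimal sketch follows (the weights are illustrative assumptions):

    def combined_reward(task_reward, style_reward, w_task=0.5, w_style=0.5):
        """Weighted sum of an arbitrary task reward and a learned style
        reward; the weights trade task progress against naturalness."""
        return w_task * task_reward + w_style * style_reward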
arXiv Detail & Related papers (2022-03-28T21:17:36Z)
- Generating Diverse and Competitive Play-Styles for Strategy Games [58.896302717975445]
We propose Portfolio Monte Carlo Tree Search with Progressive Unpruning for playing a turn-based strategy game (Tribes).
We show how it can be parameterized so that a quality-diversity algorithm (MAP-Elites) can be used to achieve different play-styles while keeping a competitive level of play.
Our results show that this algorithm is capable of achieving these goals even for an extensive collection of game levels beyond those used for training.
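For readers unfamiliar with MAP-Elites, a minimal generic sketch of the
algorithm follows (toy objective and descriptor; not the paper's
implementation):

    import random

    def evaluate(params):
        fitness = sum(params)             # toy objective: higher is better
        descriptor = round(params[0], 1)  # toy play-style descriptor
        return fitness, descriptor

    archive = {}  # descriptor cell -> (fitness, params)
    for _ in range(1000):
        if archive and random.random() < 0.9:
            parent = random.choice(list(archive.values()))[1]
            child = [p + random.gauss(0.0, 0.1) for p in parent]  # mutate elite
        else:
            child = [random.random() for _ in range(4)]           # random seed
        fitness, cell = evaluate(child)
        if cell not in archive or fitness > archive[cell][0]:
            archive[cell] = (fitness, child)  # keep the best per cell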
arXiv Detail & Related papers (2021-04-17T20:33:24Z)
- Action Guidance: Getting the Best of Sparse Rewards and Shaped Rewards for Real-time Strategy Games [0.0]
Training agents using Reinforcement Learning in games with sparse rewards is a challenging problem.
We present a novel technique that successfully trains agents to eventually optimize the true objective in games with sparse rewards.
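A rough sketch of the general idea of letting shaped rewards guide a
sparse-reward learner (the mixing scheme below is an assumption for
illustration, not the paper's exact technique):

    import random

    def select_action(main_policy, shaped_policy, state, guidance_prob=0.3):
        """Occasionally act from an auxiliary policy trained on shaped
        rewards, while the main agent is updated only on the true sparse
        objective."""
        if random.random() < guidance_prob:
            return shaped_policy(state)
        return main_policy(state)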
arXiv Detail & Related papers (2020-10-05T03:43:06Z)
- Learning to Play Sequential Games versus Unknown Opponents [93.8672371143881]
We consider a repeated sequential game between a learner, who plays first, and an opponent who responds to the chosen action.
We propose a novel algorithm for the learner when playing against an adversarial sequence of opponents.
Our results include regret guarantees for our algorithm that depend on the regularity of the opponent's response.
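For context, the standard notion of regret against the best fixed action in
hindsight (background, not this paper's exact statement) is

    R_T = \sum_{t=1}^{T} \ell(x_t, y_t) - \min_{x} \sum_{t=1}^{T} \ell(x, y_t),

where $x_t$ is the learner's action and $y_t$ the opponent's response at
round $t$.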
arXiv Detail & Related papers (2020-07-10T09:33:05Z)
- Efficient exploration of zero-sum stochastic games [83.28949556413717]
We investigate the increasingly important and common game-solving setting where we do not have an explicit description of the game but only oracle access to it through gameplay.
During a limited-duration learning phase, the algorithm can control the actions of both players in order to try to learn the game and how to play it well.
Our motivation is to quickly learn strategies that have low exploitability in situations where evaluating the payoffs of a queried strategy profile is costly.
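For reference, one common definition of the exploitability of a strategy
profile $\pi = (\pi_1, \pi_2)$ in a two-player zero-sum game (standard
background, not this paper's notation) is

    \mathrm{expl}(\pi) = \frac{1}{2}\Big( \max_{\pi_1'} u_1(\pi_1', \pi_2)
                                        + \max_{\pi_2'} u_2(\pi_1, \pi_2') \Big),

which is zero exactly at a Nash equilibrium.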
arXiv Detail & Related papers (2020-02-24T20:30:38Z)