Evaluating Generalisation in General Video Game Playing
- URL: http://arxiv.org/abs/2005.11247v1
- Date: Fri, 22 May 2020 15:57:52 GMT
- Title: Evaluating Generalisation in General Video Game Playing
- Authors: Martin Balla and Simon M. Lucas and Diego Perez-Liebana
- Abstract summary: This paper focuses on the challenge of the GVGAI learning track in which 3 games are selected and 2 levels are given for training, while 3 hidden levels are left for evaluation.
This setup poses a difficult challenge for current Reinforcement Learning (RL) algorithms, as they typically require much more data.
This work investigates 3 versions of the Advantage Actor-Critic (A2C) algorithm, each trained on at most 2 of the 5 levels available in the GVGAI framework, and compares their performance on all levels.
- Score: 1.160208922584163
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The General Video Game Artificial Intelligence (GVGAI) competition has been
running for several years with various tracks. This paper focuses on the
challenge of the GVGAI learning track in which 3 games are selected and 2
levels are given for training, while 3 hidden levels are left for evaluation.
This setup poses a difficult challenge for current Reinforcement Learning (RL)
algorithms, as they typically require much more data. This work investigates 3
versions of the Advantage Actor-Critic (A2C) algorithm, each trained on at most
2 of the 5 levels available in the GVGAI framework, and compares their
performance on all levels. The selected subset of games covers different
characteristics, such as stochasticity, reward distribution and objectives. We
found that stochasticity improves generalisation, but too much of it can cause
the algorithms to fail to learn the training levels. The quality of the
training levels also matters: different sets of training levels can boost
generalisation across all levels. In the GVGAI competition, agents are scored
first by their win rates and then by the scores they achieve in the games. We
found that solely using the rewards provided by the game might not encourage
winning.
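As a concrete illustration of this setup (a minimal sketch, not the authors' code), the snippet below trains A2C on 2 levels of a single GVGAI game and then evaluates it on all 5 levels, with a simple win-bonus wrapper reflecting the observation that the raw game reward might not encourage winning. The environment ids, the `gym_gvgai` module name, the `winner` info key and the bonus value are assumptions based on the GVGAI_GYM bindings and Stable-Baselines3 1.x (older OpenAI Gym API), not details taken from the paper.

```python
import gym
import gym_gvgai  # registers the "gvgai-*" envs (module name assumed from the GVGAI_GYM project)
from stable_baselines3 import A2C
from stable_baselines3.common.vec_env import DummyVecEnv

GAME = "zelda"                 # placeholder game; the paper selects 3 GVGAI games
TRAIN_LEVELS = [0, 1]          # only 2 levels are available for training
ALL_LEVELS = [0, 1, 2, 3, 4]   # evaluation covers all 5 levels

class WinBonusWrapper(gym.Wrapper):
    """Add a terminal bonus for winning. The paper notes that the raw game
    score alone might not encourage winning, so shaping of this kind is one
    plausible remedy (the bonus value and the `winner` info key are assumed)."""

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        if done and info.get("winner") == "PLAYER_WINS":  # key/value assumed
            reward += 10.0
        return obs, reward, done, info

def make_env(level):
    return lambda: WinBonusWrapper(gym.make(f"gvgai-{GAME}-lvl{level}-v0"))

# Train only on the 2 given levels, mirroring the competition setup.
train_env = DummyVecEnv([make_env(lvl) for lvl in TRAIN_LEVELS])
model = A2C("CnnPolicy", train_env, verbose=1)
model.learn(total_timesteps=1_000_000)

# Evaluate on every level, including the 3 the agent never saw in training.
for lvl in ALL_LEVELS:
    env = make_env(lvl)()
    obs, done, total_reward = env.reset(), False, 0.0
    while not done:
        action, _ = model.predict(obs, deterministic=True)
        obs, reward, done, info = env.step(action)
        total_reward += reward
    print(f"level {lvl}: return={total_reward:.1f}, winner={info.get('winner')}")
```

Comparing per-level returns and win outcomes between the 2 training levels and the 3 held-out levels gives a direct, if rough, measure of generalisation in the sense used above; the win-bonus wrapper is one simple way to align the optimised reward with the competition's win-rate-first scoring.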
Related papers
- Mastering Chinese Chess AI (Xiangqi) Without Search [2.309569018066392]
We have developed a high-performance Chinese Chess AI that operates without reliance on search algorithms.
This AI has demonstrated the capability to compete at a level commensurate with the top 0.1% of human players.
arXiv Detail & Related papers (2024-10-07T09:27:51Z)
- Reinforcement Learning for High-Level Strategic Control in Tower Defense Games [47.618236610219554]
In strategy games, one of the most important aspects of game design is maintaining a sense of challenge for players.
We propose an automated approach that combines traditional scripted methods with reinforcement learning.
Results show that combining a learned approach, such as reinforcement learning, with a scripted AI produces a higher-performing and more robust agent than using the scripted AI alone.
arXiv Detail & Related papers (2024-06-12T08:06:31Z)
- DanZero+: Dominating the GuanDan Game through Reinforcement Learning [95.90682269990705]
We develop an AI program for an exceptionally complex and popular card game called GuanDan.
We first put forward an AI program named DanZero for this game.
In order to further enhance the AI's capabilities, we apply a policy-based reinforcement learning algorithm to GuanDan.
arXiv Detail & Related papers (2023-12-05T08:07:32Z)
- Retrospective on the 2021 BASALT Competition on Learning from Human Feedback [92.37243979045817]
The goal of the competition was to promote research towards agents that use learning from human feedback (LfHF) techniques to solve open-world tasks.
Rather than mandating the use of LfHF techniques, we described four tasks in natural language to be accomplished in the video game Minecraft.
Teams developed a diverse range of LfHF algorithms across a variety of possible human feedback types.
arXiv Detail & Related papers (2022-04-14T17:24:54Z)
- No-Regret Learning in Time-Varying Zero-Sum Games [99.86860277006318]
Learning from repeated play in a fixed zero-sum game is a classic problem in game theory and online learning.
We develop a single parameter-free algorithm that simultaneously enjoys favorable guarantees under three performance measures.
Our algorithm is based on a two-layer structure with a meta-algorithm learning over a group of black-box base-learners satisfying a certain property.
arXiv Detail & Related papers (2022-01-30T06:10:04Z)
- Generating Diverse and Competitive Play-Styles for Strategy Games [58.896302717975445]
We propose Portfolio Monte Carlo Tree Search with Progressive Unpruning for playing a turn-based strategy game (Tribes).
We show how it can be parameterized so a quality-diversity algorithm (MAP-Elites) is used to achieve different play-styles while keeping a competitive level of play.
Our results show that this algorithm is capable of achieving these goals even for an extensive collection of game levels beyond those used for training.
arXiv Detail & Related papers (2021-04-17T20:33:24Z)
- Reinforcement Learning with Dual-Observation for General Video Game Playing [12.33685708449853]
The General Video Game AI Learning Competition aims to develop agents capable of learning to play different game levels unseen during training.
This paper summarises the five editions of the General Video Game AI Learning Competition.
We present a novel reinforcement learning technique with dual-observation for general video game playing.
arXiv Detail & Related papers (2020-11-11T08:28:20Z)
- Testing match-3 video games with Deep Reinforcement Learning [0.0]
We study the possibility of using Deep Reinforcement Learning to automate the testing process in match-3 video games.
We test this kind of network on the Jelly Juice game, a match-3 video game developed by redBit Games.
arXiv Detail & Related papers (2020-06-30T12:41:35Z)
- Efficient exploration of zero-sum stochastic games [83.28949556413717]
We investigate the increasingly important and common game-solving setting where we do not have an explicit description of the game but only oracle access to it through gameplay.
During a limited-duration learning phase, the algorithm can control the actions of both players in order to try to learn the game and how to play it well.
Our motivation is to quickly learn strategies that have low exploitability in situations where evaluating the payoffs of a queried strategy profile is costly.
arXiv Detail & Related papers (2020-02-24T20:30:38Z)