Related papers: Curiosity Driven Multi-agent Reinforcement Learning for 3D Game Testing

URL: http://arxiv.org/abs/2502.14606v1
Date: Thu, 20 Feb 2025 14:43:46 GMT
Title: Curiosity Driven Multi-agent Reinforcement Learning for 3D Game Testing
Authors: Raihana Ferdous, Fitsum Kifetew, Davide Prandi, Angelo Susi,
Abstract summary: cMarlTest is an approach for testing 3D games through curiosity driven Multi-Agent Reinforcement Learning (MARL). We carried out experiments on different levels of a 3D game comparing the performance of cMarlTest with a single agent RL variant.
Score: 1.2233362977312945
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recently, testing of games via autonomous agents has shown great promise in tackling challenges faced by the game industry, which has mainly relied on either manual testing or record/replay. In particular, Reinforcement Learning (RL) solutions have shown potential by learning directly from playing the game without the need for human intervention. In this paper, we present cMarlTest, an approach for testing 3D games through curiosity driven Multi-Agent Reinforcement Learning (MARL). cMarlTest deploys multiple agents that work collaboratively to achieve the testing objective. The use of multiple agents helps resolve issues faced by a single agent approach. We carried out experiments on different levels of a 3D game, comparing the performance of cMarlTest with a single agent RL variant. The results are promising: across three different types of coverage criteria, cMarlTest achieved higher coverage. cMarlTest was also more efficient in terms of time taken than the single agent variant.

Related papers

Triangulating LLM Progress through Benchmarks, Games, and Cognitive Tests [89.09172401497213]
We examine three evaluation paradigms: large question-answering benchmarks, interactive games, and cognitive tests. We compile a suite of targeted tests that measure cognitive abilities deemed essential for effective language use. Our analyses reveal that interactive games are superior to standard benchmarks in discriminating models.
arXiv Detail & Related papers (2025-02-20T08:36:58Z)
Cooperative Multi-agent Approach for Automated Computer Game Testing [1.4931265249949526]
Many games nowadays are multi-player. This opens up an interesting possibility to deploy multiple cooperative test agents to test such a game. This paper offers a cooperative multi-agent testing approach and a study of its performance based on a case study on a 3D game called Lab Recruits.
arXiv Detail & Related papers (2024-05-18T17:31:26Z)
MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation [96.71370747681078]
We introduce MLAgentBench, a suite of 13 tasks ranging from improving model performance on CIFAR-10 to recent research problems like BabyLM. For each task, an agent can perform actions like reading/writing files, executing code, and inspecting outputs. We benchmark agents based on Claude v1.0, Claude v2.1, Claude v3 Opus, GPT-4, GPT-4-turbo, Gemini-Pro, and Mixtral and find that a Claude v3 Opus agent is the best in terms of success rate.
arXiv Detail & Related papers (2023-10-05T04:06:12Z)
SmartPlay: A Benchmark for LLMs as Intelligent Agents [45.76707302899935]
SmartPlay consists of 6 different games, including Rock-Paper-Scissors, Tower of Hanoi, Minecraft. Each game challenges a subset of 9 important capabilities of an intelligent LLM agent. Tests include reasoning with object dependencies, planning ahead, spatial reasoning, learning from history, and understanding randomness.
arXiv Detail & Related papers (2023-10-02T18:52:11Z)
Preference-conditioned Pixel-based AI Agent For Game Testing [1.5059676044537105]
Game-testing AI agents that learn by interaction with the environment have the potential to mitigate these challenges. This paper proposes an agent design that mainly depends on pixel-based state observations while exploring the environment conditioned on a user's preference. Our agent significantly outperforms state-of-the-art pixel-based game testing agents over exploration coverage and test execution quality when evaluated on a complex open-world environment resembling many aspects of real AAA games.
arXiv Detail & Related papers (2023-08-18T04:19:36Z)
Centralized control for multi-agent RL in a complex Real-Time-Strategy game [0.0]
Multi-agent Reinforcement learning (MARL) studies the behaviour of multiple learning agents that coexist in a shared environment. MARL is more challenging than single-agent RL because it involves more complex learning dynamics. This project provides the end-to-end experience of applying RL in the Lux AI v2 Kaggle competition.
arXiv Detail & Related papers (2023-04-25T17:19:05Z)
UKP-SQuARE v3: A Platform for Multi-Agent QA Research [48.92308487624824]
We extend UKP-SQuARE, an online platform for Question Answering (QA) research, to support three families of multi-agent systems. We conduct experiments to evaluate their inference speed and discuss the performance vs. speed trade-off compared to multi-dataset models.
arXiv Detail & Related papers (2023-03-31T15:07:36Z)
Off-Beat Multi-Agent Reinforcement Learning [62.833358249873704]
We investigate model-free multi-agent reinforcement learning (MARL) in environments where off-beat actions are prevalent. We propose a novel episodic memory, LeGEM, for model-free MARL algorithms. We evaluate LeGEM on various multi-agent scenarios with off-beat actions, including Stag-Hunter Game, Quarry Game, Afforestation Game, and StarCraft II micromanagement tasks.
arXiv Detail & Related papers (2022-05-27T02:21:04Z)
Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot [71.28884625011987]
Melting Pot is a MARL evaluation suite that uses reinforcement learning to reduce the human labor required to create novel test scenarios. We have created over 80 unique test scenarios covering a broad range of research topics. We apply these test scenarios to standard MARL training algorithms, and demonstrate how Melting Pot reveals weaknesses not apparent from training performance alone.
arXiv Detail & Related papers (2021-07-14T17:22:14Z)
Augmenting Automated Game Testing with Deep Reinforcement Learning [0.4129225533930966]
General game testing relies on the use of human play testers, play test scripting, and prior knowledge of areas of interest to produce relevant test data. We introduce a self-learning mechanism to the game testing framework using deep reinforcement learning (DRL). DRL can be used to increase test coverage, find exploits, test map difficulty, and to detect common problems that arise in the testing of first-person shooter (FPS) games.
arXiv Detail & Related papers (2021-03-29T11:55:15Z)
Multi-Agent Collaboration via Reward Attribution Decomposition [75.36911959491228]
We propose Collaborative Q-learning (CollaQ) that achieves state-of-the-art performance in the StarCraft multi-agent challenge. CollaQ is evaluated on various StarCraft maps and is shown to outperform existing state-of-the-art techniques.
arXiv Detail & Related papers (2020-10-16T17:42:11Z)
The Multi-Agent Reinforcement Learning in MalmÖ (MARLÖ) Competition [14.726566410348985]
The Multi-Agent Reinforcement Learning in MalmÖ (MARLÖ) competition is a new challenge that proposes research in this domain using multiple 3D games. The goal of this contest is to foster research in general agents that can learn across different games and opponent types.
arXiv Detail & Related papers (2019-01-23T21:01:27Z)

This list is automatically generated from the titles and abstracts of the papers on this site.