Fly, Fail, Fix: Iterative Game Repair with Reinforcement Learning and Large Multimodal Models
- URL: http://arxiv.org/abs/2507.12666v1
- Date: Wed, 16 Jul 2025 22:45:40 GMT
- Title: Fly, Fail, Fix: Iterative Game Repair with Reinforcement Learning and Large Multimodal Models
- Authors: Alex Zook, Josef Spjut, Jonathan Tremblay
- Abstract summary: Game design hinges on understanding how static rules and content translate into dynamic player behavior. We present an automated design framework that closes this gap by pairing a reinforcement learning (RL) agent, which playtests the game, with a large multimodal model (LMM). The LMM designer receives a gameplay goal and the current game configuration, analyses the play traces, and edits the configuration to steer future behaviour toward the goal.
- Score: 7.989185500830854
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Game design hinges on understanding how static rules and content translate into dynamic player behavior - something modern generative systems that inspect only a game's code or assets struggle to capture. We present an automated design iteration framework that closes this gap by pairing a reinforcement learning (RL) agent, which playtests the game, with a large multimodal model (LMM), which revises the game based on what the agent does. In each loop the RL player completes several episodes, producing (i) numerical play metrics and/or (ii) a compact image strip summarising recent video frames. The LMM designer receives a gameplay goal and the current game configuration, analyses the play traces, and edits the configuration to steer future behaviour toward the goal. Our results demonstrate that LMMs can reason over behavioral traces supplied by RL agents to iteratively refine game mechanics, pointing toward practical, scalable tools for AI-assisted game design.
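To make the loop concrete, below is a minimal, self-contained sketch of the fly-fail-fix cycle the abstract describes. All names here (GameConfig, run_episodes, frames_to_strip, lmm_propose_edit) are hypothetical placeholders, and the toy heuristic stands in for a real multimodal model call; this is an illustration of the loop structure, not the authors' implementation.

```python
import random
from dataclasses import dataclass, field


@dataclass
class GameConfig:
    # Hypothetical tunable game parameters; the real configuration space is unspecified here.
    params: dict = field(default_factory=lambda: {"coin_reward": 1.0, "enemy_speed": 1.0})


def run_episodes(config, n_episodes=5):
    """Stand-in for the RL playtester ("fly"): returns play metrics and raw frames."""
    returns = [random.gauss(10.0 * config.params["coin_reward"], 1.0)
               for _ in range(n_episodes)]
    metrics = {"mean_return": sum(returns) / n_episodes}
    frames = [f"<frame {i}>" for i in range(16)]  # placeholder for rendered video frames
    return metrics, frames


def frames_to_strip(frames, stride=4):
    """Downsample recent frames into a compact image strip for the LMM to inspect."""
    return frames[::stride]


def lmm_propose_edit(goal, config, metrics, strip):
    """Stand-in for the LMM designer ("fix"). A real system would send the goal,
    current configuration, metrics, and image strip to a multimodal model and
    parse its edited configuration; this toy heuristic just nudges one parameter."""
    new_params = dict(config.params)
    if metrics["mean_return"] < 15.0:
        new_params["coin_reward"] *= 1.1
    return GameConfig(new_params)


def iterate_design(config, goal, n_loops=10):
    for _ in range(n_loops):
        metrics, frames = run_episodes(config)                    # fly: playtest
        strip = frames_to_strip(frames)                           # summarise episodes
        config = lmm_propose_edit(goal, config, metrics, strip)   # fix: edit config
    return config


if __name__ == "__main__":
    final = iterate_design(GameConfig(), goal="raise mean return to roughly 15")
    print(final.params)
```

The key design point the abstract emphasises is that the designer model never needs to execute the game itself: it sees only the goal, the configuration, and the behavioral traces (metrics and/or an image strip), and steers the next iteration through configuration edits alone.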
Related papers
- Play to Generalize: Learning to Reason Through Game Play [11.778612579151067]
We propose a novel post-training paradigm, Visual Game Learning, where MLLMs develop out-of-domain generalization of multimodal reasoning through playing arcade-like games. Our findings suggest a new post-training paradigm: synthetic, rule-based games can serve as controllable and scalable pretext tasks.
arXiv Detail & Related papers (2025-06-09T17:59:57Z) - Orak: A Foundational Benchmark for Training and Evaluating LLM Agents on Diverse Video Games [16.187737674778234]
We present Orak, a benchmark designed to train and evaluate Large Language Model (LLM) agents across diverse real-world video games. To support consistent evaluation of LLMs, we introduce a plug-and-play interface based on the Model Context Protocol (MCP). Orak offers a comprehensive evaluation framework, encompassing general game score leaderboards, LLM battle arenas, and in-depth analyses of visual input state, agentic strategies, and fine-tuning effects.
arXiv Detail & Related papers (2025-06-04T06:40:33Z) - Scaling Laws for Imitation Learning in Single-Agent Games [28.257046559127875]
We investigate whether carefully scaling up model and data size can bring similar improvements in the imitation learning setting for single-agent games. We first demonstrate our findings on a variety of Atari games, and thereafter focus on the extremely challenging game of NetHack. We find that IL loss and mean return scale smoothly with the compute budget and are strongly correlated, resulting in power laws for training compute-optimal IL agents.
arXiv Detail & Related papers (2023-07-18T16:43:03Z) - Promptable Game Models: Text-Guided Game Simulation via Masked Diffusion Models [68.85478477006178]
We present a Promptable Game Model (PGM) for neural video game simulators.
It allows a user to play the game by prompting it with high- and low-level action sequences.
Most captivatingly, our PGM unlocks the director's mode, where the game is played by specifying goals for the agents in the form of a prompt.
Our method significantly outperforms existing neural video game simulators in terms of rendering quality and unlocks applications beyond the capabilities of the current state of the art.
arXiv Detail & Related papers (2023-03-23T17:43:17Z) - Automated Play-Testing Through RL Based Human-Like Play-Styles Generation [0.0]
Reinforcement Learning is a promising answer to the need to automate video game testing.
We present CARMI: a Configurable Agent with Relative Metrics as Input.
CARMI is an agent able to emulate players' play-styles, even on previously unseen levels.
arXiv Detail & Related papers (2022-11-29T14:17:20Z) - Multi-Game Decision Transformers [49.257185338595434]
We show that a single transformer-based model can play a suite of up to 46 Atari games simultaneously at close-to-human performance.
We compare several approaches in this multi-game setting, such as online and offline RL methods and behavioral cloning.
We find that our Multi-Game Decision Transformer models offer the best scalability and performance.
arXiv Detail & Related papers (2022-05-30T16:55:38Z) - Deep Policy Networks for NPC Behaviors that Adapt to Changing Design Parameters in Roguelike Games [137.86426963572214]
Turn-based strategy games such as Roguelikes present unique challenges to Deep Reinforcement Learning (DRL).
We propose two network architectures to better handle complex categorical state spaces and to mitigate the need for retraining forced by design decisions.
arXiv Detail & Related papers (2020-12-07T08:47:25Z) - DeepCrawl: Deep Reinforcement Learning for Turn-based Strategy Games [137.86426963572214]
We introduce DeepCrawl, a fully-playable Roguelike prototype for iOS and Android in which all agents are controlled by policy networks trained using Deep Reinforcement Learning (DRL).
Our aim is to understand whether recent advances in DRL can be used to develop convincing behavioral models for non-player characters in videogames.
arXiv Detail & Related papers (2020-12-03T13:53:29Z) - Metagame Autobalancing for Competitive Multiplayer Games [0.10499611180329801]
We present a tool for balancing multi-player games during game design.
Our approach requires a designer to construct an intuitive graphical representation of their meta-game target.
We show the capabilities of this tool on examples inheriting from Rock-Paper-Scissors, and on a more complex asymmetric fighting game.
arXiv Detail & Related papers (2020-06-08T08:55:30Z) - Learning to Simulate Dynamic Environments with GameGAN [109.25308647431952]
In this paper, we aim to learn a simulator by simply watching an agent interact with an environment.
We introduce GameGAN, a generative model that learns to visually imitate a desired game by ingesting screenplay and keyboard actions during training.
arXiv Detail & Related papers (2020-05-25T14:10:17Z) - Disentangling Controllable Object through Video Prediction Improves Visual Reinforcement Learning [82.25034245150582]
In many vision-based reinforcement learning problems, the agent controls a movable object in its visual field.
We propose an end-to-end learning framework to disentangle the controllable object from the observation signal.
The disentangled representation is shown to be useful for RL as additional observation channels to the agent.
arXiv Detail & Related papers (2020-02-21T05:43:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.