Skill Check: Some Considerations on the Evaluation of Gamemastering Models for Role-playing Games
- URL: http://arxiv.org/abs/2309.13702v2
- Date: Sat, 30 Sep 2023 18:56:04 GMT
- Title: Skill Check: Some Considerations on the Evaluation of Gamemastering Models for Role-playing Games
- Authors: Santiago Góngora, Luis Chiruzzo, Gonzalo Méndez, Pablo Gervás
- Abstract summary: In role-playing games, a Game Master (GM) is the player in charge of the game, who must design the challenges the players face and narrate the outcomes of their actions.
In this work we discuss some of the challenges involved in modeling GMs from an Interactive Storytelling and Natural Language Processing perspective.
Based on those challenges we propose three test categories for evaluating such dialogue systems, and we use them to test ChatGPT, Bard and OpenAssistant as out-of-the-box GMs.
- License: http://creativecommons.org/licenses/by/4.0/
Related papers
- A Dialogue Game for Eliciting Balanced Collaboration [64.61707514432533]
We present a two-player 2D object placement game in which the players must negotiate the goal state themselves.
We show empirically that human players exhibit a variety of role distributions, and that balanced collaboration improves task performance.
arXiv Detail & Related papers (2024-06-12T13:35:10Z)
- GameEval: Evaluating LLMs on Conversational Games [93.40433639746331]
We propose GameEval, a novel approach to evaluating large language models (LLMs)
GameEval treats LLMs as game players and assigns them distinct roles with specific goals achieved by launching conversations of various forms.
We show that GameEval can effectively differentiate the capabilities of various LLMs, providing a comprehensive assessment of their integrated abilities to solve complex problems.
arXiv Detail & Related papers (2023-08-19T14:33:40Z)
- CALYPSO: LLMs as Dungeon Masters' Assistants [46.61924662589895]
Large language models (LLMs) have shown remarkable abilities to generate coherent natural language text.
We introduce CALYPSO, a system of LLM-powered interfaces that support DMs with information and inspiration specific to their own scenario.
When given access to CALYPSO, DMs reported that it generated high-fidelity text suitable for direct presentation to players, and low-fidelity ideas that the DM could develop further while maintaining their creative agency.
arXiv Detail & Related papers (2023-08-15T02:57:00Z)
- Can Large Language Models Play Text Games Well? Current State-of-the-Art and Open Questions [22.669941641551823]
Large language models (LLMs) such as ChatGPT and GPT-4 have recently demonstrated their remarkable abilities of communicating with human users.
We investigate their capacity to play text games, in which a player has to understand the environment and respond to situations through dialogue with the game world.
Our experiments show that ChatGPT performs competitively compared to all the existing systems but still exhibits a low level of intelligence.
arXiv Detail & Related papers (2023-04-06T05:01:28Z)
- Promptable Game Models: Text-Guided Game Simulation via Masked Diffusion Models [68.85478477006178]
We present a Promptable Game Model (PGM) for neural video game simulators.
It allows a user to play the game by prompting it with high- and low-level action sequences.
Most captivatingly, our PGM unlocks the director's mode, where the game is played by specifying goals for the agents in the form of a prompt.
Our method significantly outperforms existing neural video game simulators in terms of rendering quality and unlocks applications beyond the capabilities of the current state of the art.
arXiv Detail & Related papers (2023-03-23T17:43:17Z)
- I Cast Detect Thoughts: Learning to Converse and Guide with Intents and Theory-of-Mind in Dungeons and Dragons [82.28503603235364]
We study teacher-student natural language interactions in a goal-driven environment in Dungeons and Dragons.
Our approach is to decompose and model these interactions into (1) the Dungeon Master's intent to guide players toward a given goal; (2) the DM's guidance utterance to the players expressing this intent; and (3) a theory-of-mind (ToM) model that anticipates the players' reaction to the guidance one turn into the future.
arXiv Detail & Related papers (2022-12-20T08:06:55Z)
- Dungeons and Dragons as a Dialog Challenge for Artificial Intelligence [28.558934742150022]
We frame D&D as a dialogue system challenge, where the tasks are to both generate the next conversational turn in the game and predict the state of the game given the dialogue history.
We create a gameplay dataset consisting of nearly 900 games, with a total of 7,000 players, 800,000 dialogue turns, 500,000 dice rolls, and 58 million words.
We train a large language model (LM) to generate the next game turn, conditioning it on different information.
arXiv Detail & Related papers (2022-10-13T15:43:39Z)
- Immersive Text Game and Personality Classification [1.9171404264679484]
Immersive Text Game allows the player to choose a story and a character, and interact with other characters in the story in an immersive manner.
The game builds on several recent models, including a text generation language model, an information extraction model, a commonsense reasoning model, and a psychology evaluation model.
arXiv Detail & Related papers (2022-03-20T18:37:03Z)
- Keep CALM and Explore: Language Models for Action Generation in Text-based Games [27.00685301984832]
We propose the Contextual Action Language Model (CALM) to generate a compact set of action candidates at each game state.
We combine CALM with a reinforcement learning agent which re-ranks the generated action candidates to maximize in-game rewards.
arXiv Detail & Related papers (2020-10-06T17:36:29Z)
- ConvAI3: Generating Clarifying Questions for Open-Domain Dialogue Systems (ClariQ) [64.60303062063663]
This document presents a detailed description of the challenge on clarifying questions for dialogue systems (ClariQ).
The challenge is organized as part of the Conversational AI challenge series (ConvAI3) at Search Oriented Conversational AI (SCAI) EMNLP workshop in 2020.
arXiv Detail & Related papers (2020-09-23T19:48:02Z)
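One recurring pattern in the list above is illustrated by the CALM entry: a language model proposes a compact set of candidate actions for the current game state, and a reinforcement learning agent re-ranks those candidates to maximize in-game reward. The following is a minimal sketch of that generate-then-rerank loop, not the papers' code; the functions `generate_candidates` and `q_value` are hypothetical stand-ins for the actual trained models.

```python
def generate_candidates(observation: str, k: int = 4) -> list[str]:
    """Stand-in for a contextual action language model.

    A real system would sample k plausible actions from a model
    conditioned on `observation`; here we return a fixed set.
    """
    return ["open door", "take lamp", "go north", "read note"][:k]


def q_value(observation: str, action: str) -> float:
    """Stand-in for an RL agent's action-value estimate.

    Toy heuristic (prefer shorter actions) just to keep the
    example self-contained and runnable.
    """
    return -len(action)


def act(observation: str) -> str:
    """Generate candidates with the LM, then re-rank with the agent."""
    candidates = generate_candidates(observation)
    return max(candidates, key=lambda a: q_value(observation, a))


print(act("You are in a dimly lit room. There is a door to the north."))
```

The division of labor is the point of the pattern: the language model restricts an unbounded action space to a few fluent candidates, while the value function supplies the task-specific preference among them.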
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.