AMONGAGENTS: Evaluating Large Language Models in the Interactive Text-Based Social Deduction Game
- URL: http://arxiv.org/abs/2407.16521v2
- Date: Wed, 24 Jul 2024 15:12:09 GMT
- Title: AMONGAGENTS: Evaluating Large Language Models in the Interactive Text-Based Social Deduction Game
- Authors: Yizhou Chi, Lingjun Mao, Zineng Tang
- Abstract summary: This paper focuses on creating proxies of human behavior in simulated environments, with Among Us utilized as a tool for studying simulated human behavior.
Our work demonstrates that state-of-the-art large language models (LLMs) can effectively grasp the game rules and make decisions based on the current context.
- Score: 12.384945632524424
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Strategic social deduction games serve as valuable testbeds for evaluating the understanding and inference skills of language models, offering crucial insights into social science, artificial intelligence, and strategic gaming. This paper focuses on creating proxies of human behavior in simulated environments, with Among Us utilized as a tool for studying simulated human behavior. The study introduces a text-based game environment, named AmongAgents, that mirrors the dynamics of Among Us. Players act as crew members aboard a spaceship, tasked with identifying impostors who are sabotaging the ship and eliminating the crew. Within this environment, the behavior of simulated language agents is analyzed. The experiments involve diverse game sequences featuring different configurations of Crewmates and Impostor personality archetypes. Our work demonstrates that state-of-the-art large language models (LLMs) can effectively grasp the game rules and make decisions based on the current context. This work aims to promote further exploration of LLMs in goal-oriented games with incomplete information and complex action spaces, as these settings offer valuable opportunities to assess language model performance in socially driven scenarios.
Related papers
- Understanding Players as if They Are Talking to the Game in a Customized Language: A Pilot Study [3.4333699338998693]
This pilot study explores the application of language models (LMs) to model game event sequences.
We transform raw event data into textual sequences and pretrain a Longformer model on this data.
The results demonstrate the potential of self-supervised LMs in enhancing game design and personalization without relying on ground-truth labels.
arXiv Detail & Related papers (2024-10-24T09:59:10Z)
- Deciphering Digital Detectives: Understanding LLM Behaviors and Capabilities in Multi-Agent Mystery Games [26.07074182316433]
We introduce the first dataset specifically for Jubensha, including character scripts and game rules.
Our work also presents a unique multi-agent interaction framework using LLMs, allowing AI agents to autonomously engage in this game.
To evaluate the gaming performance of these AI agents, we developed novel methods measuring their mastery of case information and reasoning skills.
arXiv Detail & Related papers (2023-12-01T17:33:57Z)
- Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models [105.39236338147715]
The paper is inspired by the popular language game "Who is Spy".
We develop DEEP to evaluate LLMs' expression and disguising abilities.
We then introduce SpyGame, an interactive multi-agent framework.
arXiv Detail & Related papers (2023-10-31T14:37:42Z)
- Character-LLM: A Trainable Agent for Role-Playing [67.35139167985008]
Large language models (LLMs) can be used to serve as agents to simulate human behaviors.
We introduce Character-LLM, which teaches LLMs to act as specific people such as Beethoven, Queen Cleopatra, Julius Caesar, etc.
arXiv Detail & Related papers (2023-10-16T07:58:56Z)
- The Neuro-Symbolic Inverse Planning Engine (NIPE): Modeling Probabilistic Social Inferences from Linguistic Inputs [50.32802502923367]
We study the process of language driving and influencing social reasoning in a probabilistic goal inference domain.
We propose a neuro-symbolic model that carries out goal inference from linguistic inputs of agent scenarios.
Our model closely matches human response patterns and better predicts human judgements than using an LLM alone.
arXiv Detail & Related papers (2023-06-25T19:38:01Z)
- Werewolf Among Us: A Multimodal Dataset for Modeling Persuasion Behaviors in Social Deduction Games [45.55448048482881]
We introduce the first multimodal dataset for modeling persuasion behaviors.
Our dataset includes 199 dialogue transcriptions and videos, 26,647 utterance level annotations of persuasion strategy, and game level annotations of deduction game outcomes.
arXiv Detail & Related papers (2022-12-16T04:52:53Z)
- Deep Reinforcement Learning with Stacked Hierarchical Attention for Text-based Games [64.11746320061965]
We study reinforcement learning for text-based games, which are interactive simulations in the context of natural language.
We aim to conduct explicit reasoning with knowledge graphs for decision making, so that the actions of an agent are generated and supported by an interpretable inference procedure.
We extensively evaluate our method on a number of man-made benchmark games, and the experimental results demonstrate that our method performs better than existing text-based agents.
arXiv Detail & Related papers (2020-10-22T12:40:22Z)
- Learning to Simulate Dynamic Environments with GameGAN [109.25308647431952]
In this paper, we aim to learn a simulator by simply watching an agent interact with an environment.
We introduce GameGAN, a generative model that learns to visually imitate a desired game by ingesting screenplay and keyboard actions during training.
arXiv Detail & Related papers (2020-05-25T14:10:17Z)
- Exploration Based Language Learning for Text-Based Games [72.30525050367216]
This work presents an exploration and imitation-learning-based agent capable of state-of-the-art performance in playing text-based computer games.
Text-based computer games describe their world to the player through natural language and expect the player to interact with the game using text.
These games are of interest as they can be seen as a testbed for language understanding, problem-solving, and language generation by artificial agents.
arXiv Detail & Related papers (2020-01-24T03:03:51Z)