Deciphering Digital Detectives: Understanding LLM Behaviors and
Capabilities in Multi-Agent Mystery Games
- URL: http://arxiv.org/abs/2312.00746v2
- Date: Thu, 29 Feb 2024 06:24:28 GMT
- Title: Deciphering Digital Detectives: Understanding LLM Behaviors and
Capabilities in Multi-Agent Mystery Games
- Authors: Dekun Wu, Haochen Shi, Zhiyuan Sun, Bang Liu
- Abstract summary: We introduce the first dataset specifically for Jubensha, including character scripts and game rules.
Our work also presents a unique multi-agent interaction framework using LLMs, allowing AI agents to autonomously engage in this game.
To evaluate the gaming performance of these AI agents, we developed novel methods measuring their mastery of case information and reasoning skills.
- Score: 26.07074182316433
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this study, we explore the application of Large Language Models (LLMs) in
Jubensha, a Chinese detective role-playing game and a novel area in
Artificial Intelligence (AI) driven gaming. We introduce the first dataset
specifically for Jubensha, including character scripts and game rules, to
foster AI agent development in this complex narrative environment. Our work
also presents a unique multi-agent interaction framework using LLMs, allowing
AI agents to autonomously engage in this game. To evaluate the gaming
performance of these AI agents, we developed novel methods measuring their
mastery of case information and reasoning skills. Furthermore, we incorporated
the latest advancements in in-context learning to improve the agents'
performance in information gathering, murderer identification, and logical
reasoning. The experimental results validate the effectiveness of our proposed
methods. This work aims to offer a novel perspective on understanding LLM
capabilities and establish a new benchmark for evaluating large language
model-based agents.
Related papers
- Evaluating and Enhancing LLMs Agent based on Theory of Mind in Guandan: A Multi-Player Cooperative Game under Imperfect Information [36.11862095329315]
Large language models (LLMs) have shown success in handling simple games with imperfect information.
This study investigates the applicability of knowledge acquired by open-source and API-based LLMs to sophisticated text-based games.
arXiv Detail & Related papers (2024-08-05T15:36:46Z)
- From Persona to Personalization: A Survey on Role-Playing Language Agents [52.783043059715546]
Recent advancements in large language models (LLMs) have boosted the rise of Role-Playing Language Agents (RPLAs).
RPLAs achieve a remarkable sense of human likeness and vivid role-playing performance.
They have catalyzed numerous AI applications, such as emotional companions, interactive video games, personalized assistants and copilots.
arXiv Detail & Related papers (2024-04-28T15:56:41Z)
- A Survey on Large Language Model-Based Game Agents [9.892954815419452]
The development of game agents holds a critical role in advancing towards Artificial General Intelligence (AGI).
This paper provides a comprehensive overview of LLM-based game agents from a holistic viewpoint.
arXiv Detail & Related papers (2024-04-02T15:34:18Z)
- Characteristic AI Agents via Large Language Models [40.10858767752735]
This research focuses on investigating the performance of Large Language Models in constructing characteristic AI agents.
A dataset called "Character100" is built for this benchmark, comprising the most-visited people on Wikipedia for language models to role-play.
The experimental results underscore the potential directions for further improvement in the capabilities of LLMs in constructing characteristic AI agents.
arXiv Detail & Related papers (2024-03-19T02:25:29Z)
- Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models [105.39236338147715]
The paper is inspired by the popular language game "Who is Spy".
We develop DEEP to evaluate LLMs' expression and disguising abilities.
We then introduce SpyGame, an interactive multi-agent framework.
arXiv Detail & Related papers (2023-10-31T14:37:42Z)
- LLM-Based Agent Society Investigation: Collaboration and Confrontation in Avalon Gameplay [55.12945794835791]
Using Avalon as a testbed, we employ system prompts to guide LLM agents in gameplay.
We propose a novel framework, tailored for Avalon, that features a multi-agent system facilitating efficient communication and interaction.
Results affirm the framework's effectiveness in creating adaptive agents and suggest LLM-based agents' potential in navigating dynamic social interactions.
arXiv Detail & Related papers (2023-10-23T14:35:26Z)
- An In-depth Survey of Large Language Model-based Artificial Intelligence Agents [11.774961923192478]
We have explored the core differences and characteristics between LLM-based AI agents and traditional AI agents.
We conducted an in-depth analysis of the key components of AI agents, including planning, memory, and tool use.
arXiv Detail & Related papers (2023-09-23T11:25:45Z)
- The Rise and Potential of Large Language Model Based Agents: A Survey [91.71061158000953]
Large language models (LLMs) are regarded as potential sparks for Artificial General Intelligence (AGI).
We start by tracing the concept of agents from its philosophical origins to its development in AI, and explain why LLMs are suitable foundations for agents.
We explore the extensive applications of LLM-based agents in three aspects: single-agent scenarios, multi-agent scenarios, and human-agent cooperation.
arXiv Detail & Related papers (2023-09-14T17:12:03Z)
- Tachikuma: Understading Complex Interactions with Multi-Character and Novel Objects by Large Language Models [67.20964015591262]
We introduce a benchmark named Tachikuma, comprising a Multiple character and novel Object based interaction Estimation task and a supporting dataset.
The dataset captures log data from real-time communications during gameplay, providing diverse, grounded, and complex interactions for further explorations.
We present a simple prompting baseline and evaluate its performance, demonstrating its effectiveness in enhancing interaction understanding.
arXiv Detail & Related papers (2023-07-24T07:40:59Z)
- Deep Reinforcement Learning with Stacked Hierarchical Attention for Text-based Games [64.11746320061965]
We study reinforcement learning for text-based games, which are interactive simulations in the context of natural language.
We aim to conduct explicit reasoning with knowledge graphs for decision making, so that the actions of an agent are generated and supported by an interpretable inference procedure.
We extensively evaluate our method on a number of man-made benchmark games, and the experimental results demonstrate that our method performs better than existing text-based agents.
arXiv Detail & Related papers (2020-10-22T12:40:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.