Tachikuma: Understanding Complex Interactions with Multi-Character and
Novel Objects by Large Language Models
- URL: http://arxiv.org/abs/2307.12573v1
- Date: Mon, 24 Jul 2023 07:40:59 GMT
- Title: Tachikuma: Understanding Complex Interactions with Multi-Character and
Novel Objects by Large Language Models
- Authors: Yuanzhi Liang, Linchao Zhu, Yi Yang
- Abstract summary: We introduce a benchmark named Tachikuma, comprising a Multiple character and novel Object based interaction Estimation (MOE) task and a supporting dataset.
The dataset captures log data from real-time communications during gameplay, providing diverse, grounded, and complex interactions for further exploration.
We present a simple prompting baseline and evaluate its performance, demonstrating its effectiveness in enhancing interaction understanding.
- Score: 67.20964015591262
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advancements in natural language processing and Large Language Models (LLMs) have
enabled AI agents to simulate human-like interactions within virtual worlds.
However, these interactions still face limitations in complexity and
flexibility, particularly in scenarios involving multiple characters and novel
objects. Pre-defining all interactable objects in the agent's world model
presents challenges, and conveying implicit intentions to multiple characters
through complex interactions remains difficult. To address these issues, we
propose integrating virtual Game Masters (GMs) into the agent's world model,
drawing inspiration from Tabletop Role-Playing Games (TRPGs). GMs play a
crucial role in overseeing information, estimating players' intentions,
providing environment descriptions, and offering feedback, compensating for
current world model deficiencies. To facilitate future exploration of complex
interactions, we introduce a benchmark named Tachikuma, comprising a Multiple
character and novel Object based interaction Estimation (MOE) task and a
supporting dataset. MOE challenges models to understand characters' intentions
and accurately determine their actions within intricate contexts involving
multi-character and novel object interactions. In addition, the dataset captures
log data from real-time communications during gameplay, providing diverse,
grounded, and complex interactions for further exploration. Finally, we
present a simple prompting baseline and evaluate its performance, demonstrating
its effectiveness in enhancing interaction understanding. We hope that our
dataset and task will inspire further research in complex interactions with
natural language, fostering the development of more advanced AI agents.
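To make the "simple prompting baseline" concrete, the sketch below assembles a TRPG-style gameplay log into a single Game-Master-style prompt and asks an LLM to estimate each character's intention and intended action. This is a minimal sketch of the general idea, not the paper's exact baseline: the prompt wording, the "Character -> intention; action" answer format, and the `complete` completion function are all illustrative assumptions.

```python
# Minimal sketch of a prompting baseline for an MOE-style task: given a
# TRPG gameplay log, ask an LLM to act as the Game Master and estimate each
# character's implicit intention and attempted action. The log format,
# prompt wording, and `complete` helper are illustrative assumptions,
# not the paper's exact setup.

from typing import Callable

GM_INSTRUCTIONS = (
    "You are the Game Master of a tabletop role-playing game. "
    "Read the gameplay log below. For each character, state (1) the "
    "character's implicit intention and (2) the concrete action they are "
    "attempting, including any novel objects involved."
)

def build_moe_prompt(log_lines: list[str], characters: list[str]) -> str:
    """Assemble a single prompt from raw log lines and the character roster."""
    log_block = "\n".join(log_lines)
    roster = ", ".join(characters)
    return (
        f"{GM_INSTRUCTIONS}\n\n"
        f"Characters: {roster}\n\n"
        f"Gameplay log:\n{log_block}\n\n"
        "Answer as 'Character -> intention; action' lines."
    )

def estimate_interactions(
    log_lines: list[str],
    characters: list[str],
    complete: Callable[[str], str],  # any LLM completion function
) -> dict[str, str]:
    """Query the LLM once and parse its per-character answers."""
    raw = complete(build_moe_prompt(log_lines, characters))
    answers: dict[str, str] = {}
    for line in raw.splitlines():
        if "->" in line:
            name, rest = line.split("->", 1)
            answers[name.strip()] = rest.strip()
    return answers
```

Any completion backend can be passed in as `complete`, e.g. a thin wrapper around a chat API that returns the model's text, which keeps the baseline independent of a particular LLM provider.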
Related papers
- Versatile Motion Language Models for Multi-Turn Interactive Agents [28.736843383405603]
We introduce the Versatile Interactive Motion language model, which integrates both language and motion modalities.
We evaluate the versatility of our method across motion-related tasks: motion-to-text, text-to-motion, reaction generation, motion editing, and reasoning about motion sequences.
arXiv Detail & Related papers (2024-10-08T02:23:53Z)
- A Survey on Complex Tasks for Goal-Directed Interactive Agents [60.53915548970061]
This survey compiles relevant tasks and environments for evaluating goal-directed interactive agents.
An up-to-date compilation of relevant resources can be found on our project website.
arXiv Detail & Related papers (2024-09-27T08:17:53Z)
- From Persona to Personalization: A Survey on Role-Playing Language Agents [52.783043059715546]
Recent advancements in large language models (LLMs) have boosted the rise of Role-Playing Language Agents (RPLAs).
RPLAs achieve a remarkable sense of human likeness and vivid role-playing performance.
They have catalyzed numerous AI applications, such as emotional companions, interactive video games, personalized assistants and copilots.
arXiv Detail & Related papers (2024-04-28T15:56:41Z)
- PLAYER*: Enhancing LLM-based Multi-Agent Communication and Interaction in Murder Mystery Games [18.383262467079078]
PLAYER* enhances path planning in Murder Mystery Games (MMGs) using an anytime sampling-based planner and a questioning-driven search framework.
By equipping agents with a set of sensors, PLAYER* eliminates the need for pre-defined questions and enables agents to navigate complex social interactions.
We also introduce a quantifiable evaluation method based on multiple-choice questions and present WellPlay, a dataset containing 1,482 question-answer pairs.
arXiv Detail & Related papers (2024-04-26T19:07:30Z)
- Scaling Instructable Agents Across Many Simulated Worlds [70.97268311053328]
Our goal is to develop an agent that can accomplish anything a human can do in any simulated 3D environment.
Our approach focuses on language-driven generality while imposing minimal assumptions.
Our agents interact with environments in real-time using a generic, human-like interface.
arXiv Detail & Related papers (2024-03-13T17:50:32Z)
- LARP: Language-Agent Role Play for Open-World Games [19.80040627487576]
Language Agent for Role-Playing (LARP) is a cognitive architecture that encompasses memory processing and a decision-making assistant.
The framework refines interactions between users and agents that are predefined with unique backgrounds and personalities.
It highlights the diverse uses of language models in a range of areas such as entertainment, education, and various simulation scenarios.
arXiv Detail & Related papers (2023-12-24T10:08:59Z)
- Interactive Natural Language Processing [67.87925315773924]
Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within the field of NLP.
This paper offers a comprehensive survey of iNLP, starting by proposing a unified definition and framework of the concept.
arXiv Detail & Related papers (2023-05-22T17:18:29Z)
- Knowledge-enhanced Agents for Interactive Text Games [16.055119735473017]
We propose a knowledge-injection framework for improved functional grounding of agents in text-based games.
We consider two forms of domain knowledge that we inject into learning-based agents: memory of previous correct actions and affordances of relevant objects in the environment.
Our framework supports two representative model classes: reinforcement learning agents and language model agents.
arXiv Detail & Related papers (2023-05-08T23:31:39Z)
- Chat with the Environment: Interactive Multimodal Perception Using Large Language Models [19.623070762485494]
Large Language Models (LLMs) have shown remarkable reasoning ability in few-shot robotic planning.
Our study demonstrates that LLMs can provide high-level planning and reasoning skills and control interactive robot behavior in a multimodal environment.
arXiv Detail & Related papers (2023-03-14T23:01:27Z)
- SPA: Verbal Interactions between Agents and Avatars in Shared Virtual Environments using Propositional Planning [61.335252950832256]
Sense-Plan-Ask, or SPA, generates plausible verbal interactions between virtual human-like agents and user avatars in shared virtual environments.
We find that our algorithm incurs only a small runtime cost and enables agents to complete their goals more effectively than agents without the ability to leverage natural-language communication.
arXiv Detail & Related papers (2020-02-08T23:15:06Z)