LARP: Language-Agent Role Play for Open-World Games
- URL: http://arxiv.org/abs/2312.17653v1
- Date: Sun, 24 Dec 2023 10:08:59 GMT
- Title: LARP: Language-Agent Role Play for Open-World Games
- Authors: Ming Yan, Ruihao Li, Hao Zhang, Hao Wang, Zhilan Yang, Ji Yan
- Abstract summary: Language Agent for Role-Playing (LARP) includes a cognitive architecture that encompasses memory processing and a decision-making assistant.
The framework refines interactions between users and agents that are predefined with unique backgrounds and personalities.
It highlights the diverse uses of language models in a range of areas such as entertainment, education, and various simulation scenarios.
- Score: 19.80040627487576
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Language agents have shown impressive problem-solving skills within defined
settings and brief timelines. Yet, with the ever-evolving complexities of
open-world simulations, there's a pressing need for agents that can flexibly
adapt to complex environments and consistently maintain a long-term memory to
ensure coherent actions. To bridge the gap between language agents and
open-world games, we introduce Language Agent for Role-Playing (LARP), which
includes a cognitive architecture that encompasses memory processing and a
decision-making assistant, an environment interaction module with a
feedback-driven learnable action space, and a postprocessing method that
promotes the alignment of various personalities. The LARP framework refines
interactions between users and agents, predefined with unique backgrounds and
personalities, ultimately enhancing the gaming experience in open-world
contexts. Furthermore, it highlights the diverse uses of language models in a
range of areas such as entertainment, education, and various simulation
scenarios. The project page is released at https://miao-ai-lab.github.io/LARP/.
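The abstract names LARP's components (memory processing, a decision-making assistant, an environment interaction module with a feedback-driven learnable action space, and personality-alignment postprocessing) without specifying their interfaces. The Python sketch below is only one possible reading of that description, written under stated assumptions: every class, method, and prompt here (Persona, RolePlayAgent, incorporate_feedback, the llm callable) is a hypothetical placeholder, not the authors' implementation.
```python
"""Illustrative sketch only: the LARP abstract names its components but this
file invents all APIs. Nothing here reflects the authors' actual code."""

from dataclasses import dataclass, field


@dataclass
class Persona:
    """Predefined background and personality for one agent (hypothetical)."""
    name: str
    background: str
    traits: list[str] = field(default_factory=list)


class RolePlayAgent:
    """Minimal loop mirroring the components the abstract lists."""

    def __init__(self, persona: Persona, llm, action_space: set[str]):
        self.persona = persona
        self.llm = llm                      # any text-completion callable
        self.memory: list[str] = []         # stand-in for memory processing
        self.action_space = action_space    # feedback-driven, so it can grow

    def step(self, observation: str) -> str:
        # 1. Memory processing: recall a few recent events (naive recency here).
        recalled = self.memory[-5:]
        # 2. Decision-making assistant: ask the LLM for an action proposal.
        prompt = (
            f"You are {self.persona.name}, {self.persona.background}. "
            f"Traits: {', '.join(self.persona.traits)}.\n"
            f"Recent memory: {recalled}\nObservation: {observation}\n"
            f"Choose one action from {sorted(self.action_space)}."
        )
        proposal = self.llm(prompt).strip()
        # 3. Postprocessing: keep the reply inside the current action space so
        #    the agent stays in character and in bounds; otherwise fall back.
        action = proposal if proposal in self.action_space else "wait"
        self.memory.append(f"saw '{observation}', did '{action}'")
        return action

    def incorporate_feedback(self, new_action: str) -> None:
        # Environment feedback can expand the learnable action space.
        self.action_space.add(new_action)
```
In this reading, the "feedback-driven learnable action space" is modeled as a plain set that grows when the environment confirms a new skill; the system described in the paper is presumably far richer than this toy loop.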
Related papers
- Autonomous Workflow for Multimodal Fine-Grained Training Assistants Towards Mixed Reality [28.27036270001756]
This work designs an autonomous workflow tailored for integrating AI agents seamlessly into extended reality (XR) applications for fine-grained training.
We present a demonstration of a multimodal fine-grained training assistant for LEGO brick assembly in a pilot XR environment.
arXiv Detail & Related papers (2024-05-16T14:20:30Z)
- Scaling Instructable Agents Across Many Simulated Worlds [70.97268311053328]
Our goal is to develop an agent that can accomplish anything a human can do in any simulated 3D environment.
Our approach focuses on language-driven generality while imposing minimal assumptions.
Our agents interact with environments in real-time using a generic, human-like interface.
arXiv Detail & Related papers (2024-03-13T17:50:32Z)
- MEIA: Multimodal Embodied Perception and Interaction in Unknown Environments [82.67236400004826]
We introduce the Multimodal Embodied Interactive Agent (MEIA), capable of translating high-level tasks expressed in natural language into a sequence of executable actions.
The MEM module enables MEIA to generate executable action plans based on diverse requirements and the robot's capabilities.
arXiv Detail & Related papers (2024-02-01T02:43:20Z)
- Learning to Model the World with Language [100.76069091703505]
To interact with humans and act in the world, agents need to understand the range of language that people use and relate it to the visual world.
Our key idea is that agents should interpret such diverse language as a signal that helps them predict the future.
We instantiate this in Dynalang, an agent that learns a multimodal world model to predict future text and image representations.
arXiv Detail & Related papers (2023-07-31T17:57:49Z)
- Tachikuma: Understanding Complex Interactions with Multi-Character and Novel Objects by Large Language Models [67.20964015591262]
We introduce a benchmark named Tachikuma, comprising a Multiple character and novel Object based interaction Estimation task and a supporting dataset.
The dataset captures log data from real-time communications during gameplay, providing diverse, grounded, and complex interactions for further explorations.
We present a simple prompting baseline and evaluate its performance, demonstrating its effectiveness in enhancing interaction understanding.
arXiv Detail & Related papers (2023-07-24T07:40:59Z)
- Inner Monologue: Embodied Reasoning through Planning with Language Models [81.07216635735571]
Large Language Models (LLMs) can be applied to domains beyond natural language processing.
LLMs planning in embodied environments need to consider not just what skills to do, but also how and when to do them.
We propose that by leveraging environment feedback, LLMs are able to form an inner monologue that allows them to more richly process and plan in robotic control scenarios (a minimal sketch of this feedback loop appears after this list).
arXiv Detail & Related papers (2022-07-12T15:20:48Z)
- VisualHints: A Visual-Lingual Environment for Multimodal Reinforcement Learning [14.553086325168803]
We present VisualHints, a novel environment for multimodal reinforcement learning (RL) involving text-based interactions along with visual hints (obtained from the environment).
We introduce an extension of the TextWorld cooking environment with the addition of visual clues interspersed throughout the environment.
The goal is to force an RL agent to use both text and visual features to predict natural language action commands for solving the final task of cooking a meal.
arXiv Detail & Related papers (2020-10-26T18:51:02Z)
- Learning to Simulate Dynamic Environments with GameGAN [109.25308647431952]
In this paper, we aim to learn a simulator by simply watching an agent interact with an environment.
We introduce GameGAN, a generative model that learns to visually imitate a desired game by ingesting screenplay and keyboard actions during training.
arXiv Detail & Related papers (2020-05-25T14:10:17Z)
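The Inner Monologue entry above describes LLMs that replan by folding environment feedback into their own reasoning. The sketch below illustrates that general closed-loop prompting pattern; every name in it (inner_monologue_loop, llm, env) is a hypothetical placeholder under my own assumptions, not the paper's implementation.
```python
"""A generic closed-loop prompting pattern in the spirit of the Inner Monologue
entry above. All names are hypothetical; this is not the paper's code."""


def inner_monologue_loop(llm, env, task: str, max_steps: int = 10) -> list[str]:
    """llm: callable prompt -> text; env: callable action -> textual feedback."""
    transcript = [f"Task: {task}"]
    actions: list[str] = []
    for _ in range(max_steps):
        # Ask the model for the next action given everything observed so far.
        action = llm("\n".join(transcript) + "\nNext action:").strip()
        actions.append(action)
        # Append the environment's textual feedback so the model can replan.
        feedback = env(action)
        transcript += [f"Action: {action}", f"Feedback: {feedback}"]
        if "success" in feedback.lower():
            break
    return actions
```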