LLM-Powered Hierarchical Language Agent for Real-time Human-AI
Coordination
- URL: http://arxiv.org/abs/2312.15224v2
- Date: Tue, 9 Jan 2024 06:23:44 GMT
- Title: LLM-Powered Hierarchical Language Agent for Real-time Human-AI
Coordination
- Authors: Jijia Liu, Chao Yu, Jiaxuan Gao, Yuqing Xie, Qingmin Liao, Yi Wu, Yu
Wang
- Abstract summary: We propose a Hierarchical Language Agent (HLA) for human-AI coordination.
HLA provides strong reasoning abilities while maintaining real-time execution.
Human studies show that HLA outperforms other baseline agents, including slow-mind-only agents and fast-mind-only agents.
- Score: 28.22553394518179
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: AI agents powered by Large Language Models (LLMs) have made significant
advances, enabling them to assist humans in diverse complex tasks and leading
to a revolution in human-AI coordination. LLM-powered agents typically require
invoking LLM APIs and employing artificially designed complex prompts, which
results in high inference latency. While this paradigm works well in scenarios
with minimal interactive demands, such as code generation, it is unsuitable for
highly interactive and real-time applications, such as gaming. Traditional
gaming AI often employs small models or reactive policies, enabling fast
inference but offering limited task completion and interaction abilities. In
this work, we use Overcooked as our testbed, where players can
communicate in natural language and cooperate to serve orders. We propose a
Hierarchical Language Agent (HLA) for human-AI coordination that provides
strong reasoning abilities while maintaining real-time execution. In particular,
HLA adopts a hierarchical framework and comprises three modules: a proficient
LLM, referred to as Slow Mind, for intention reasoning and language
interaction, a lightweight LLM, referred to as Fast Mind, for generating macro
actions, and a reactive policy, referred to as Executor, for transforming macro
actions into atomic actions. Human studies show that HLA outperforms other
baseline agents, including slow-mind-only agents and fast-mind-only agents,
with stronger cooperation abilities, faster responses, and more consistent
language communications.
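The three-module split described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `HierarchicalAgent`, the stub functions, the macro-action names, and the choice to call the Slow Mind only when a human message arrives are all assumptions made for illustration. In a real system the two "mind" callables would wrap LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# All names below are hypothetical; the abstract does not specify prompts,
# models, or action sets.

@dataclass
class HierarchicalAgent:
    slow_mind: Callable[[str, str], str]   # (state, human message) -> intention
    fast_mind: Callable[[str, str], str]   # (state, intention) -> macro action
    executor: Callable[[str], List[str]]   # macro action -> atomic actions
    intention: str = "no instruction yet"

    def on_message(self, state: str, message: str) -> None:
        """Slow path (assumed): run the expensive reasoner only on new messages."""
        self.intention = self.slow_mind(state, message)

    def step(self, state: str) -> List[str]:
        """Fast path: pick a macro action, then expand it into atomic actions."""
        macro = self.fast_mind(state, self.intention)
        return self.executor(macro)


def stub_slow_mind(state: str, message: str) -> str:
    # Placeholder for a call to a large, proficient LLM.
    return f"human wants: {message.strip().lower()}"


def stub_fast_mind(state: str, intention: str) -> str:
    # Placeholder for a call to a lightweight LLM.
    return "cook_soup" if "soup" in intention else "idle"


# Placeholder reactive policy: a fixed lookup from macro to atomic actions.
MACRO_TO_ATOMIC: Dict[str, List[str]] = {
    "cook_soup": ["move_to_pot", "interact"],
    "idle": ["stay"],
}


def stub_executor(macro: str) -> List[str]:
    return MACRO_TO_ATOMIC.get(macro, ["stay"])


agent = HierarchicalAgent(stub_slow_mind, stub_fast_mind, stub_executor)
before = agent.step("kitchen")              # no instruction yet
agent.on_message("kitchen", "Please make Soup")
after = agent.step("kitchen")               # intention now mentions soup
print(before, after)
```

Keeping the slow call out of `step` is the point of the sketch: the per-tick path touches only the lightweight model and the reactive policy, so a slow reasoning call does not have to block action selection.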
Related papers
- Two Heads Are Better Than One: Collaborative LLM Embodied Agents for Human-Robot Interaction [1.6574413179773757]
Large language models (LLMs) should be able to leverage their large breadth of understanding to interpret natural language commands.
However, these models suffer from hallucinations, which may cause safety issues or deviations from the task.
In this research, multiple collaborative AI systems were tested against a single independent AI agent to determine whether the success in other domains would translate into improved human-robot interaction performance.
arXiv Detail & Related papers (2024-11-23T02:47:12Z)
- Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence [79.5316642687565]
Existing multi-agent frameworks often struggle with integrating diverse capable third-party agents.
We propose the Internet of Agents (IoA), a novel framework that addresses these limitations.
IoA introduces an agent integration protocol, an instant-messaging-like architecture design, and dynamic mechanisms for agent teaming and conversation flow control.
arXiv Detail & Related papers (2024-07-09T17:33:24Z)
- Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization [53.510942601223626]
Large Language Models (LLMs) exhibit robust problem-solving capabilities for diverse tasks.
These task solvers necessitate manually crafted prompts to inform task rules and regulate behaviors.
We propose Agent-Pro: an LLM-based Agent with Policy-level Reflection and Optimization.
arXiv Detail & Related papers (2024-02-27T15:09:20Z)
- Procedural Adherence and Interpretability Through Neuro-Symbolic Generative Agents [0.9886108751871757]
We propose a combination of formal logic-based program synthesis and LLM content generation to bring guarantees of procedural adherence and interpretability to generative agent behavior.
To illustrate the benefit of procedural adherence and interpretability, we use Temporal Stream Logic (TSL) to generate an automaton that enforces an interpretable, high-level temporal structure on an agent.
arXiv Detail & Related papers (2024-02-24T21:36:26Z)
- LLMind: Orchestrating AI and IoT with LLM for Complex Task Execution [18.816077341295628]
We present LLMind, a task-oriented AI framework that enables effective collaboration among IoT devices.
Inspired by the functional specialization theory of the brain, our framework integrates an LLM with domain-specific AI modules.
Complex tasks, which may involve collaborations of multiple domain-specific AI modules and IoT devices, are executed through a control script.
arXiv Detail & Related papers (2023-12-14T14:57:58Z)
- MetaAgents: Simulating Interactions of Human Behaviors for LLM-based Task-oriented Coordination via Collaborative Generative Agents [27.911816995891726]
We introduce collaborative generative agents, endowing LLM-based Agents with consistent behavior patterns and task-solving abilities.
We propose a novel framework that equips collaborative generative agents with human-like reasoning abilities and specialized skills.
Our work provides valuable insights into the role and evolution of Large Language Models in task-oriented social simulations.
arXiv Detail & Related papers (2023-10-10T10:17:58Z)
- Cooperation, Competition, and Maliciousness: LLM-Stakeholders Interactive Negotiation [52.930183136111864]
We propose using scorable negotiation to evaluate Large Language Models (LLMs).
To reach an agreement, agents must have strong arithmetic, inference, exploration, and planning capabilities.
We provide procedures to create new games and increase games' difficulty to have an evolving benchmark.
arXiv Detail & Related papers (2023-09-29T13:33:06Z)
- The Rise and Potential of Large Language Model Based Agents: A Survey [91.71061158000953]
Large language models (LLMs) are regarded as potential sparks for Artificial General Intelligence (AGI).
We start by tracing the concept of agents from its philosophical origins to its development in AI, and explain why LLMs are suitable foundations for agents.
We explore the extensive applications of LLM-based agents in three aspects: single-agent scenarios, multi-agent scenarios, and human-agent cooperation.
arXiv Detail & Related papers (2023-09-14T17:12:03Z)
- Building Cooperative Embodied Agents Modularly with Large Language Models [104.57849816689559]
We address challenging multi-agent cooperation problems with decentralized control, raw sensory observations, costly communication, and multi-objective tasks instantiated in various embodied environments.
We harness the commonsense knowledge, reasoning ability, language comprehension, and text generation prowess of LLMs and seamlessly incorporate them into a cognitive-inspired modular framework.
Our experiments on C-WAH and TDW-MAT demonstrate that CoELA driven by GPT-4 can surpass strong planning-based methods and exhibit emergent effective communication.
arXiv Detail & Related papers (2023-07-05T17:59:27Z)
- Neuro-Symbolic Causal Language Planning with Commonsense Prompting [67.06667162430118]
Language planning aims to implement complex high-level goals by decomposition into simpler low-level steps.
Previous methods require either manual exemplars or annotated programs to acquire such ability from large language models.
This paper proposes Neuro-Symbolic Causal Language Planner (CLAP) that elicits procedural knowledge from the LLMs with commonsense-infused prompting.
arXiv Detail & Related papers (2022-06-06T22:09:52Z)