Building Cooperative Embodied Agents Modularly with Large Language
Models
- URL: http://arxiv.org/abs/2307.02485v2
- Date: Sat, 17 Feb 2024 05:27:56 GMT
- Title: Building Cooperative Embodied Agents Modularly with Large Language
Models
- Authors: Hongxin Zhang, Weihua Du, Jiaming Shan, Qinhong Zhou, Yilun Du, Joshua
B. Tenenbaum, Tianmin Shu, Chuang Gan
- Abstract summary: We address challenging multi-agent cooperation problems with decentralized control, raw sensory observations, costly communication, and multi-objective tasks instantiated in various embodied environments.
We harness the commonsense knowledge, reasoning ability, language comprehension, and text generation prowess of LLMs and seamlessly incorporate them into a cognitive-inspired modular framework.
Our experiments on C-WAH and TDW-MAT demonstrate that CoELA driven by GPT-4 can surpass strong planning-based methods and exhibit emergent effective communication.
- Score: 104.57849816689559
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we address challenging multi-agent cooperation problems with
decentralized control, raw sensory observations, costly communication, and
multi-objective tasks instantiated in various embodied environments. While
previous research either presupposes a cost-free communication channel or
relies on a centralized controller with shared observations, we harness the
commonsense knowledge, reasoning ability, language comprehension, and text
generation prowess of LLMs and seamlessly incorporate them into a
cognitive-inspired modular framework that integrates with perception, memory,
and execution, thus building a Cooperative Embodied Language Agent (CoELA) that
can plan, communicate, and cooperate with others to accomplish long-horizon
tasks efficiently. Our experiments on C-WAH and TDW-MAT demonstrate that CoELA
driven by GPT-4 can surpass strong planning-based methods and exhibit emergent
effective communication. Though current Open LMs like LLAMA-2 still
underperform, we fine-tune a CoELA with data collected with our agents and show
how they can achieve promising performance. We also conducted a user study for
human-agent interaction and discovered that CoELA communicating in natural
language can earn more trust and cooperate more effectively with humans. Our
research underscores the potential of LLMs for future research in multi-agent
cooperation. Videos can be found on the project website
https://vis-www.cs.umass.edu/Co-LLM-Agents/.
Related papers
- Mutual Theory of Mind in Human-AI Collaboration: An Empirical Study with LLM-driven AI Agents in a Real-time Shared Workspace Task [56.92961847155029]
Theory of Mind (ToM) significantly impacts human collaboration and communication as a crucial capability to understand others.
Mutual Theory of Mind (MToM) arises when AI agents with ToM capability collaborate with humans.
We find that the agent's ToM capability does not significantly impact team performance but enhances human understanding of the agent.
arXiv Detail & Related papers (2024-09-13T13:19:48Z)
- Your Co-Workers Matter: Evaluating Collaborative Capabilities of Language Models in Blocks World [13.005764902339523]
We design a blocks-world environment where two agents, each having unique goals and skills, build a target structure together.
To complete the goals, they can act in the world and communicate in natural language.
We adopt chain-of-thought prompts that include intermediate reasoning steps to model the partner's state and identify and correct execution errors.
arXiv Detail & Related papers (2024-03-30T04:48:38Z)
- Embodied LLM Agents Learn to Cooperate in Organized Teams [46.331162216503344]
Large Language Models (LLMs) have emerged as integral tools for reasoning, planning, and decision-making.
This paper introduces a framework that imposes prompt-based organization structures on LLM agents to mitigate these problems.
arXiv Detail & Related papers (2024-03-19T06:39:47Z)
- Cooperation on the Fly: Exploring Language Agents for Ad Hoc Teamwork in the Avalon Game [25.823665278297057]
This study focuses on the ad hoc teamwork problem where the agent operates in an environment driven by natural language.
Our findings reveal the potential of LLM agents in team collaboration, highlighting issues related to hallucinations in communication.
To address this issue, we develop CodeAct, a general agent that equips an LLM with enhanced memory and code-driven reasoning.
arXiv Detail & Related papers (2023-12-29T08:26:54Z)
- Large Language Model Enhanced Multi-Agent Systems for 6G Communications [94.45712802626794]
We propose a multi-agent system with customized communication knowledge and tools for solving communication related tasks using natural language.
We validate the effectiveness of the proposed multi-agent system by designing a semantic communication system.
arXiv Detail & Related papers (2023-12-13T02:35:57Z)
- MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration [102.41118020705876]
Large Language Models (LLMs) have marked a significant advancement in the field of natural language processing.
As their applications extend into multi-agent environments, a need has arisen for a comprehensive evaluation framework.
This work introduces a novel benchmarking framework specifically tailored to assess LLMs within multi-agent settings.
arXiv Detail & Related papers (2023-11-14T21:46:27Z)
- Cooperation, Competition, and Maliciousness: LLM-Stakeholders Interactive Negotiation [52.930183136111864]
We propose using scorable negotiation to evaluate Large Language Models (LLMs).
To reach an agreement, agents must have strong arithmetic, inference, exploration, and planning capabilities.
We provide procedures to create new games and increase games' difficulty to have an evolving benchmark.
arXiv Detail & Related papers (2023-09-29T13:33:06Z)
- CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society [58.04479313658851]
This paper explores the potential of building scalable techniques to facilitate autonomous cooperation among communicative agents.
We propose a novel communicative agent framework named role-playing.
Our contributions include introducing a novel communicative agent framework, offering a scalable approach for studying the cooperative behaviors and capabilities of multi-agent systems.
arXiv Detail & Related papers (2023-03-31T01:09:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.