Related papers: Building Cooperative Embodied Agents Modularly with Large Language Models

Building Cooperative Embodied Agents Modularly with Large Language Models

URL: http://arxiv.org/abs/2307.02485v2
Date: Sat, 17 Feb 2024 05:27:56 GMT
Title: Building Cooperative Embodied Agents Modularly with Large Language Models
Authors: Hongxin Zhang, Weihua Du, Jiaming Shan, Qinhong Zhou, Yilun Du, Joshua B. Tenenbaum, Tianmin Shu, Chuang Gan
Abstract summary: We address challenging multi-agent cooperation problems with decentralized control, raw sensory observations, costly communication, and multi-objective tasks instantiated in various embodied environments. We harness the commonsense knowledge, reasoning ability, language comprehension, and text generation prowess of LLMs and seamlessly incorporate them into a cognitive-inspired modular framework. Our experiments on C-WAH and TDW-MAT demonstrate that CoELA driven by GPT-4 can surpass strong planning-based methods and exhibit emergent effective communication.
Score: 104.57849816689559
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this work, we address challenging multi-agent cooperation problems with decentralized control, raw sensory observations, costly communication, and multi-objective tasks instantiated in various embodied environments. While previous research either presupposes a cost-free communication channel or relies on a centralized controller with shared observations, we harness the commonsense knowledge, reasoning ability, language comprehension, and text generation prowess of LLMs and seamlessly incorporate them into a cognitive-inspired modular framework that integrates with perception, memory, and execution. Thus building a Cooperative Embodied Language Agent CoELA, who can plan, communicate, and cooperate with others to accomplish long-horizon tasks efficiently. Our experiments on C-WAH and TDW-MAT demonstrate that CoELA driven by GPT-4 can surpass strong planning-based methods and exhibit emergent effective communication. Though current Open LMs like LLAMA-2 still underperform, we fine-tune a CoELA with data collected with our agents and show how they can achieve promising performance. We also conducted a user study for human-agent interaction and discovered that CoELA communicating in natural language can earn more trust and cooperate more effectively with humans. Our research underscores the potential of LLMs for future research in multi-agent cooperation. Videos can be found on the project website https://vis-www.cs.umass.edu/Co-LLM-Agents/.

Related papers

Collaborating Action by Action: A Multi-agent LLM Framework for Embodied Reasoning [12.923902619187274]
This work studies how LLMs can adaptively collaborate to perform complex embodied reasoning tasks. MINDcraft is a platform built to enable LLM agents to control characters in the open-world game of Minecraft. An experimental study finds that the primary bottleneck in collaborating effectively for current state-of-the-art agents is efficient natural language communication.
arXiv Detail & Related papers (2025-04-24T21:28:16Z)
Collaborative Gym: A Framework for Enabling and Evaluating Human-Agent Collaboration [51.452664740963066]
Collaborative Gym is a framework enabling asynchronous, tripartite interaction among agents, humans, and task environments. We instantiate Co-Gym with three representative tasks in both simulated and real-world conditions. Our findings reveal that collaborative agents consistently outperform their fully autonomous counterparts in task performance.
arXiv Detail & Related papers (2024-12-20T09:21:15Z)
Mutual Theory of Mind in Human-AI Collaboration: An Empirical Study with LLM-driven AI Agents in a Real-time Shared Workspace Task [56.92961847155029]
Theory of Mind (ToM) significantly impacts human collaboration and communication as a crucial capability to understand others. Mutual Theory of Mind (MToM) arises when AI agents with ToM capability collaborate with humans. We find that the agent's ToM capability does not significantly impact team performance but enhances human understanding of the agent.
arXiv Detail & Related papers (2024-09-13T13:19:48Z)
Your Co-Workers Matter: Evaluating Collaborative Capabilities of Language Models in Blocks World [13.005764902339523]
We design a blocks-world environment where two agents, each having unique goals and skills, build a target structure together. To complete the goals, they can act in the world and communicate in natural language. We adopt chain-of-thought prompts that include intermediate reasoning steps to model the partner's state and identify and correct execution errors.
arXiv Detail & Related papers (2024-03-30T04:48:38Z)
Embodied LLM Agents Learn to Cooperate in Organized Teams [46.331162216503344]
Large Language Models (LLMs) have emerged as integral tools for reasoning, planning, and decision-making. This paper introduces a framework that imposes prompt-based organization structures on LLM agents to mitigate these problems.
arXiv Detail & Related papers (2024-03-19T06:39:47Z)
Cooperation on the Fly: Exploring Language Agents for Ad Hoc Teamwork in the Avalon Game [25.823665278297057]
This study focuses on the ad hoc teamwork problem where the agent operates in an environment driven by natural language. Our findings reveal the potential of LLM agents in team collaboration, highlighting issues related to hallucinations in communication. To address this issue, we develop CodeAct, a general agent that equips LLM with enhanced memory and code-driven reasoning.
arXiv Detail & Related papers (2023-12-29T08:26:54Z)
Large Language Model Enhanced Multi-Agent Systems for 6G Communications [94.45712802626794]
We propose a multi-agent system with customized communication knowledge and tools for solving communication related tasks using natural language. We validate the effectiveness of the proposed multi-agent system by designing a semantic communication system.
arXiv Detail & Related papers (2023-12-13T02:35:57Z)
MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration [102.41118020705876]
Large Language Models (LLMs) have marked a significant advancement in the field of natural language processing. As their applications extend into multi-agent environments, a need has arisen for a comprehensive evaluation framework. This work introduces a novel benchmarking framework specifically tailored to assess LLMs within multi-agent settings.
arXiv Detail & Related papers (2023-11-14T21:46:27Z)
Cooperation, Competition, and Maliciousness: LLM-Stakeholders Interactive Negotiation [52.930183136111864]
We propose using scorable negotiation to evaluate Large Language Models (LLMs) To reach an agreement, agents must have strong arithmetic, inference, exploration, and planning capabilities. We provide procedures to create new games and increase games' difficulty to have an evolving benchmark.
arXiv Detail & Related papers (2023-09-29T13:33:06Z)
CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society [58.04479313658851]
This paper explores the potential of building scalable techniques to facilitate autonomous cooperation among communicative agents. We propose a novel communicative agent framework named role-playing. Our contributions include introducing a novel communicative agent framework, offering a scalable approach for studying the cooperative behaviors and capabilities of multi-agent systems.
arXiv Detail & Related papers (2023-03-31T01:09:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.