CoAct: A Global-Local Hierarchy for Autonomous Agent Collaboration
- URL: http://arxiv.org/abs/2406.13381v1
- Date: Wed, 19 Jun 2024 09:23:53 GMT
- Title: CoAct: A Global-Local Hierarchy for Autonomous Agent Collaboration
- Authors: Xinming Hou, Mingming Yang, Wenxiang Jiao, Xing Wang, Zhaopeng Tu, Wayne Xin Zhao
- Abstract summary: Existing LLMs exhibit remarkable performance on various NLP tasks, but still struggle with complex real-world tasks.
We propose the CoAct framework, which transfers the hierarchical planning and collaboration patterns in human society to LLM systems.
- Score: 87.51781348070914
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing LLMs exhibit remarkable performance on various NLP tasks, but still struggle with complex real-world tasks, even equipped with advanced strategies like CoT and ReAct. In this work, we propose the CoAct framework, which transfers the hierarchical planning and collaboration patterns in human society to LLM systems. Specifically, our CoAct framework involves two agents: (1) A global planning agent, to comprehend the problem scope, formulate macro-level plans and provide detailed sub-task descriptions to local execution agents, which serves as the initial rendition of a global plan. (2) A local execution agent, to operate within the multi-tier task execution structure, focusing on detailed execution and implementation of specific tasks within the global plan. Experimental results on the WebArena benchmark show that CoAct can re-arrange the process trajectory when facing failures, and achieves superior performance over baseline methods on long-horizon web tasks. Code is available at https://github.com/xmhou2002/CoAct.
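The two-agent split described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (their code is at the repository linked above); it is a minimal toy under stated assumptions: the class names `GlobalPlanner` and `LocalExecutor`, the `run` loop, the `max_replans` limit, and the stub callables standing in for LLM calls are all hypothetical. It only shows the general pattern of a planner producing sub-tasks, an executor attempting them, and the plan being re-arranged when a sub-task fails.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class GlobalPlanner:
    """Hypothetical stand-in for the global planning agent."""
    decompose: Callable[[str], List[str]]    # task -> sub-task descriptions
    revise: Callable[[str, str], List[str]]  # (failed sub-task, feedback) -> replacements

    def plan(self, task: str) -> List[str]:
        return list(self.decompose(task))

    def replan(self, failed: str, feedback: str) -> List[str]:
        return list(self.revise(failed, feedback))


@dataclass
class LocalExecutor:
    """Hypothetical stand-in for the local execution agent."""
    act: Callable[[str], bool]  # sub-task description -> success flag

    def execute(self, sub_task: str) -> Tuple[bool, str]:
        ok = self.act(sub_task)
        feedback = "" if ok else f"could not complete: {sub_task}"
        return ok, feedback


def run(task: str, planner: GlobalPlanner, executor: LocalExecutor,
        max_replans: int = 2) -> Tuple[bool, List[Tuple[str, bool]]]:
    """Execute the plan in order; on failure, ask the planner for replacements."""
    queue = planner.plan(task)
    trace: List[Tuple[str, bool]] = []
    replans = 0
    while queue:
        sub_task = queue.pop(0)
        ok, feedback = executor.execute(sub_task)
        trace.append((sub_task, ok))
        if ok:
            continue
        if replans >= max_replans:
            return False, trace  # give up once the replan budget is spent
        replans += 1
        # Replace the failed sub-task with the revised ones, keep the rest.
        queue = planner.replan(sub_task, feedback) + queue
    return True, trace


# Stub behaviours used only for illustration (no LLM or browser involved).
def demo_decompose(task: str) -> List[str]:
    return ["open shop", "click hidden button", "checkout"]


def demo_revise(failed: str, feedback: str) -> List[str]:
    return ["expand menu", "click button"]


def demo_act(sub_task: str) -> bool:
    return sub_task != "click hidden button"


if __name__ == "__main__":
    planner = GlobalPlanner(decompose=demo_decompose, revise=demo_revise)
    executor = LocalExecutor(act=demo_act)
    print(run("buy item", planner, executor))
```

In this toy run the second sub-task fails once, the planner swaps it for two smaller steps, and the remaining plan continues unchanged. A real system would replace the stub callables with model calls and environment actions.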
Related papers
- Meta-Task Planning for Language Agents [13.550774629515843]
Large language model-based agents (LLM agents) have emerged as a promising paradigm for achieving artificial general intelligence (AGI).
This paper introduces Meta-Task Planning (MTP), a zero-shot methodology for collaborative LLM-based multi-agent systems.
MTP achieved an average $\sim 40\%$ success rate on TravelPlanner, significantly higher than the state-of-the-art (SOTA) baseline.
arXiv Detail & Related papers (2024-05-26T10:33:17Z)
- RL-GPT: Integrating Reinforcement Learning and Code-as-policy [82.1804241891039]
We introduce a two-level hierarchical framework, RL-GPT, comprising a slow agent and a fast agent.
The slow agent analyzes actions suitable for coding, while the fast agent executes coding tasks.
This decomposition effectively focuses each agent on specific tasks, proving highly efficient within our pipeline.
arXiv Detail & Related papers (2024-02-29T16:07:22Z)
- Learning adaptive planning representations with natural language guidance [90.24449752926866]
This paper describes Ada, a framework for automatically constructing task-specific planning representations.
Ada interactively learns a library of planner-compatible high-level action abstractions and low-level controllers adapted to a particular domain of planning tasks.
arXiv Detail & Related papers (2023-12-13T23:35:31Z)
- Agents meet OKR: An Object and Key Results Driven Agent System with Hierarchical Self-Collaboration and Self-Evaluation [25.308341461293857]
OKR-Agent is designed to enhance the capabilities of Large Language Models (LLMs) in task-solving.
Our framework includes two novel modules: hierarchical Objects and Key Results generation and multi-level evaluation.
arXiv Detail & Related papers (2023-11-28T06:16:30Z)
- ADaPT: As-Needed Decomposition and Planning with Language Models [131.063805299796]
We introduce As-Needed Decomposition and Planning for complex Tasks (ADaPT).
ADaPT explicitly plans and decomposes complex sub-tasks as needed, when the large language model is unable to execute them.
Our results demonstrate that ADaPT substantially outperforms established strong baselines.
arXiv Detail & Related papers (2023-11-08T17:59:15Z)
- Embodied Task Planning with Large Language Models [86.63533340293361]
We propose a TAsk Planing Agent (TaPA) in embodied tasks for grounded planning with physical scene constraint.
During inference, we discover the objects in the scene by extending open-vocabulary object detectors to multi-view RGB images collected in different achievable locations.
Experimental results show that the generated plan from our TaPA framework can achieve higher success rate than LLaVA and GPT-3.5 by a sizable margin.
arXiv Detail & Related papers (2023-07-04T17:58:25Z)
- ALMA: Hierarchical Learning for Composite Multi-Agent Tasks [21.556661319375255]
We introduce ALMA, a general learning method for taking advantage of structured tasks.
ALMA simultaneously learns a high-level subtask allocation policy and low-level agent policies.
We demonstrate that ALMA learns sophisticated coordination behavior in a number of challenging environments.
arXiv Detail & Related papers (2022-05-27T19:12:23Z)
- Procedures as Programs: Hierarchical Control of Situated Agents through Natural Language [81.73820295186727]
We propose a formalism of procedures as programs, a powerful yet intuitive method of representing hierarchical procedural knowledge for agent command and control.
We instantiate this framework on the IQA and ALFRED datasets for NL instruction following.
arXiv Detail & Related papers (2021-09-16T20:36:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.