Revealing the Barriers of Language Agents in Planning
- URL: http://arxiv.org/abs/2410.12409v1
- Date: Wed, 16 Oct 2024 09:44:38 GMT
- Title: Revealing the Barriers of Language Agents in Planning
- Authors: Jian Xie, Kexun Zhang, Jiangjie Chen, Siyu Yuan, Kai Zhang, Yikai Zhang, Lei Li, Yanghua Xiao
- Abstract summary: We show that current language agents still lack human-level planning abilities.
Even the state-of-the-art reasoning model, OpenAI o1, achieves only 15.6% on one of the complex real-world planning benchmarks.
We identify two key factors that hinder agent planning: the limited role of constraints and the diminishing influence of questions.
- Score: 44.913745512049246
- Abstract: Autonomous planning has been an ongoing pursuit since the inception of artificial intelligence. Based on curated problem solvers, early planning agents could deliver precise solutions for specific tasks but lacked generalization. The emergence of large language models (LLMs) and their powerful reasoning capabilities has reignited interest in autonomous planning by automatically generating reasonable solutions for given tasks. However, prior research and our experiments show that current language agents still lack human-level planning abilities. Even the state-of-the-art reasoning model, OpenAI o1, achieves only 15.6% on one of the complex real-world planning benchmarks. This highlights a critical question: What hinders language agents from achieving human-level planning? Although existing studies have highlighted weak performance in agent planning, the deeper underlying issues, and the mechanisms and limitations of the strategies proposed to address them, remain insufficiently understood. In this work, we apply a feature attribution study and identify two key factors that hinder agent planning: the limited role of constraints and the diminishing influence of questions. We also find that although current strategies help mitigate these challenges, they do not fully resolve them, indicating that agents still have a long way to go before reaching human-level intelligence.
Related papers
- LASP: Surveying the State-of-the-Art in Large Language Model-Assisted AI Planning [7.36760703426119]
This survey aims to highlight the existing challenges in planning with language models.
It focuses on key areas such as embodied environments, optimal scheduling, competitive and cooperative games, task decomposition, reasoning, and planning.
arXiv Detail & Related papers (2024-09-03T11:39:52Z)
- WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks [85.95607119635102]
Large language models (LLMs) can mimic human-like intelligence.
WorkArena++ is designed to evaluate the planning, problem-solving, logical/arithmetic reasoning, retrieval, and contextual understanding abilities of web agents.
arXiv Detail & Related papers (2024-07-07T07:15:49Z)
- Ask-before-Plan: Proactive Language Agents for Real-World Planning [68.08024918064503]
Proactive Agent Planning requires language agents to predict clarification needs based on user-agent conversation and agent-environment interaction.
We propose a novel multi-agent framework, Clarification-Execution-Planning (CEP), which consists of three agents specialized in clarification, execution, and planning.
arXiv Detail & Related papers (2024-06-18T14:07:28Z)
- KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents [54.09074527006576]
Large Language Models (LLMs) have demonstrated great potential in complex reasoning tasks, yet they fall short when tackling more sophisticated challenges.
This inadequacy primarily stems from the lack of built-in action knowledge in language agents.
We introduce KnowAgent, a novel approach designed to enhance the planning capabilities of LLMs by incorporating explicit action knowledge.
arXiv Detail & Related papers (2024-03-05T16:39:12Z)
- TravelPlanner: A Benchmark for Real-World Planning with Language Agents [63.199454024966506]
We propose TravelPlanner, a new planning benchmark that focuses on travel planning, a common real-world planning scenario.
It provides a rich sandbox environment, various tools for accessing nearly four million data records, and 1,225 meticulously curated planning intents and reference plans.
Comprehensive evaluations show that current language agents are not yet capable of handling such complex planning tasks: even GPT-4 achieves a success rate of only 0.6%.
arXiv Detail & Related papers (2024-02-02T18:39:51Z)
- Efficient Multi-agent Epistemic Planning: Teaching Planners About Nested Belief [27.524600740450126]
We plan from the perspective of a single agent with the potential for goals and actions that involve nested beliefs, non-homogeneous agents, co-present observations, and the ability for one agent to reason as if it were another.
Our approach represents an important step towards applying the well-established field of automated planning to the challenging task of planning involving nested beliefs of multiple agents.
arXiv Detail & Related papers (2021-10-06T03:24:01Z)
- Comprehensive Multi-Agent Epistemic Planning [0.0]
This manuscript is focused on a specialized kind of planning known as Multi-agent Epistemic Planning (MEP).
EP refers to an automated planning setting where the agent reasons in the space of knowledge/belief states and tries to find a plan to reach a desirable state from a starting one.
Its general form, the MEP problem, involves multiple agents who need to reason about both the state of the world and the information flows between agents.
arXiv Detail & Related papers (2021-09-17T01:50:18Z)
- Collaborative Human-Agent Planning for Resilience [5.2123460114614435]
We investigate whether people can collaborate with agents by providing their knowledge to an agent using linear temporal logic (LTL) at run-time.
We present 24 participants with baseline plans for situations in which a planner had limitations, and asked the participants for workarounds for these limitations.
Results show that participants' constraints improved the expected return of the plans by 10%.
arXiv Detail & Related papers (2021-04-29T03:21:31Z)
- Modelling Multi-Agent Epistemic Planning in ASP [66.76082318001976]
This paper presents an implementation of a multi-shot Answer Set Programming-based planner that can reason in multi-agent epistemic settings.
The paper shows how the planner, exploiting an ad-hoc epistemic state representation and the efficiency of ASP solvers, has competitive performance results on benchmarks collected from the literature.
arXiv Detail & Related papers (2020-08-07T06:35:56Z)