Agent Planning with World Knowledge Model
- URL: http://arxiv.org/abs/2405.14205v2
- Date: Tue, 15 Oct 2024 13:58:17 GMT
- Title: Agent Planning with World Knowledge Model
- Authors: Shuofei Qiao, Runnan Fang, Ningyu Zhang, Yuqi Zhu, Xiang Chen, Shumin Deng, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen
- Abstract summary: We introduce a parametric World Knowledge Model (WKM) to facilitate agent planning.
The WKM provides prior task knowledge to guide global planning and dynamic state knowledge to assist local planning.
Our method achieves superior performance compared to various strong baselines.
- Score: 88.4897773735576
- Abstract: Recent endeavors towards directly using large language models (LLMs) as agent models to execute interactive planning tasks have shown commendable results. Despite their achievements, however, they still struggle with blind trial-and-error in global planning and generating hallucinatory actions in local planning due to their poor understanding of the "real" physical world. Imitating the human mental world knowledge model, which provides global prior knowledge before the task and maintains local dynamic knowledge during the task, we introduce a parametric World Knowledge Model (WKM) to facilitate agent planning. Concretely, we steer the agent model to self-synthesize knowledge from both expert and sampled trajectories. We then develop a WKM that provides prior task knowledge to guide global planning and dynamic state knowledge to assist local planning. Experimental results on three complex real-world simulated datasets with three state-of-the-art open-source LLMs, Mistral-7B, Gemma-7B, and Llama-3-8B, demonstrate that our method achieves superior performance compared to various strong baselines. Further analysis illustrates that our WKM effectively alleviates the blind trial-and-error and hallucinatory action issues, providing strong support for the agent's understanding of the world. Other interesting findings include: 1) our instance-level task knowledge generalizes better to unseen tasks, 2) a weak WKM can guide a strong agent model's planning, and 3) unified WKM training has promising potential for further development. The code is available at https://github.com/zjunlp/WKM.
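The abstract describes a clean division of labor: the WKM supplies global prior task knowledge once per task and local state knowledge at every step, while the agent model selects actions. Below is a minimal sketch of that loop; every class and method name is a hypothetical stand-in rather than the authors' actual API (the real implementation lives in the linked repository):

```python
# A minimal sketch of the global/local knowledge split described in the
# abstract. All names below are hypothetical stand-ins, not the authors'
# actual API; see https://github.com/zjunlp/WKM for the real implementation.

class WorldKnowledgeModel:
    """Stand-in for the parametric WKM trained on self-synthesized knowledge."""

    def task_knowledge(self, task: str) -> str:
        # Global prior knowledge, generated once before acting (assumed behavior).
        return f"prior knowledge for: {task}"

    def state_knowledge(self, history: list[str]) -> str:
        # Local dynamic knowledge, refreshed as the trajectory unfolds (assumed).
        return f"state summary after {len(history)} steps"

class AgentModel:
    """Stand-in for the LLM agent (e.g., Mistral-7B, Gemma-7B, Llama-3-8B)."""

    def next_action(self, task: str, prior: str, state: str,
                    history: list[str]) -> str:
        # A real agent would condition an LLM on all of these inputs.
        return "finish" if history else "explore"

def plan(task: str, agent: AgentModel, wkm: WorldKnowledgeModel,
         max_steps: int = 5) -> list[str]:
    prior = wkm.task_knowledge(task)           # guides global planning
    history: list[str] = []
    for _ in range(max_steps):
        state = wkm.state_knowledge(history)   # assists local planning
        action = agent.next_action(task, prior, state, history)
        history.append(action)
        if action == "finish":                 # stand-in termination signal
            break
    return history

if __name__ == "__main__":
    print(plan("put a clean mug on the desk", AgentModel(), WorldKnowledgeModel()))
```

The global/local split in the abstract maps onto the two call sites: task_knowledge is queried once per task, while state_knowledge is queried at every step.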
Related papers
- AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation [89.68433168477227]
Large Language Model (LLM) based agents have garnered significant attention and are becoming increasingly popular.
This paper investigates enhancing the planning abilities of LLMs through instruction tuning.
To address the limited diversity of existing training environments and tasks, it explores the automated synthesis of diverse environments and a gradual range of planning tasks; a toy version of this generation loop is sketched below.
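As a rough illustration of what "a gradual range of planning tasks" could mean in code, here is a toy curriculum generator; the task schema and difficulty measure are invented for this sketch and are not AgentGen's actual pipeline:

```python
# A toy illustration of synthesizing planning tasks over a gradual difficulty
# range, as the summary above describes. The task schema and difficulty
# measure are invented for illustration; this is not AgentGen's pipeline.

def synthesize_task(difficulty: int) -> dict:
    """Create a toy planning task whose state space grows with difficulty."""
    objects = [f"obj{i}" for i in range(difficulty + 2)]
    return {
        "difficulty": difficulty,
        "objects": objects,
        "goal": f"stack all {len(objects)} objects",  # stand-in goal spec
    }

def build_curriculum(levels: int = 5, tasks_per_level: int = 2) -> list[dict]:
    """An easy-to-hard sequence of generated tasks."""
    return [synthesize_task(d) for d in range(levels) for _ in range(tasks_per_level)]

for task in build_curriculum(levels=2):
    print(task)
```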
arXiv Detail & Related papers (2024-08-01T17:59:46Z) - WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks [85.95607119635102]
Large language models (LLMs) can mimic human-like intelligence.
WorkArena++ is designed to evaluate the planning, problem-solving, logical/arithmetic reasoning, retrieval, and contextual understanding abilities of web agents.
arXiv Detail & Related papers (2024-07-07T07:15:49Z) - KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents [54.09074527006576]
Large Language Models (LLMs) have demonstrated great potential in complex reasoning tasks, yet they fall short when tackling more sophisticated challenges.
This inadequacy primarily stems from the lack of built-in action knowledge in language agents.
We introduce KnowAgent, a novel approach designed to enhance the planning capabilities of LLMs by incorporating explicit action knowledge.
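Read as a sketch, "explicit action knowledge" can be as simple as a table of which actions may legally follow which. The dictionary below is an illustrative assumption, not KnowAgent's actual knowledge format:

```python
# A minimal sketch of constraining an agent's proposed actions with an
# explicit action-knowledge base, the idea summarized above. The structures
# here are illustrative assumptions, not KnowAgent's actual design.

ACTION_KNOWLEDGE = {
    # action name -> actions that may legally follow it (hypothetical example)
    "goto":  {"pick", "open", "goto"},
    "pick":  {"goto", "place"},
    "open":  {"pick", "goto"},
    "place": {"goto"},
}

def admissible(prev_action: str | None, proposed: str) -> bool:
    """Accept a proposed action only if the action knowledge permits it."""
    if prev_action is None:
        return proposed in ACTION_KNOWLEDGE
    return proposed in ACTION_KNOWLEDGE.get(prev_action, set())

assert admissible(None, "goto")
assert admissible("goto", "pick")
assert not admissible("pick", "open")  # ruled out by the knowledge base
```

Filtering hallucinated actions against such a table is one straightforward way to ground an LLM planner's local decisions.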
arXiv Detail & Related papers (2024-03-05T16:39:12Z) - Language Models Meet World Models: Embodied Experiences Enhance Language Models [48.70726641605047]
Large language models (LMs) often struggle with simple reasoning and planning in physical environments.
We propose a new paradigm of enhancing LMs by finetuning them with world models.
arXiv Detail & Related papers (2023-05-18T00:35:38Z) - On the Planning Abilities of Large Language Models (A Critical Investigation with a Proposed Benchmark) [30.223130782579336]
We develop a benchmark suite based on the kinds of domains employed in the International Planning Competition.
We evaluate LLMs in three modes: autonomous, heuristic, and human-in-the-loop.
Our results show that LLMs' ability to autonomously generate executable plans is quite meager, averaging only about a 3% success rate.
arXiv Detail & Related papers (2023-02-13T21:37:41Z) - Human-Timescale Adaptation in an Open-Ended Task Space [56.55530165036327]
We show that training an RL agent at scale leads to a general in-context learning algorithm that can adapt to open-ended novel embodied 3D problems as quickly as humans.
Our results lay the foundation for increasingly general and adaptive RL agents that perform well across ever-larger open-ended domains.
arXiv Detail & Related papers (2023-01-18T15:39:21Z) - Knowledge Prompts: Injecting World Knowledge into Language Models through Soft Prompts [8.425194277824996]
We introduce a method to train soft prompts via self-supervised learning on data from knowledge bases.
The resulting soft knowledge prompts (KPs) are task independent and work as an external memory of the LMs.
arXiv Detail & Related papers (2022-10-10T14:31:16Z)
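As a rough illustration of the mechanism this last summary describes, the PyTorch snippet below prepends trainable "knowledge" vectors to a batch of token embeddings, while the LM's own weights would stay frozen; the shapes and names are assumptions, not the paper's configuration:

```python
# A minimal sketch of soft knowledge prompts: learned embedding vectors
# prepended to the token embeddings of a frozen LM. Shapes and names are
# illustrative assumptions, not the paper's exact setup.

import torch
import torch.nn as nn

class SoftKnowledgePrompt(nn.Module):
    def __init__(self, num_prompt_tokens: int = 10, hidden_dim: int = 768):
        super().__init__()
        # Trainable "knowledge" vectors; only these would be updated in training.
        self.prompt = nn.Parameter(torch.randn(num_prompt_tokens, hidden_dim) * 0.02)

    def forward(self, token_embeddings: torch.Tensor) -> torch.Tensor:
        # token_embeddings: (batch, seq_len, hidden_dim)
        batch = token_embeddings.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompt, token_embeddings], dim=1)

kp = SoftKnowledgePrompt()
x = torch.randn(2, 5, 768)   # stand-in for embedded input tokens
print(kp(x).shape)           # torch.Size([2, 15, 768])
```

Because the prompt vectors are task-independent parameters rather than discrete text, they can act as the external memory the summary mentions.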
This list is automatically generated from the titles and abstracts of the papers on this site.