LLM+P: Empowering Large Language Models with Optimal Planning
Proficiency
- URL: http://arxiv.org/abs/2304.11477v3
- Date: Wed, 27 Sep 2023 07:29:44 GMT
- Title: LLM+P: Empowering Large Language Models with Optimal Planning
Proficiency
- Authors: Bo Liu and Yuqian Jiang and Xiaohan Zhang and Qiang Liu and Shiqi
Zhang and Joydeep Biswas and Peter Stone
- Abstract summary: Large language models (LLMs) have demonstrated remarkable zero-shot generalization abilities.
By contrast, classical planners, once a problem is given in a formatted way, can use efficient search algorithms to quickly identify correct, or even optimal, plans.
This paper introduces LLM+P, the first framework that incorporates the strengths of classical planners into LLMs.
- Score: 46.20085545432116
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have demonstrated remarkable zero-shot
generalization abilities: state-of-the-art chatbots can provide plausible
answers to many common questions that arise in daily life. However, so far,
LLMs cannot reliably solve long-horizon planning problems. By contrast,
classical planners, once a problem is given in a formatted way, can use
efficient search algorithms to quickly identify correct, or even optimal,
plans. In an effort to get the best of both worlds, this paper introduces
LLM+P, the first framework that incorporates the strengths of classical
planners into LLMs. LLM+P takes in a natural language description of a planning
problem, then returns a correct (or optimal) plan for solving that problem in
natural language. LLM+P does so by first converting the language description
into a file written in the planning domain definition language (PDDL), then
leveraging classical planners to quickly find a solution, and then translating
the found solution back into natural language. Along with LLM+P, we define a
diverse set of different benchmark problems taken from common planning
scenarios. Via a comprehensive set of experiments on these benchmark problems,
we find that LLM+P is able to provide optimal solutions for most problems,
while LLMs fail to provide even feasible plans for most problems. The code and
results are publicly available at https://github.com/Cranial-XIX/llm-pddl.git.
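As a rough illustration of the pipeline the abstract describes, the sketch below wires the three stages together: an LLM converts the natural language description into a PDDL problem file, a classical planner searches for a plan, and the LLM translates the plan back into natural language. This is a minimal sketch under stated assumptions, not the released implementation: the llm() helper is a placeholder for any chat-completion API, and the planner command assumes a local Fast Downward checkout.

```python
import subprocess
from pathlib import Path

def llm(prompt: str) -> str:
    """Placeholder for a call to any chat/completion API (assumption, not the paper's code)."""
    raise NotImplementedError

def natural_language_to_pddl(problem_text: str, domain_pddl: str, example_problem_pddl: str) -> str:
    # Condition the LLM on the domain file and one example problem file so it
    # emits a syntactically valid PDDL problem for the new description.
    prompt = (
        "Domain (PDDL):\n" + domain_pddl +
        "\n\nExample problem (PDDL):\n" + example_problem_pddl +
        "\n\nDescription of a new problem:\n" + problem_text +
        "\n\nWrite the PDDL problem file for the new problem:"
    )
    return llm(prompt)

def solve_with_classical_planner(domain_file: str, problem_file: str) -> str:
    # Any PDDL planner works here; this command line assumes a local
    # Fast Downward checkout and is illustrative only.
    subprocess.run(
        ["./fast-downward.py", domain_file, problem_file,
         "--search", "astar(lmcut())"],
        check=True,
    )
    return Path("sas_plan").read_text()  # Fast Downward writes the plan here by default

def plan_to_natural_language(plan: str) -> str:
    return llm("Rewrite this PDDL plan as step-by-step natural language instructions:\n" + plan)
```

The division of labour is the point: the LLM is trusted only with translation, while correctness (and optimality) of the plan comes from the planner's search.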
Related papers
- Stacking Small Language Models for Generalizability [0.0]
Large language models (LLMs) generalize well, achieving strong performance across different natural language benchmarks.
This paper introduces a new approach, fine-tuning stacks of language models (FSLM), which stacks several small language models (SLMs).
By fine-tuning each SLM to perform a specific task, this approach breaks down high-level reasoning into multiple lower-level steps that specific SLMs are responsible for.
As a result, FSLM allows for lower training and inference costs, and also improves model interpretability as each SLM communicates with the subsequent one through natural language.
arXiv Detail & Related papers (2024-10-21T01:27:29Z)
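To make the stacking idea above concrete, here is a hedged sketch of a stack of small language models that communicate through natural language; the specific sub-tasks and the slm() helper are invented for illustration and are not taken from the FSLM paper.

```python
def slm(model_name: str, prompt: str) -> str:
    """Placeholder for a call to one fine-tuned small language model (assumption)."""
    raise NotImplementedError

def fslm_answer(user_query: str) -> str:
    # Each small model is fine-tuned for one lower-level step; its natural
    # language output becomes the next model's input, which also keeps the
    # intermediate reasoning inspectable.
    subtasks = slm("task-decomposer", user_query)
    evidence = slm("evidence-gatherer", subtasks)
    draft = slm("draft-writer", subtasks + "\n" + evidence)
    return slm("answer-checker", draft)
```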
- Exploring and Benchmarking the Planning Capabilities of Large Language Models [57.23454975238014]
This work lays the foundations for improving the planning capabilities of large language models (LLMs).
We construct a comprehensive benchmark suite encompassing both classical planning benchmarks and natural language scenarios.
We investigate the use of many-shot in-context learning to enhance LLM planning, exploring the relationship between increased context length and improved planning performance.
arXiv Detail & Related papers (2024-06-18T22:57:06Z)
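A small sketch of the many-shot in-context-learning setup mentioned in the entry above: the prompt simply concatenates many solved planning examples before the new problem, and the number of shots is the knob that trades context length against planning performance. The (problem, plan) example format is an assumption for illustration.

```python
def build_many_shot_prompt(solved_examples, new_problem, k):
    """Concatenate k solved (problem, plan) pairs before the new problem."""
    shots = [f"Problem:\n{p}\nPlan:\n{plan}" for p, plan in solved_examples[:k]]
    shots.append(f"Problem:\n{new_problem}\nPlan:")
    return "\n\n".join(shots)

# Sweeping k (and therefore the prompt length) is one way to study how
# more in-context examples affect planning accuracy.
```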
- Large Language Models: A Survey [69.72787936480394]
Large Language Models (LLMs) have drawn a lot of attention due to their strong performance on a wide range of natural language tasks.
LLMs acquire their general-purpose language understanding and generation abilities by training billions of model parameters on massive amounts of text data.
arXiv Detail & Related papers (2024-02-09T05:37:09Z)
- Turbulence: Systematically and Automatically Testing Instruction-Tuned Large Language Models for Code [12.58098809948832]
We present a method for evaluating the correctness and robustness of instruction-tuned large language models (LLMs) for code generation via a new benchmark, Turbulence.
Turbulence consists of a large set of natural language question templates, each of which is a programming problem, parameterised so that it can be asked in many different forms.
From a single question template, it is possible to ask an LLM a neighbourhood of very similar programming questions, and assess the correctness of the result returned for each question.
arXiv Detail & Related papers (2023-12-22T17:29:08Z)
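The question-template / neighbourhood mechanism described above can be sketched in a few lines: one parameterised template is instantiated many times to produce a neighbourhood of very similar programming questions, and a model is judged on whether it answers all of them correctly. The template text and parameters below are invented for illustration and are not taken from the Turbulence benchmark.

```python
from itertools import product
from string import Template

# A hypothetical question template: the same programming task asked with
# different parameter values.
template = Template(
    "Write a Python function that returns the $stat of the $k largest "
    "elements in a list of integers."
)

# Instantiating every combination of parameters yields a neighbourhood of
# near-identical questions; a model that truly understands the task should
# answer all of them correctly.
neighbourhood = [
    template.substitute(stat=stat, k=k)
    for stat, k in product(["sum", "product", "mean"], [2, 3, 5])
]

for question in neighbourhood:
    print(question)
```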
- Understanding the Capabilities of Large Language Models for Automated Planning [24.37599752610625]
The study seeks to shed light on the capabilities of LLMs in solving complex planning problems.
It provides insights into the most effective approaches for using LLMs in this context.
arXiv Detail & Related papers (2023-05-25T15:21:09Z)
- LLM-Pruner: On the Structural Pruning of Large Language Models [65.02607075556742]
Large language models (LLMs) have shown remarkable capabilities in language understanding and generation.
We tackle the compression of LLMs within the bound of two constraints: being task-agnostic and minimizing the reliance on the original training dataset.
Our method, named LLM-Pruner, adopts structural pruning that selectively removes non-critical coupled structures.
arXiv Detail & Related papers (2023-05-19T12:10:53Z)
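As a toy illustration of pruning coupled structures, the NumPy sketch below scores each hidden unit of a small two-layer block and removes the least important units together with all the weights that depend on them. The magnitude-based importance score is a stand-in assumption; it is not LLM-Pruner's actual criterion.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny two-layer block: hidden unit i owns row i of W1 and column i of W2,
# so the two must be pruned together (a "coupled structure").
W1 = rng.standard_normal((8, 16))   # hidden x input
W2 = rng.standard_normal((16, 8))   # output x hidden

# Score each hidden unit by the combined magnitude of its coupled weights
# (a simple stand-in for the importance estimate used in the paper).
importance = np.linalg.norm(W1, axis=1) + np.linalg.norm(W2, axis=0)

# Keep the most important units, dropping the rest as whole groups.
keep_ratio = 0.75
keep = np.sort(np.argsort(importance)[-int(len(importance) * keep_ratio):])

W1_pruned = W1[keep, :]
W2_pruned = W2[:, keep]
print(W1.shape, "->", W1_pruned.shape)
print(W2.shape, "->", W2_pruned.shape)
```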
Large Language Models (LLMs) have shown remarkable performance in various basic natural language tasks.
To complete complex tasks, we still need a task plan to guide LLMs in generating the specific solutions step by step.
We propose the Learning to Plan method, which involves two phases: (1) in the first, task-plan-learning phase, it iteratively updates the task plan with new step-by-step solutions and behavioral instructions, obtained by prompting LLMs to derive them from training error feedback.
arXiv Detail & Related papers (2023-04-20T17:09:12Z)
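The iterative update described above reads naturally as a refinement loop: evaluate the current task plan on training examples, collect the error feedback, and prompt the LLM to revise the plan. The llm() and evaluate() helpers below are placeholders, and the loop is a sketch of the general idea rather than the paper's exact procedure.

```python
def llm(prompt: str) -> str:
    """Placeholder for any LLM completion call (assumption)."""
    raise NotImplementedError

def evaluate(task_plan: str, training_examples):
    """Placeholder: apply the plan to each example and return failure feedback."""
    raise NotImplementedError

def learn_task_plan(initial_plan: str, training_examples, rounds: int = 3) -> str:
    plan = initial_plan
    for _ in range(rounds):
        errors = evaluate(plan, training_examples)
        if not errors:
            break
        # Ask the model to revise the plan given the observed failures,
        # producing new step-by-step instructions for the next round.
        plan = llm(
            "Current task plan:\n" + plan +
            "\n\nErrors observed on training examples:\n" + "\n".join(errors) +
            "\n\nRevise the task plan so these errors are avoided:"
        )
    return plan
```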
- Translating Natural Language to Planning Goals with Large-Language Models [19.738395237639136]
Recent large language models (LLMs) have demonstrated remarkable performance on a variety of natural language processing (NLP) tasks.
Our central question is whether LLMs are able to translate goals specified in natural language to a structured planning language.
Our empirical results on GPT-3.5 variants show that LLMs are much better suited to translation than to planning.
arXiv Detail & Related papers (2023-02-10T09:17:52Z)
- PAL: Program-aided Language Models [112.94785609781503]
We present Program-Aided Language models (PAL), which use an LLM to read natural language problems and generate programs as intermediate reasoning steps.
PAL offloads the solution step to a programmatic runtime such as a Python interpreter.
We set new state-of-the-art results in all 12 benchmarks.
arXiv Detail & Related papers (2022-11-18T18:56:13Z)
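A minimal sketch of the program-aided pattern described in the PAL entry above: the LLM writes a Python program for the reasoning steps, and the Python interpreter, not the model, computes the final answer. The llm() helper and the exec-based runner are illustrative assumptions, not PAL's released code.

```python
def llm(prompt: str) -> str:
    """Placeholder for any code-capable LLM (assumption)."""
    raise NotImplementedError

def program_aided_answer(question: str):
    # The model is asked only to write a program whose final line stores
    # the answer in a variable named `answer`.
    program = llm(
        "Write Python code that solves the following problem and stores the "
        "final result in a variable called `answer`.\n\n" + question
    )
    # The solution step is offloaded to the Python interpreter.
    # (In practice the generated code should run in a sandbox.)
    namespace: dict = {}
    exec(program, namespace)
    return namespace["answer"]
```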