LLM+P: Empowering Large Language Models with Optimal Planning
Proficiency
- URL: http://arxiv.org/abs/2304.11477v3
- Date: Wed, 27 Sep 2023 07:29:44 GMT
- Title: LLM+P: Empowering Large Language Models with Optimal Planning
Proficiency
- Authors: Bo Liu and Yuqian Jiang and Xiaohan Zhang and Qiang Liu and Shiqi
Zhang and Joydeep Biswas and Peter Stone
- Abstract summary: Large language models (LLMs) have demonstrated remarkable zero-shot generalization abilities.
By contrast, classical planners, once a problem is given in a formatted way, can use efficient search algorithms to quickly identify correct, or even optimal, plans.
This paper introduces LLM+P, the first framework that incorporates the strengths of classical planners into LLMs.
- Score: 46.20085545432116
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have demonstrated remarkable zero-shot
generalization abilities: state-of-the-art chatbots can provide plausible
answers to many common questions that arise in daily life. However, so far,
LLMs cannot reliably solve long-horizon planning problems. By contrast,
classical planners, once a problem is given in a formatted way, can use
efficient search algorithms to quickly identify correct, or even optimal,
plans. In an effort to get the best of both worlds, this paper introduces
LLM+P, the first framework that incorporates the strengths of classical
planners into LLMs. LLM+P takes in a natural language description of a planning
problem, then returns a correct (or optimal) plan for solving that problem in
natural language. LLM+P does so by first converting the language description
into a file written in the planning domain definition language (PDDL), then
leveraging classical planners to quickly find a solution, and then translating
the found solution back into natural language. Along with LLM+P, we define a
diverse set of different benchmark problems taken from common planning
scenarios. Via a comprehensive set of experiments on these benchmark problems,
we find that LLM+P is able to provide optimal solutions for most problems,
while LLMs fail to provide even feasible plans for most problems. The code and
results are publicly available at https://github.com/Cranial-XIX/llm-pddl.git.
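As a rough illustration of the pipeline the abstract describes, the sketch below wires the three stages together: an LLM converts the natural language description into a PDDL problem file, a classical planner searches for a plan, and the LLM translates the plan back into natural language. This is a minimal sketch under stated assumptions, not the released implementation: the llm() helper is a placeholder for any chat-completion API, and the planner command assumes a local Fast Downward checkout.

```python
import subprocess
from pathlib import Path

def llm(prompt: str) -> str:
    """Placeholder for a call to any chat/completion API (assumption, not the paper's code)."""
    raise NotImplementedError

def natural_language_to_pddl(problem_text: str, domain_pddl: str, example_problem_pddl: str) -> str:
    # Condition the LLM on the domain file and one example problem file so it
    # emits a syntactically valid PDDL problem for the new description.
    prompt = (
        "Domain (PDDL):\n" + domain_pddl +
        "\n\nExample problem (PDDL):\n" + example_problem_pddl +
        "\n\nDescription of a new problem:\n" + problem_text +
        "\n\nWrite the PDDL problem file for the new problem:"
    )
    return llm(prompt)

def solve_with_classical_planner(domain_file: str, problem_file: str) -> str:
    # Any PDDL planner works here; this command line assumes a local
    # Fast Downward checkout and is illustrative only.
    subprocess.run(
        ["./fast-downward.py", domain_file, problem_file,
         "--search", "astar(lmcut())"],
        check=True,
    )
    return Path("sas_plan").read_text()  # Fast Downward writes the plan here by default

def plan_to_natural_language(plan: str) -> str:
    return llm("Rewrite this PDDL plan as step-by-step natural language instructions:\n" + plan)
```

The division of labour is the point: the LLM is trusted only with translation, while correctness (and optimality) of the plan comes from the planner's search.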
Related papers
- Stacking Small Language Models for Generalizability [0.0]
Large language models (LLMs) generalize well, achieving strong performance across different natural language benchmarks.
This paper introduces a new approach, fine-tuning stacks of language models (FSLM), which stacks several small language models (SLMs).
By fine-tuning each SLM to perform a specific task, this approach breaks down high-level reasoning into multiple lower-level steps that specific SLMs are responsible for.
As a result, FSLM allows for lower training and inference costs, and also improves model interpretability as each SLM communicates with the subsequent one through natural language.
arXiv Detail & Related papers (2024-10-21T01:27:29Z)
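To make the stacking idea above concrete, here is a hedged sketch of a stack of small language models that communicate through natural language; the specific sub-tasks and the slm() helper are invented for illustration and are not taken from the FSLM paper.

```python
def slm(model_name: str, prompt: str) -> str:
    """Placeholder for a call to one fine-tuned small language model (assumption)."""
    raise NotImplementedError

def fslm_answer(user_query: str) -> str:
    # Each small model is fine-tuned for one lower-level step; its natural
    # language output becomes the next model's input, which also keeps the
    # intermediate reasoning inspectable.
    subtasks = slm("task-decomposer", user_query)
    evidence = slm("evidence-gatherer", subtasks)
    draft = slm("draft-writer", subtasks + "\n" + evidence)
    return slm("answer-checker", draft)
```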
- Exploring and Benchmarking the Planning Capabilities of Large Language Models [57.23454975238014]
This work lays the foundations for improving the planning capabilities of large language models (LLMs).
We construct a comprehensive benchmark suite encompassing both classical planning benchmarks and natural language scenarios.
We investigate the use of many-shot in-context learning to enhance LLM planning, exploring the relationship between increased context length and improved planning performance.
arXiv Detail & Related papers (2024-06-18T22:57:06Z)
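A small sketch of the many-shot in-context-learning setup mentioned in the entry above: the prompt simply concatenates many solved planning examples before the new problem, and the number of shots is the knob that trades context length against planning performance. The (problem, plan) example format is an assumption for illustration.

```python
def build_many_shot_prompt(solved_examples, new_problem, k):
    """Concatenate k solved (problem, plan) pairs before the new problem."""
    shots = [f"Problem:\n{p}\nPlan:\n{plan}" for p, plan in solved_examples[:k]]
    shots.append(f"Problem:\n{new_problem}\nPlan:")
    return "\n\n".join(shots)

# Sweeping k (and therefore the prompt length) is one way to study how
# more in-context examples affect planning accuracy.
```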
- Large Language Models: A Survey [69.72787936480394]
Large Language Models (LLMs) have drawn a lot of attention due to their strong performance on a wide range of natural language tasks.
LLMs acquire their general-purpose language understanding and generation abilities by training billions of model parameters on massive amounts of text data.
arXiv Detail & Related papers (2024-02-09T05:37:09Z)
- Turbulence: Systematically and Automatically Testing Instruction-Tuned Large Language Models for Code [12.58098809948832]
We present a method for evaluating the correctness and robustness of instruction-tuned large language models (LLMs) for code generation via a new benchmark, Turbulence.
Turbulence consists of a large set of natural language question templates, each of which is a programming problem, parameterised so that it can be asked in many different forms.
From a single question template, it is possible to ask an LLM a neighbourhood of very similar programming questions, and assess the correctness of the result returned for each question.
arXiv Detail & Related papers (2023-12-22T17:29:08Z)
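The question-template / neighbourhood mechanism described above can be sketched in a few lines: one parameterised template is instantiated many times to produce a neighbourhood of very similar programming questions, and a model is judged on whether it answers all of them correctly. The template text and parameters below are invented for illustration and are not taken from the Turbulence benchmark.

```python
from itertools import product
from string import Template

# A hypothetical question template: the same programming task asked with
# different parameter values.
template = Template(
    "Write a Python function that returns the $stat of the $k largest "
    "elements in a list of integers."
)

# Instantiating every combination of parameters yields a neighbourhood of
# near-identical questions; a model that truly understands the task should
# answer all of them correctly.
neighbourhood = [
    template.substitute(stat=stat, k=k)
    for stat, k in product(["sum", "product", "mean"], [2, 3, 5])
]

for question in neighbourhood:
    print(question)
```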
- Understanding the Capabilities of Large Language Models for Automated Planning [24.37599752610625]
The study seeks to shed light on the capabilities of LLMs in solving complex planning problems.
It provides insights into the most effective approaches for using LLMs in this context.
arXiv Detail & Related papers (2023-05-25T15:21:09Z)
- LLM-Pruner: On the Structural Pruning of Large Language Models [65.02607075556742]
Large language models (LLMs) have shown remarkable capabilities in language understanding and generation.
We tackle the compression of LLMs within the bound of two constraints: being task-agnostic and minimizing the reliance on the original training dataset.
Our method, named LLM-Pruner, adopts structural pruning that selectively removes non-critical coupled structures.
arXiv Detail & Related papers (2023-05-19T12:10:53Z)
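As a toy illustration of pruning coupled structures, the NumPy sketch below scores each hidden unit of a small two-layer block and removes the least important units together with all the weights that depend on them. The magnitude-based importance score is a stand-in assumption; it is not LLM-Pruner's actual criterion.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny two-layer block: hidden unit i owns row i of W1 and column i of W2,
# so the two must be pruned together (a "coupled structure").
W1 = rng.standard_normal((8, 16))   # hidden x input
W2 = rng.standard_normal((16, 8))   # output x hidden

# Score each hidden unit by the combined magnitude of its coupled weights
# (a simple stand-in for the importance estimate used in the paper).
importance = np.linalg.norm(W1, axis=1) + np.linalg.norm(W2, axis=0)

# Keep the most important units, dropping the rest as whole groups.
keep_ratio = 0.75
keep = np.sort(np.argsort(importance)[-int(len(importance) * keep_ratio):])

W1_pruned = W1[keep, :]
W2_pruned = W2[:, keep]
print(W1.shape, "->", W1_pruned.shape)
print(W2.shape, "->", W2_pruned.shape)
```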
Large Language Models (LLMs) have shown remarkable performance in various basic natural language tasks.
To complete complex tasks, we still need a task plan to guide LLMs in generating the specific solutions step by step.
We propose the Learning to Plan method, which involves two phases: (1) in the first, task-plan-learning phase, it iteratively updates the task plan with new step-by-step solutions and behavioral instructions, obtained by prompting LLMs to derive them from training error feedback.
arXiv Detail & Related papers (2023-04-20T17:09:12Z)
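The iterative update described above reads naturally as a refinement loop: evaluate the current task plan on training examples, collect the error feedback, and prompt the LLM to revise the plan. The llm() and evaluate() helpers below are placeholders, and the loop is a sketch of the general idea rather than the paper's exact procedure.

```python
def llm(prompt: str) -> str:
    """Placeholder for any LLM completion call (assumption)."""
    raise NotImplementedError

def evaluate(task_plan: str, training_examples):
    """Placeholder: apply the plan to each example and return failure feedback."""
    raise NotImplementedError

def learn_task_plan(initial_plan: str, training_examples, rounds: int = 3) -> str:
    plan = initial_plan
    for _ in range(rounds):
        errors = evaluate(plan, training_examples)
        if not errors:
            break
        # Ask the model to revise the plan given the observed failures,
        # producing new step-by-step instructions for the next round.
        plan = llm(
            "Current task plan:\n" + plan +
            "\n\nErrors observed on training examples:\n" + "\n".join(errors) +
            "\n\nRevise the task plan so these errors are avoided:"
        )
    return plan
```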
- Translating Natural Language to Planning Goals with Large-Language Models [19.738395237639136]
Recent large language models (LLMs) have demonstrated remarkable performance on a variety of natural language processing (NLP) tasks.
Our central question is whether LLMs are able to translate goals specified in natural language to a structured planning language.
Our empirical results on GPT-3.5 variants show that LLMs are much better suited to translation than to planning.
arXiv Detail & Related papers (2023-02-10T09:17:52Z)
- PAL: Program-aided Language Models [112.94785609781503]
We present Program-Aided Language models (PAL), which use an LLM to read natural language problems and generate programs as intermediate reasoning steps.
PAL offloads the solution step to a programmatic runtime such as a Python interpreter.
We set new state-of-the-art results in all 12 benchmarks.
arXiv Detail & Related papers (2022-11-18T18:56:13Z)
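A minimal sketch of the program-aided pattern described in the PAL entry above: the LLM writes a Python program for the reasoning steps, and the Python interpreter, not the model, computes the final answer. The llm() helper and the exec-based runner are illustrative assumptions, not PAL's released code.

```python
def llm(prompt: str) -> str:
    """Placeholder for any code-capable LLM (assumption)."""
    raise NotImplementedError

def program_aided_answer(question: str):
    # The model is asked only to write a program whose final line stores
    # the answer in a variable named `answer`.
    program = llm(
        "Write Python code that solves the following problem and stores the "
        "final result in a variable called `answer`.\n\n" + question
    )
    # The solution step is offloaded to the Python interpreter.
    # (In practice the generated code should run in a sandbox.)
    namespace: dict = {}
    exec(program, namespace)
    return namespace["answer"]
```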