LLM+P: Empowering Large Language Models with Optimal Planning
Proficiency
- URL: http://arxiv.org/abs/2304.11477v3
- Date: Wed, 27 Sep 2023 07:29:44 GMT
- Title: LLM+P: Empowering Large Language Models with Optimal Planning
Proficiency
- Authors: Bo Liu and Yuqian Jiang and Xiaohan Zhang and Qiang Liu and Shiqi
Zhang and Joydeep Biswas and Peter Stone
- Abstract summary: Large language models (LLMs) have demonstrated remarkable zero-shot generalization abilities.
classical planners, once a problem is given in a formatted way, can use efficient search algorithms to quickly identify correct, or even optimal, plans.
This paper introduces LLM+P, the first framework that incorporates the strengths of classical planners into LLMs.
- Score: 46.20085545432116
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have demonstrated remarkable zero-shot
generalization abilities: state-of-the-art chatbots can provide plausible
answers to many common questions that arise in daily life. However, so far,
LLMs cannot reliably solve long-horizon planning problems. By contrast,
classical planners, once a problem is given in a formatted way, can use
efficient search algorithms to quickly identify correct, or even optimal,
plans. In an effort to get the best of both worlds, this paper introduces
LLM+P, the first framework that incorporates the strengths of classical
planners into LLMs. LLM+P takes in a natural language description of a planning
problem, then returns a correct (or optimal) plan for solving that problem in
natural language. LLM+P does so by first converting the language description
into a file written in the planning domain definition language (PDDL), then
leveraging classical planners to quickly find a solution, and then translating
the found solution back into natural language. Along with LLM+P, we define a
diverse set of different benchmark problems taken from common planning
scenarios. Via a comprehensive set of experiments on these benchmark problems,
we find that LLM+P is able to provide optimal solutions for most problems,
while LLMs fail to provide even feasible plans for most problems.\footnote{The
code and results are publicly available at
https://github.com/Cranial-XIX/llm-pddl.git.
Related papers
- ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline [42.61538071832468]
Large language models (LLMs) have shown excellent mastering of human language, but still struggle in real-world applications that require mathematical problem-solving.
We tailor the Self-Critique pipeline, which addresses the challenge in the feedback learning stage of LLM alignment.
arXiv Detail & Related papers (2024-04-03T17:51:18Z) - Large Language Models: A Survey [69.72787936480394]
Large Language Models (LLMs) have drawn a lot of attention due to their strong performance on a wide range of natural language tasks.
LLMs' ability of general-purpose language understanding and generation is acquired by training billions of model's parameters on massive amounts of text data.
arXiv Detail & Related papers (2024-02-09T05:37:09Z) - Turbulence: Systematically and Automatically Testing Instruction-Tuned
Large Language Models for Code [12.58098809948832]
We present a method for evaluating the correctness and robustness of instruction-tuned large language models (LLMs) for code generation via a new benchmark, Turbulence.
Turbulence consists of a large set of natural language $textitquestion templates$, each of which is a programming problem, parameterised so that it can be asked in many different forms.
From a single question template, it is possible to ask an LLM a $textitneighbourhood$ of very similar programming questions, and assess the correctness of the result returned for each question.
arXiv Detail & Related papers (2023-12-22T17:29:08Z) - Understanding the Capabilities of Large Language Models for Automated
Planning [24.37599752610625]
The study seeks to shed light on the capabilities of LLMs in solving complex planning problems.
It provides insights into the most effective approaches for using LLMs in this context.
arXiv Detail & Related papers (2023-05-25T15:21:09Z) - Question Answering as Programming for Solving Time-Sensitive Questions [84.07553016489769]
Question answering plays a pivotal role in human daily life because it involves our acquisition of knowledge about the world.
Recently, Large Language Models (LLMs) have shown remarkable intelligence in question answering.
This can be attributed to the LLMs' inability to perform rigorous reasoning based on surface-level text semantics.
We propose a novel approach where we reframe the $textbfQ$uestion $textbfA$rogrogering task.
arXiv Detail & Related papers (2023-05-23T16:35:16Z) - LLM-Pruner: On the Structural Pruning of Large Language Models [65.02607075556742]
Large language models (LLMs) have shown remarkable capabilities in language understanding and generation.
We tackle the compression of LLMs within the bound of two constraints: being task-agnostic and minimizing the reliance on the original training dataset.
Our method, named LLM-Pruner, adopts structural pruning that selectively removes non-critical coupled structures.
arXiv Detail & Related papers (2023-05-19T12:10:53Z) - Learning to Plan with Natural Language [111.76828049344839]
Large Language Models (LLMs) have shown remarkable performance in various basic natural language tasks.
For completing the complex task, we still need a plan for the task to guide LLMs to generate the specific solutions step by step.
We propose the Learning to Plan method, which involves two phases: (1) In the first learning task plan phase, it iteratively updates the task plan with new step-by-step solutions and behavioral instructions, which are obtained by prompting LLMs to derive from training error feedback.
arXiv Detail & Related papers (2023-04-20T17:09:12Z) - Translating Natural Language to Planning Goals with Large-Language
Models [19.738395237639136]
Recent large language models (LLMs) have demonstrated remarkable performance on a variety of natural language processing (NLP) tasks.
Our central question is whether LLMs are able to translate goals specified in natural language to a structured planning language.
Our empirical results on GPT 3.5 variants show that LLMs are much better suited towards translation rather than planning.
arXiv Detail & Related papers (2023-02-10T09:17:52Z) - PAL: Program-aided Language Models [112.94785609781503]
We present Program-Aided Language models (PaL) to understand natural language problems.
PaL offloads the solution step to a programmatic runtime such as a Python interpreter.
We set new state-of-the-art results in all 12 benchmarks.
arXiv Detail & Related papers (2022-11-18T18:56:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.