Explicit Planning Helps Language Models in Logical Reasoning
- URL: http://arxiv.org/abs/2303.15714v4
- Date: Tue, 7 Nov 2023 18:12:20 GMT
- Title: Explicit Planning Helps Language Models in Logical Reasoning
- Authors: Hongyu Zhao, Kangrui Wang, Mo Yu, Hongyuan Mei
- Abstract summary: We propose LEAP, a novel system that uses language models to perform multi-step logical reasoning.
Explicit planning enables the system to make more informed reasoning decisions at each step.
Our system significantly outperforms other competing methods on multiple standard datasets.
- Score: 39.27163698914806
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Language models have been shown to perform remarkably well on a wide range of
natural language processing tasks. In this paper, we propose LEAP, a novel
system that uses language models to perform multi-step logical reasoning and
incorporates explicit planning into the inference procedure. Explicit planning
enables the system to make more informed reasoning decisions at each step by
looking ahead at the future effects of each decision. Moreover, we propose a training
strategy that safeguards the planning process from being led astray by spurious
features. Our full system significantly outperforms other competing methods on
multiple standard datasets. When using small T5 models as its core selection
and deduction components, our system performs competitively compared to GPT-3
despite having only about 1B parameters (i.e., 175 times smaller than GPT-3).
When using GPT-3.5, it significantly outperforms chain-of-thought prompting on
the challenging PrOntoQA dataset. We have conducted extensive empirical studies
to demonstrate that explicit planning plays a crucial role in the system's
performance.
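To make the lookahead idea concrete, here is a minimal Python sketch of one-step-ahead planning over deduction steps. The helper callables (candidates, deduce, prove_score) are hypothetical stand-ins for the paper's selection, deduction, and verification components, not its actual API.

```python
# Minimal sketch of explicit planning via lookahead, under assumed helpers:
# `candidates` enumerates premise pairs, `deduce` generates a new fact from
# a pair (e.g., with a T5 model), and `prove_score` rates how well the
# current facts support the hypothesis. None of these names come from the
# paper's released code.

def plan_step(premises, hypothesis, candidates, deduce, prove_score, depth=1):
    """Return the candidate whose deduction looks best `depth` steps ahead."""
    best_pair, best_score = None, float("-inf")
    for pair in candidates(premises):
        new_fact = deduce(pair)
        extended = premises + [new_fact]
        if depth > 1:
            # Score this choice by the best continuation from the new state.
            _, score = plan_step(extended, hypothesis, candidates,
                                 deduce, prove_score, depth - 1)
        else:
            score = prove_score(extended, hypothesis)
        if score > best_score:
            best_pair, best_score = pair, score
    return best_pair, best_score
```

Greedy selection would pick the pair with the highest immediate score; the lookahead instead discounts steps that look attractive now but lead nowhere.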
Related papers
- Heuristic-enhanced Candidates Selection strategy for GPTs tackle Few-Shot Aspect-Based Sentiment Analysis [1.5020330976600738]
The paper designs a Heuristic-enhanced Candidates Selection strategy and further proposes the All in One (AiO) model based on it.
The model works in two stages, simultaneously accommodating the accuracy of PLMs and their capability to generalize.
The experimental results demonstrate that the proposed model can better adapt to multiple sub-tasks, and also outperforms the methods that directly utilize GPTs.
arXiv Detail & Related papers (2024-04-09T07:02:14Z) - PARADISE: Evaluating Implicit Planning Skills of Language Models with Procedural Warnings and Tips Dataset [0.0]
We present PARADISE, an abductive reasoning task in Q&A format over practical procedural text sourced from wikiHow.
It involves warning and tip inference tasks directly associated with goals, excluding intermediary steps; the aim is to test whether models can infer implicit knowledge of a plan solely from the given goal.
Our experiments, utilizing fine-tuned language models and zero-shot prompting, reveal the effectiveness of task-specific small models over large language models in most scenarios.
arXiv Detail & Related papers (2024-03-05T18:01:59Z) - Guiding Language Model Reasoning with Planning Tokens [122.43639723387516]
Large language models (LLMs) have recently attracted considerable interest for their ability to perform complex reasoning tasks.
We propose a hierarchical generation scheme, based on planning tokens, to encourage more structured generation of chain-of-thought steps (a toy sketch of this idea appears after this list).
Our approach requires a negligible increase in trainable parameters (0.001%) and can be applied through either full fine-tuning or a more parameter-efficient scheme.
arXiv Detail & Related papers (2023-10-09T13:29:37Z) - On the Planning, Search, and Memorization Capabilities of Large Language
Models [0.0]
We investigate the potential of the state-of-the-art large language model (GPT-4) for planning tasks.
We identify areas where large language models excel in solving planning problems and reveal the constraints that limit their applicability.
arXiv Detail & Related papers (2023-09-05T00:19:31Z) - PlaSma: Making Small Language Models Better Procedural Knowledge Models for (Counterfactual) Planning [77.03847056008598]
PlaSma is a novel two-pronged approach to endow small language models with procedural knowledge and (constrained) language planning capabilities.
We develop symbolic procedural knowledge distillation to enhance the commonsense knowledge in small language models and an inference-time algorithm to facilitate more structured and accurate reasoning.
arXiv Detail & Related papers (2023-05-31T00:55:40Z) - Large Language Models in the Workplace: A Case Study on Prompt
Engineering for Job Type Classification [58.720142291102135]
This case study investigates the task of job classification in a real-world setting.
The goal is to determine whether an English-language job posting is appropriate for a graduate or entry-level position.
arXiv Detail & Related papers (2023-03-13T14:09:53Z) - Reframing Instructional Prompts to GPTk's Language [72.69833640335519]
We propose reframing techniques for model designers to create effective prompts for language models.
Our results show that reframing improves few-shot learning performance by 14% while reducing sample complexity.
The performance gains are particularly important for large language models such as GPT-3, where tuning models or prompts on large datasets is not feasible.
arXiv Detail & Related papers (2021-09-16T09:44:43Z) - CINS: Comprehensive Instruction for Few-shot Learning in Task-oriented
Dialog Systems [56.302581679816775]
This paper proposes Comprehensive Instruction (CINS) that exploits PLMs with task-specific instructions.
We design a schema (definition, constraint, prompt) of instructions and their customized realizations for three important downstream tasks in ToD.
Experiments are conducted on these ToD tasks in realistic few-shot learning scenarios with small validation data.
arXiv Detail & Related papers (2021-09-10T03:23:06Z) - Making Pre-trained Language Models Better Few-shot Learners [11.90626040104822]
The recent GPT-3 model achieves remarkable few-shot performance solely by leveraging a natural-language prompt and a few task demonstrations as input context.
Inspired by these findings, we study few-shot learning in a more practical scenario, where we use smaller language models for which fine-tuning is computationally efficient.
We present LM-BFF (better few-shot fine-tuning of language models), a suite of simple and complementary techniques for fine-tuning language models on a small number of annotated examples.
arXiv Detail & Related papers (2020-12-31T17:21:26Z)