Related papers: PlaSma: Making Small Language Models Better Procedural Knowledge Models for (Counterfactual) Planning

URL: http://arxiv.org/abs/2305.19472v3
Date: Wed, 18 Sep 2024 15:30:33 GMT
Title: PlaSma: Making Small Language Models Better Procedural Knowledge Models for (Counterfactual) Planning
Authors: Faeze Brahman, Chandra Bhagavatula, Valentina Pyatkin, Jena D. Hwang, Xiang Lorraine Li, Hirona J. Arai, Soumya Sanyal, Keisuke Sakaguchi, Xiang Ren, Yejin Choi
Abstract summary: PlaSma is a novel two-pronged approach to endow small language models with procedural knowledge and (constrained) language planning capabilities. We develop symbolic procedural knowledge distillation to enhance the commonsense knowledge in small language models and an inference-time algorithm to facilitate more structured and accurate reasoning.
Score: 77.03847056008598
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Procedural planning, which entails decomposing a high-level goal into a sequence of temporally ordered steps, is an important yet intricate task for machines. It involves integrating common-sense knowledge to reason about complex and often contextualized situations, e.g. "scheduling a doctor's appointment without a phone". While current approaches show encouraging results using large language models (LLMs), they are hindered by drawbacks such as costly API calls and reproducibility issues. In this paper, we advocate planning using smaller language models. We present PlaSma, a novel two-pronged approach to endow small language models with procedural knowledge and (constrained) language planning capabilities. More concretely, we develop symbolic procedural knowledge distillation to enhance the commonsense knowledge in small language models and an inference-time algorithm to facilitate more structured and accurate reasoning. In addition, we introduce a new related task, Replanning, that requires a revision of a plan to cope with a constrained situation. In both the planning and replanning settings, we show that orders-of-magnitude smaller models (770M-11B parameters) can compete with and often surpass their larger teacher models' capabilities. Finally, we showcase a successful application of PlaSma in an embodied environment, VirtualHome.

Related papers

Can LLM-Reasoning Models Replace Classical Planning? A Benchmark Study [0.0]
Large Language Models have sparked interest in their potential for robotic task planning. While these models demonstrate strong generative capabilities, their effectiveness in producing structured and executable plans remains uncertain. This paper presents a systematic evaluation of a broad spectrum of current state-of-the-art language models.
arXiv Detail & Related papers (2025-07-31T14:25:54Z)
Self-Steering Language Models [113.96916935955842]
DisCIPL is a method for "self-steering" language models. DisCIPL uses a Planner model to generate a task-specific inference program. Our work opens up a design space of highly-parallelized Monte Carlo inference strategies.
arXiv Detail & Related papers (2025-04-09T17:54:22Z)
Multi-Modal Grounded Planning and Efficient Replanning For Learning Embodied Agents with A Few Examples [17.372378259072992]
We propose FLARE (Few-shot Language with environmental Adaptive Replanning Embodied agent) to generate plans grounded in the environment. We additionally propose to correct the mistakes using visual cues from the agent. The proposed scheme allows us to use a few language pairs thanks to the visual cues and outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2024-12-23T05:20:01Z)
Scalable Language Model with Generalized Continual Learning [58.700439919096155]
The Joint Adaptive Re-Parameterization (JARe) is integrated with Dynamic Task-related Knowledge Retrieval (DTKR) to enable adaptive adjustment of language models based on specific downstream tasks. Our method demonstrates state-of-the-art performance on diverse backbones and benchmarks, achieving effective continual learning in both full-set and few-shot scenarios with minimal forgetting.
arXiv Detail & Related papers (2024-04-11T04:22:15Z)
Learning to Plan for Language Modeling from Unlabeled Data [23.042650737356496]
We train a module for planning the future writing process via a self-supervised learning objective. Given the textual context, this planning module learns to predict future abstract writing actions, which correspond to centroids in a clustered text embedding space.
arXiv Detail & Related papers (2024-03-31T09:04:01Z)
PARADISE: Evaluating Implicit Planning Skills of Language Models with Procedural Warnings and Tips Dataset [0.0]
We present PARADISE, an abductive reasoning task using Q&A format on practical procedural text sourced from wikiHow. It involves warning and tip inference tasks directly associated with goals, excluding intermediary steps, with the aim of testing the ability of the models to infer implicit knowledge of the plan solely from the given goal. Our experiments, utilizing fine-tuned language models and zero-shot prompting, reveal the effectiveness of task-specific small models over large language models in most scenarios.
arXiv Detail & Related papers (2024-03-05T18:01:59Z)
Interactive Task Planning with Language Models [97.86399877812923]
An interactive robot framework accomplishes long-horizon task planning and can easily generalize to new goals or distinct tasks, even during execution. Recent large language model based approaches can allow for more open-ended planning but often require heavy prompt engineering or domain-specific pretrained models. We propose a simple framework that achieves interactive task planning with language models.
arXiv Detail & Related papers (2023-10-16T17:59:12Z)
Statler: State-Maintaining Language Models for Embodied Reasoning [19.884696137429813]
We propose Statler, a framework in which large language models are prompted to maintain an estimate of the world state. Our framework then conditions each action on the estimate of the current world state. It significantly outperforms strong competing methods on several robot planning tasks.
arXiv Detail & Related papers (2023-06-30T17:58:02Z)
Improving Long-Horizon Imitation Through Instruction Prediction [93.47416552953075]
In this work, we explore the use of an often unused source of auxiliary supervision: language. Inspired by recent advances in transformer-based models, we train agents with an instruction prediction loss that encourages learning temporally extended representations that operate at a high level of abstraction. In further analysis we find that instruction modeling is most important for tasks that require complex reasoning, while understandably offering smaller gains in environments that require simple plans.
arXiv Detail & Related papers (2023-06-21T20:47:23Z)
Neuro-Symbolic Causal Language Planning with Commonsense Prompting [67.06667162430118]
Language planning aims to implement complex high-level goals by decomposition into simpler low-level steps. Previous methods require either manual exemplars or annotated programs to acquire such ability from large language models. This paper proposes Neuro-Symbolic Causal Language Planner (CLAP) that elicits procedural knowledge from the LLMs with commonsense-infused prompting.
arXiv Detail & Related papers (2022-06-06T22:09:52Z)
Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm [0.0]
We discuss methods of prompt programming, emphasizing the usefulness of considering prompts through the lens of natural language. We introduce the idea of a metaprompt that seeds the model to generate its own natural language prompts for a range of tasks.
arXiv Detail & Related papers (2021-02-15T05:27:55Z)
STRIPS Action Discovery [67.73368413278631]
Recent approaches have shown the success of classical planning at synthesizing action models even when all intermediate states are missing. We propose a new algorithm to unsupervisedly synthesize STRIPS action models with a classical planner when action signatures are unknown.
arXiv Detail & Related papers (2020-01-30T17:08:39Z)
