Related papers: SmallPlan: Leverage Small Language Models for Sequential Path Planning with Simulation-Powered, LLM-Guided Distillation

SmallPlan: Leverage Small Language Models for Sequential Path Planning with Simulation-Powered, LLM-Guided Distillation

URL: http://arxiv.org/abs/2505.00831v4
Date: Sun, 11 May 2025 20:14:14 GMT
Title: SmallPlan: Leverage Small Language Models for Sequential Path Planning with Simulation-Powered, LLM-Guided Distillation
Authors: Quang P. M. Pham, Khoi T. N. Nguyen, Nhi H. Doan, Cuong A. Pham, Kentaro Inui, Dezhen Song,
Abstract summary: SmallPlan is a novel framework leveraging Large Language Models as teacher models to train lightweight Small Language Models (SLMs) for high-level path planning tasks.<n>SLMs are trained in a simulation-powered, interleaved manner with LLM-guided supervised fine-tuning and reinforcement learning.<n>SmallPlan is resource-efficient, making it well-suited for edge-device deployment and advancing practical autonomous robotics.
Score: 20.743117921048537
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Efficient path planning in robotics, particularly within large-scale, dynamic environments, remains a significant hurdle. While Large Language Models (LLMs) offer strong reasoning capabilities, their high computational cost and limited adaptability in dynamic scenarios hinder real-time deployment on edge devices. We present SmallPlan -- a novel framework leveraging LLMs as teacher models to train lightweight Small Language Models (SLMs) for high-level path planning tasks. In SmallPlan, the SLMs provide optimal action sequences to navigate across scene graphs that compactly represent full-scaled 3D scenes. The SLMs are trained in a simulation-powered, interleaved manner with LLM-guided supervised fine-tuning (SFT) and reinforcement learning (RL). This strategy not only enables SLMs to successfully complete navigation tasks but also makes them aware of important factors like travel distance and number of trials. Through experiments, we demonstrate that the fine-tuned SLMs perform competitively with larger models like GPT-4o on sequential path planning, without suffering from hallucination and overfitting. SmallPlan is resource-efficient, making it well-suited for edge-device deployment and advancing practical autonomous robotics. Our source code is available here: https://github.com/quangpham2006/SmallPlan

Related papers

Collision- and Reachability-Aware Multi-Robot Control with Grounded LLM Planners [38.407073503042966]
Large language models (LLMs) have demonstrated strong performance in various robot control tasks.<n>However, their deployment in real-world applications remains constrained.<n>We propose a novel framework that integrates reinforcement learning with verifiable rewards.
arXiv Detail & Related papers (2025-05-26T23:14:16Z)
Learning to Reason and Navigate: Parameter Efficient Action Planning with Large Language Models [63.765846080050906]
This paper proposes a novel parameter-efficient action planner using large language models (PEAP-LLM) to generate a single-step instruction at each location.<n>Experiments show the superiority of our proposed model on REVERIE compared to the previous state-of-the-art.
arXiv Detail & Related papers (2025-05-12T12:38:20Z)
LLM+MAP: Bimanual Robot Task Planning using Large Language Models and Planning Domain Definition Language [17.914580097058106]
Bimanual robotic manipulation presents an inherent challenge due to the complexity involved in the spatial and temporal coordination between two hands.<n>Existing works predominantly focus on attaining human-level manipulation skills for robotic hands, yet little attention has been paid to task planning on long-horizon timescales.<n>This paper introduces LLM+MAP, a bimanual planning framework that integrates LLM reasoning and multi-agent planning.
arXiv Detail & Related papers (2025-03-21T17:04:01Z)
Dynamic Path Navigation for Motion Agents with LLM Reasoning [69.5875073447454]
Large Language Models (LLMs) have demonstrated strong generalizable reasoning and planning capabilities.<n>We explore the zero-shot navigation and path generation capabilities of LLMs by constructing a dataset and proposing an evaluation protocol.<n>We demonstrate that, when tasks are well-structured in this manner, modern LLMs exhibit substantial planning proficiency in avoiding obstacles while autonomously refining navigation with the generated motion to reach the target.
arXiv Detail & Related papers (2025-03-10T13:39:09Z)
Distilling Multi-modal Large Language Models for Autonomous Driving [64.63127269187814]
Recent end-to-end autonomous driving systems leverage large language models (LLMs) as planners to improve generalizability to rare events.<n>We propose DiMA, an end-to-end autonomous driving system that maintains the efficiency of an LLM-free (or vision-based) planner while leveraging the world knowledge of an LLM.<n>Training with DiMA results in a 37% reduction in the L2 trajectory error and an 80% reduction in the collision rate of the vision-based planner, as well as a 44% trajectory error reduction in longtail scenarios.
arXiv Detail & Related papers (2025-01-16T18:59:53Z)
MALMM: Multi-Agent Large Language Models for Zero-Shot Robotics Manipulation [52.739500459903724]
Large Language Models (LLMs) have demonstrated remarkable planning abilities across various domains, including robotics manipulation and navigation. We propose a novel multi-agent LLM framework that distributes high-level planning and low-level control code generation across specialized LLM agents. We evaluate our approach on nine RLBench tasks, including long-horizon tasks, and demonstrate its ability to solve robotics manipulation in a zero-shot setting.
arXiv Detail & Related papers (2024-11-26T17:53:44Z)
Embodied AI in Mobile Robots: Coverage Path Planning with Large Language Models [6.860460230412773]
We propose an LLM-embodied path planning framework for mobile agents. Our proposed multi-layer architecture uses prompted LLMs in the path planning phase and integrates them with the mobile agents' low-level actuators. Our experiments show that this framework can improve LLMs' 2D plane reasoning abilities and complete coverage path planning tasks.
arXiv Detail & Related papers (2024-07-02T12:38:46Z)
Plan-Seq-Learn: Language Model Guided RL for Solving Long Horizon Robotics Tasks [50.27313829438866]
Plan-Seq-Learn (PSL) is a modular approach that uses motion planning to bridge the gap between abstract language and learned low-level control. PSL achieves success rates of over 85%, out-performing language-based, classical, and end-to-end approaches.
arXiv Detail & Related papers (2024-05-02T17:59:31Z)
Empowering Large Language Models on Robotic Manipulation with Affordance Prompting [23.318449345424725]
Large language models fail to interact with the physical world by generating control sequences properly. Existing LLM-based approaches circumvent this problem by relying on additional pre-defined skills or pre-trained sub-policies. We propose a framework called LLM+A(ffordance) where the LLM serves as both the sub-task planner and the motion controller.
arXiv Detail & Related papers (2024-04-17T03:06:32Z)
LLM-Assist: Enhancing Closed-Loop Planning with Language-Based Reasoning [65.86754998249224]
We develop a novel hybrid planner that leverages a conventional rule-based planner in conjunction with an LLM-based planner. Our approach navigates complex scenarios which existing planners struggle with, produces well-reasoned outputs while also remaining grounded through working alongside the rule-based approach.
arXiv Detail & Related papers (2023-12-30T02:53:45Z)
AdaPlanner: Adaptive Planning from Feedback with Language Models [56.367020818139665]
Large language models (LLMs) have recently demonstrated the potential in acting as autonomous agents for sequential decision-making tasks. We propose a closed-loop approach, AdaPlanner, which allows the LLM agent to refine its self-generated plan adaptively in response to environmental feedback. To mitigate hallucination, we develop a code-style LLM prompt structure that facilitates plan generation across a variety of tasks, environments, and agent capabilities.
arXiv Detail & Related papers (2023-05-26T05:52:27Z)
AutoPlan: Automatic Planning of Interactive Decision-Making Tasks With Large Language Models [11.895111124804503]
AutoPlan is an approach to guide LLM-based agents to accomplish interactive decision-making tasks. Our experiments show that AutoPlan achieves success rates on par with the baselines.
arXiv Detail & Related papers (2023-05-24T11:52:23Z)
Plan, Eliminate, and Track -- Language Models are Good Teachers for Embodied Agents [99.17668730578586]
Pre-trained large language models (LLMs) capture procedural knowledge about the world. Plan, Eliminate, and Track (PET) framework translates a task description into a list of high-level sub-tasks. PET framework leads to a significant 15% improvement over SOTA for generalization to human goal specifications.
arXiv Detail & Related papers (2023-05-03T20:11:22Z)
LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models [27.318186938382233]
This study focuses on using large language models (LLMs) as a planner for embodied agents. We propose a novel method, LLM-Planner, that harnesses the power of large language models to do few-shot planning.
arXiv Detail & Related papers (2022-12-08T05:46:32Z)

This list is automatically generated from the titles and abstracts of the papers in this site.