Constrained Natural Language Action Planning for Resilient Embodied Systems
- URL: http://arxiv.org/abs/2510.06357v1
- Date: Tue, 07 Oct 2025 18:23:12 GMT
- Title: Constrained Natural Language Action Planning for Resilient Embodied Systems
- Authors: Grayson Byrd, Corban Rivera, Bethany Kemp, Meghan Booker, Aurora Schmidt, Celso M de Melo, Lalithkumar Seenivasan, Mathias Unberath,
- Abstract summary: Large language models (LLMs) for task planning seek to address the previously intractable state/action space of complex planning tasks.<n>We introduce a new robotic planning method that augments LLM planners with symbolic planning oversight to improve reliability and repeatability.
- Score: 9.883720604599146
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Replicating human-level intelligence in the execution of embodied tasks remains challenging due to the unconstrained nature of real-world environments. Novel use of large language models (LLMs) for task planning seeks to address the previously intractable state/action space of complex planning tasks, but hallucinations limit their reliability, and thus, viability beyond a research context. Additionally, the prompt engineering required to achieve adequate system performance lacks transparency, and thus, repeatability. In contrast to LLM planning, symbolic planning methods offer strong reliability and repeatability guarantees, but struggle to scale to the complexity and ambiguity of real-world tasks. We introduce a new robotic planning method that augments LLM planners with symbolic planning oversight to improve reliability and repeatability, and provide a transparent approach to defining hard constraints with considerably stronger clarity than traditional prompt engineering. Importantly, these augmentations preserve the reasoning capabilities of LLMs and retain impressive generalization in open-world environments. We demonstrate our approach in simulated and real-world environments. On the ALFWorld planning benchmark, our approach outperforms current state-of-the-art methods, achieving a near-perfect 99% success rate. Deployment of our method to a real-world quadruped robot resulted in 100% task success compared to 50% and 30% for pure LLM and symbolic planners across embodied pick and place tasks. Our approach presents an effective strategy to enhance the reliability, repeatability and transparency of LLM-based robot planners while retaining their key strengths: flexibility and generalizability to complex real-world environments. We hope that this work will contribute to the broad goal of building resilient embodied intelligent systems.
Related papers
- LLM-Grounded Dynamic Task Planning with Hierarchical Temporal Logic for Human-Aware Multi-Robot Collaboration [17.886091169216538]
Large Language Models (LLM) enable non-experts to specify openworld multi-robot tasks.<n>LLM plans often lack feasibility and are not efficient, especially in long-horizon scenarios.<n>We propose a neuro-symbolic framework that grounds reasoning into hierarchical specifications.
arXiv Detail & Related papers (2026-02-10T07:11:36Z) - Grounded in Reality: Learning and Deploying Proactive LLM from Offline Logs [72.08224879435762]
textttLearn-to-Ask is a simulator-free framework for learning and deploying proactive dialogue agents.<n>Our approach culminates in the successful deployment of LLMs into a live, large-scale online AI service.
arXiv Detail & Related papers (2025-10-29T12:08:07Z) - Hierarchical Planning for Complex Tasks with Knowledge Graph-RAG and Symbolic Verification [5.727096041675994]
Large Language Models (LLMs) have shown promise as robotic planners but often struggle with long-horizon and complex tasks.<n>We propose a neuro-symbolic approach that enhances LLMs-based planners with Knowledge Graph-based RAG for hierarchical plan generation.
arXiv Detail & Related papers (2025-04-06T18:36:30Z) - Instruction-Augmented Long-Horizon Planning: Embedding Grounding Mechanisms in Embodied Mobile Manipulation [39.43049944895508]
We present the Instruction-Augmented Long-Horizon Planning (IALP) system, which generates feasible and optimal actions based on real-time sensor feedback.<n>Our results demonstrate that the IALP system can efficiently solve tasks with an average success rate exceeding 80%.
arXiv Detail & Related papers (2025-03-11T06:37:33Z) - Think Small, Plan Smart: Minimalist Symbolic Abstraction and Heuristic Subspace Search for LLM-Guided Task Planning [19.421916137269275]
Large language models (LLMs) offer a promising interface for translating complex and ambiguous natural language instructions into actionable plans.<n>Recent frameworks combine LLMs with symbolic planners by first generating action models (Planning Domain Definition Language) and then applying search.<n>We propose PLAHX, a two-stage LLM-symbolic planning framework that integrates abstract symbolic representations with meta-heuristic subspace search in a parallel and iterative fashion.
arXiv Detail & Related papers (2025-01-25T13:33:22Z) - Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics [50.191655141020505]
This work advances model-based reinforcement learning by addressing the challenges of long-horizon prediction, error accumulation, and sim-to-real transfer.<n>By providing a scalable and robust framework, the introduced methods pave the way for adaptive and efficient robotic systems in real-world applications.
arXiv Detail & Related papers (2025-01-17T10:39:09Z) - EgoPlan-Bench2: A Benchmark for Multimodal Large Language Model Planning in Real-World Scenarios [53.26658545922884]
We introduce EgoPlan-Bench2, a benchmark designed to assess the planning capabilities of MLLMs across a wide range of real-world scenarios.<n>We evaluate 21 competitive MLLMs and provide an in-depth analysis of their limitations, revealing that they face significant challenges in real-world planning.<n>Our approach enhances the performance of GPT-4V by 10.24 on EgoPlan-Bench2 without additional training.
arXiv Detail & Related papers (2024-12-05T18:57:23Z) - MALMM: Multi-Agent Large Language Models for Zero-Shot Robotics Manipulation [62.854649499866774]
Large Language Models (LLMs) have demonstrated remarkable planning abilities across various domains, including robotics manipulation and navigation.<n>We propose a novel multi-agent LLM framework that distributes high-level planning and low-level control code generation across specialized LLM agents.<n>We evaluate our approach on nine RLBench tasks, including long-horizon tasks, and demonstrate its ability to solve robotics manipulation in a zero-shot setting.
arXiv Detail & Related papers (2024-11-26T17:53:44Z) - Introspective Planning: Aligning Robots' Uncertainty with Inherent Task Ambiguity [0.659529078336196]
Large language models (LLMs) exhibit advanced reasoning skills, enabling robots to comprehend natural language instructions and strategically plan high-level actions.<n>LLMs hallucination may result in robots confidently executing plans that are misaligned with user goals or even unsafe in critical scenarios.<n>We propose introspective planning, a systematic approach that align LLM's uncertainty with the inherent ambiguity of the task.
arXiv Detail & Related papers (2024-02-09T16:40:59Z) - EgoPlan-Bench: Benchmarking Multimodal Large Language Models for Human-Level Planning [84.6451394629312]
We introduce EgoPlan-Bench, a benchmark to evaluate the planning abilities of MLLMs in real-world scenarios.
We show that EgoPlan-Bench poses significant challenges, highlighting a substantial scope for improvement in MLLMs to achieve human-level task planning.
We also present EgoPlan-IT, a specialized instruction-tuning dataset that effectively enhances model performance on EgoPlan-Bench.
arXiv Detail & Related papers (2023-12-11T03:35:58Z) - DREAMWALKER: Mental Planning for Continuous Vision-Language Navigation [107.5934592892763]
We propose DREAMWALKER -- a world model based VLN-CE agent.
The world model is built to summarize the visual, topological, and dynamic properties of the complicated continuous environment.
It can simulate and evaluate possible plans entirely in such internal abstract world, before executing costly actions.
arXiv Detail & Related papers (2023-08-14T23:45:01Z) - Efficient Learning of High Level Plans from Play [57.29562823883257]
We present Efficient Learning of High-Level Plans from Play (ELF-P), a framework for robotic learning that bridges motion planning and deep RL.
We demonstrate that ELF-P has significantly better sample efficiency than relevant baselines over multiple realistic manipulation tasks.
arXiv Detail & Related papers (2023-03-16T20:09:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.