Related papers: Towards Reliable Code-as-Policies: A Neuro-Symbolic Framework for Embodied Task Planning

URL: http://arxiv.org/abs/2510.21302v1
Date: Fri, 24 Oct 2025 10:01:08 GMT
Title: Towards Reliable Code-as-Policies: A Neuro-Symbolic Framework for Embodied Task Planning
Authors: Sanghyun Ahn, Wonje Choi, Junyong Lee, Jinwoo Park, Honguk Woo
Abstract summary: We propose a neuro-symbolic embodied task planning framework that incorporates explicit symbolic verification and interactive validation processes during code generation. We evaluate our framework on RLBench and in real-world settings across dynamic, partially observable scenarios.
Score: 25.860785629018356
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent advances in large language models (LLMs) have enabled the automatic generation of executable code for task planning and control in embodied agents such as robots, demonstrating the potential of LLM-based embodied intelligence. However, these LLM-based code-as-policies approaches often suffer from limited environmental grounding, particularly in dynamic or partially observable settings, leading to suboptimal task success rates due to incorrect or incomplete code generation. In this work, we propose a neuro-symbolic embodied task planning framework that incorporates explicit symbolic verification and interactive validation processes during code generation. In the validation phase, the framework generates exploratory code that actively interacts with the environment to acquire missing observations while preserving task-relevant states. This integrated process enhances the grounding of generated code, resulting in improved task reliability and success rates in complex environments. We evaluate our framework on RLBench and in real-world settings across dynamic, partially observable scenarios. Experimental results demonstrate that our framework improves task success rates by 46.2% over Code-as-Policies baselines and attains over 86.8% executability of task-relevant actions, thereby enhancing the reliability of task planning in dynamic environments.

Related papers

Exploratory Retrieval-Augmented Planning For Continual Embodied Instruction Following [30.757285244293794]
This study presents an Exploratory Retrieval-Augmented Planning (ExRAP) framework, designed to tackle continual instruction following tasks of embodied agents in dynamic, non-stationary environments. The framework enhances Large Language Models' embodied reasoning capabilities by efficiently exploring the physical environment and establishing the environmental context memory. It consistently outperforms other state-of-the-art LLM-based task planning approaches in terms of both goal success rate and execution efficiency.
arXiv Detail & Related papers (2025-09-10T01:39:51Z)
Unlocking Smarter Device Control: Foresighted Planning with a World Model-Driven Code Execution Approach [82.27842884709378]
We propose a framework that prioritizes natural language understanding and structured reasoning to enhance the agent's global understanding of the environment. Our method outperforms previous approaches, particularly achieving a 44.4% relative improvement in task success rate.
arXiv Detail & Related papers (2025-05-22T09:08:47Z)
Enhancing LLM-Based Agents via Global Planning and Hierarchical Execution [18.68431625184045]
GoalAct is a novel agent framework that introduces a continuously updated global planning mechanism and integrates a hierarchical execution strategy. GoalAct decomposes task execution into high-level skills, including searching, coding, writing and more. We evaluate GoalAct on LegalAgentBench, a benchmark with multiple types of legal tasks that require the use of multiple types of tools.
arXiv Detail & Related papers (2025-04-23T09:43:40Z)
CLEA: Closed-Loop Embodied Agent for Enhancing Task Execution in Dynamic Environments [39.5949489828609]
Large Language Models (LLMs) exhibit remarkable capabilities in the hierarchical decomposition of complex tasks through semantic reasoning. We propose Closed-Loop Embodied Agent (CLEA) -- a novel architecture incorporating four specialized open-source LLMs with functional decoupling for closed-loop task management. We conduct experiments in a real environment with manipulable objects, using two heterogeneous robots for object search, manipulation, and search-manipulation integration tasks.
arXiv Detail & Related papers (2025-03-02T04:50:59Z)
DynaSaur: Large Language Agents Beyond Predefined Actions [126.98162266986554]
Existing LLM agent systems typically select actions from a fixed and predefined set at every step. We propose an LLM agent framework that can dynamically create and compose actions as needed. In this framework, the agent interacts with its environment by generating and executing programs written in a general-purpose programming language.
arXiv Detail & Related papers (2024-11-04T02:08:59Z)
R-AIF: Solving Sparse-Reward Robotic Tasks from Pixels with Active Inference and World Models [50.19174067263255]
We introduce prior preference learning techniques and self-revision schedules to help the agent excel in sparse-reward, continuous action, goal-based robotic control POMDP environments. We show that our agents offer improved performance over state-of-the-art models in terms of cumulative rewards, relative stability, and success rate.
arXiv Detail & Related papers (2024-09-21T18:32:44Z)
Unlocking Reasoning Potential in Large Language Models by Scaling Code-form Planning [94.76546523689113]
We introduce CodePlan, a framework that generates and follows code-form plans -- pseudocode that outlines high-level, structured reasoning processes. CodePlan effectively captures the rich semantics and control flows inherent to sophisticated reasoning tasks. It achieves a 25.1% relative improvement compared with directly generating responses.
arXiv Detail & Related papers (2024-09-19T04:13:58Z)
Compromising Embodied Agents with Contextual Backdoor Attacks [69.71630408822767]
Large language models (LLMs) have transformed the development of embodied intelligence. This paper uncovers a significant backdoor security threat within this process. By poisoning just a few contextual demonstrations, attackers can covertly compromise the contextual environment of a black-box LLM.
arXiv Detail & Related papers (2024-08-06T01:20:12Z)
AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation [81.32722475387364]
Large Language Model-based agents have garnered significant attention and are becoming increasingly popular. Planning ability is a crucial component of an LLM-based agent, which generally entails achieving a desired goal from an initial state. Recent studies have demonstrated that utilizing expert-level trajectory for instruction-tuning LLMs effectively enhances their planning capabilities.
arXiv Detail & Related papers (2024-08-01T17:59:46Z)
Logical Specifications-guided Dynamic Task Sampling for Reinforcement Learning Agents [9.529492371336286]
Reinforcement Learning (RL) has made significant strides in enabling artificial agents to learn diverse behaviors. We propose a novel approach, called Logical Specifications-guided Dynamic Task Sampling (LSTS). LSTS learns a set of RL policies to guide an agent from an initial state to a goal state based on a high-level task specification.
arXiv Detail & Related papers (2024-02-06T04:00:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.